Project Gutenberg Australia
a treasure-trove of literature
treasure found hidden with no evidence of ownership


Background to Project Gutenberg

Project Gutenberg (PG) began in 1971 when Michael Stern Hart (Note 1) was given one hundred million dollars' worth of computer time by the operators of the Xerox Sigma V mainframe at the Materials Research Laboratory at the University of Illinois. Mr Hart suggests that he happened to be in the right place at the right time: there was more computer time than people knew what to do with, and the operators were encouraged to do whatever they wanted with that fortune in "spare time" in the hope that they would become more proficient at their jobs. After due reflection, Mr Hart decided that one of the most effective uses of computers would be the storage, retrieval, searching and reading of material stored in computer libraries. He then proceeded to key in the American Declaration of Independence and produced the first electronic text (ebook) in the PG library. The rest, as they say, is history.

The ebook of the Declaration of Independence was followed by the American Bill of Rights, the US Constitution, the Bible, Shakespeare (a play at a time), and then by general work in the areas of literature and reference. From December 1971 to December 1993 one hundred ebooks were produced. This was no mean feat when one considers that the list includes Shakespeare, the Bible and other considerable works. All had to be keyed in and then checked by proofreading and comparison with the printed work. Appropriately, and not coincidentally, ebook one hundred was The Complete Works of William Shakespeare.

Now, with the advent of computer scanners (which enable one to "read in" printed pages and convert them to editable electronic text) and the increase in popularity of the internet, there are nearly twenty thousand ebooks available in the Project Gutenberg library. This represents a prodigious effort by the many volunteers involved in converting printed works into ebooks.

One might think that the pool of printed works will run dry; however, this can never happen because every year new works become available as the copyright on them runs out. Furthermore, volunteers have begun the work of converting to ebooks the literary gems of other languages, thus opening further rich veins of literary ore for plundering.

Electronic Data

The premise on which Michael Hart based the Project Gutenberg concept was that electronic data stored in a computer can be reproduced indefinitely by passing it from computer to computer. Once a book or any other item (including pictures and sounds) has been stored in a computer, any number of copies can be made. Everyone in the world, or even not in this world (given satellite transmission), can have a copy of a book that has been entered into a computer. When people holiday on Mars, later this century, they might have a copy of Homer's Iliad beamed up to them: the book that they always meant to read. They would only need to specify the required language.

It was decided to store ebooks in the simplest, easiest to use form available: the "plain vanilla" or ASCII (Note 2) format, the basic characters one reads on a normal printed page. Italics, underlines, and bolds would be capitalized as they are not supported by many basic text readers. This decision was made because 99% of the hardware and software in use all over the world can read and search these files. Any other system of ebook storage will fall short of an audience of 99%. Furthermore, ebooks stored in this format are easily converted to many other formats, such as those used in word processing and that used to represent text on internet web pages (i.e. HTML). (Note 3)
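The capitalisation convention described above can be illustrated with a minimal Python sketch. It assumes, purely for the sake of the example, that emphasis in a working copy is marked with underscores or asterisks; that marking scheme and the function names are hypothetical and are not taken from Project Gutenberg's own procedures.

```python
import re

# Assumption: emphasis is marked as _like this_ or *like this* in the working copy.
EMPHASIS = re.compile(r"([_*])(.+?)\1")


def emphasis_to_caps(text: str) -> str:
    """Replace each marked span with the same words in capitals."""
    return EMPHASIS.sub(lambda match: match.group(2).upper(), text)


def is_plain_ascii(text: str) -> bool:
    """Return True if every character is within the 7-bit ASCII range."""
    return text.isascii()


sample = "It was the _best_ of times, it was the *worst* of times."
print(emphasis_to_caps(sample))
# prints: It was the BEST of times, it was the WORST of times.
```

A check such as is_plain_ascii could be run over a finished text to confirm that nothing outside the basic character set remains before the file is distributed.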

Michael Hart has said that he wants people to be able to use PG ebooks to look up quotations that they have heard in conversation or in movies, or which they have read in other books. He envisages a compact disc (CD) containing all PG titles, which will constitute a library containing all these quotations within the individual ebooks. One could easily search the entire library with nothing more sophisticated than the plain search program found on every personal computer.
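To show how little is needed for such a search, here is a short Python sketch. It assumes the library is simply a folder of plain-text files ending in ".txt"; the folder layout and the function names are illustrative assumptions, not a description of how the PG collection is actually organised.

```python
from pathlib import Path


def load_library(library_dir):
    """Read every .txt file in a folder into a {file name: text} mapping."""
    return {
        path.name: path.read_text(encoding="ascii", errors="replace")
        for path in sorted(Path(library_dir).glob("*.txt"))
    }


def find_quotation(texts, phrase):
    """Return (file name, line number, line) for each line containing phrase.

    The match ignores case. A quotation that is split across two lines
    of the file will not be found by this simple line-by-line approach.
    """
    needle = phrase.lower()
    hits = []
    for name, text in texts.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((name, number, line.strip()))
    return hits
```

One might call find_quotation(load_library("ebooks"), "to be, or not to be"), where "ebooks" is a hypothetical folder name, to list every file and line in which the phrase occurs.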

The text of an average book will fit on a standard 3.5-inch floppy disk, which can be read on most personal computers. However, pictures such as those in the book Alice in Wonderland present special problems for electronic reproduction because of the disk space which they take up. Nevertheless, Project Gutenberg is very interested in including pictures and other graphics and will continue to take advantage of developments in computer technology to add to the richness of its library of free, readily available literary and reference works.
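The claim about floppy disks can be checked with some rough arithmetic. The sketch below relies on two assumptions that do not come from this article: that a book runs to about 100,000 words, and that a word averages six characters including the space that follows it. In ASCII each character occupies one byte, and a high-density 3.5-inch disk formatted to 1440 KiB holds 1,474,560 bytes.

```python
# Formatted capacity of a high-density 3.5-inch floppy disk: 1440 KiB.
FLOPPY_BYTES = 1440 * 1024


def ascii_size_bytes(words, avg_chars_per_word=6):
    """Estimate the size of a plain ASCII text at one byte per character.

    avg_chars_per_word is an assumed average that includes the trailing space.
    """
    return words * avg_chars_per_word


def fits_on_floppy(words):
    """Return True if a text of this many words fits on a single disk."""
    return ascii_size_bytes(words) <= FLOPPY_BYTES


print(ascii_size_bytes(100_000))  # prints: 600000
print(fits_on_floppy(100_000))    # prints: True
```

On these assumptions a 100,000-word book needs about 600,000 bytes, well under half the disk, whereas a single scanned illustration can easily take up more space than the whole of the text.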

Scope of the library

The cataloguing and indexing of the library is still under review and is, in itself, a major undertaking. However, works may be broadly classified as follows:

* Light literature such as Alice in Wonderland, Through the Looking-Glass, Peter Pan and Aesop's Fables.

* Heavy literature such as the Bible and other religious documents, Shakespeare, Moby Dick, and Paradise Lost.

* References such as Roget's Thesaurus, almanacs, encyclopedias and dictionaries, and philosophy and natural history texts.

There is no substitute for a good book

Many people point out that there is no substitute for the look, feel and smell of a book and that it is easy to browse through it, mark relevant passages and look at the illustrations. This is perfectly true, and one might say that the use of ebooks has until now been largely restricted to using them to find specific references, since one needs to sit at a computer to view them. Until now, that is.

Sometimes we must wait for technology to catch up before we can make use of an existing situation. The internet existed in only a crude form when Mr Hart started keying in the Declaration of Independence. We had to wait for computers to become cheap and ubiquitous for the production of PG ebooks to explode. In the same way, technology is only now making available portable electronic readers with which we will be able to read ebooks, or have them read aloud to us via text-to-speech software, wherever we can now read a book. As one sits on Mars and uses a voice command to open The Iliad to a bookmarked position, one might issue the command "mouldy old paper" to have the electronic reader exude the smell one most associates with old books.

It is part of Michael Hart's genius that he saw the potential of Project Gutenberg and persisted with the concept for over twenty years before technology turned the project into something beyond, dare I say, even his wildest dreams. There is no substitute for a good book. It is just that its present form may not matter all that much to future generations.

Volunteers

The continuing success of Project Gutenberg depends on volunteers. As Michael Hart has frequently pointed out, PG is made up entirely of volunteers who produce ebooks, proofread them, post them to the PG internet site, post copies on "mirror" sites around the world, maintain the computer hardware and software involved in the project, correct errors in the text as noted by end-users, do copyright checks and attend to the many administrative tasks involved with any major co-operative project.

Volunteers choose which texts they wish to work on and hence which ebooks are posted to the PG site. Since any book out of copyright (Note 4) may be used, there is a bewildering choice of titles. Any title chosen is subject to a copyright "clearance" after which it will usually be accepted for posting. Some volunteers prefer to proofread work prepared by others. Or, one may become involved in "helping" Mr Hart put the finishing touches to texts before posting, such as adding headers and footers or making minor formatting changes.

Many books which might otherwise slip into oblivion are rescued by PG volunteers for future generations. For example, "A New and Comprehensive Vocabulary of the Flash Language", a dictionary of Australian slang and the earliest dictionary of any sort produced in Australia, was written by James Hardy Vaux in 1812. A PG volunteer recently submitted an ebook of this work to Project Gutenberg Australia, thus making it readily available to all.

When you are reading your ebook of The Iliad whilst holidaying on Mars, spare a thought for the prodigious amount of work which has been undertaken by Michael S Hart and the PG team to bring it to you just when and where you want it. It is priceless, yet it doesn't cost a cent.

Conclusion

When Johann Gutenberg invented the printing press he unleashed an unstoppable process which facilitated communication between members of the human race and the passing of knowledge and ideas in ways previously undreamed of. The invention of the computer and the expansion of the internet have extended the capacity to pass on such knowledge and ideas. Project Gutenberg, as the repository of the condensed knowledge and ideas of some of the greatest minds in human history, contributes in no small way to this process.

Acknowledgments

Much of the background information in this article was drawn from the Project Gutenberg internet site.

Notes:

1. Michael S. Hart, Professor of Electronic Text at Benedictine University (Illinois, U.S.A.) and Visiting Scientist at Carnegie Mellon University (Pennsylvania, U.S.A.), founded Project Gutenberg in 1971 and is currently its Executive Director. In a November 1998 article in Wired Magazine, Hart was chosen among "The Wired 25: A Salute to Dreamers, Inventors, Mavericks, and Leaders."

2. ASCII is an acronym for American Standard Code for Information Interchange, a standard for storing characters and numbers in computers.

3. HTML is an acronym for Hyper Text Markup Language.

4. In the USA, books are generally out of copyright seventy-five years after publication. As a rule of thumb, books published before 1923 are eligible. Full details are provided on the PG site. Under Australian copyright law, literary works published and offered for sale in Australia during an author's lifetime are protected for the life of the author plus seventy years from the end of the year of the author's death. However, see this article.

Updated 9 Jan 07