These three articles were previously published in the Swedish daily Svenska Dagbladet in 1994:
Swedish version from Svenska Dagbladet, 1994.
I. The electronic age - on the verge of total memory loss?
II. The hacker - archaeologist of the future?
III. Dare we trust the authenticity of electronic texts?
I. The electronic age - on the verge of total memory loss?
by Karl-Erik Tallmo
Computers have memory. Our civilization has a kind of memory as well, a collective memory, in the form of libraries, archives and museums.
What will happen to all this, when so much is and will be published and stored solely on electronic media? Global computer networks can transmit information as fast as we think. But, is there a risk that this global memory could just as fast be stricken with cybernetic amnesia?
Vice President Al Gore envisions an information superhighway that will provide both schoolchildren and scientists with information and reference materials from all corners of the world. Within the European Community there are projects along the same lines. Fiber-optic cables are able to transmit 100 gigabytes a second, which means that everything ever written could be transmitted in an hour with such cables.
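As a rough check on the figure above, the arithmetic can be sketched in a few lines of Python. The sketch takes the quoted rate of 100 gigabytes a second at face value and assumes decimal units (1 gigabyte = 10^9 bytes). The function name is invented for illustration, and the code says nothing about how much has actually "ever been written"; it only shows how much such a cable would move in an hour.

```python
# Assumption: decimal units, 1 gigabyte = 10**9 bytes.
BYTES_PER_GIGABYTE = 10**9
SECONDS_PER_HOUR = 60 * 60


def bytes_sent_in_one_hour(gigabytes_per_second):
    """Return how many bytes a link running at the given rate moves in an hour."""
    return gigabytes_per_second * BYTES_PER_GIGABYTE * SECONDS_PER_HOUR


total = bytes_sent_in_one_hour(100)
print(total // 10**12, "terabytes per hour")
```

Under these assumptions the claim amounts to saying that everything ever written would fit in roughly 360 terabytes.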
The computerization of public libraries started with the catalogues. Now the collections themselves are being transferred to electronic media, partly by digitizing old printed or microfilmed material and partly by acquisition of new material that has never been published in any other way than electronically.
The Center for Electronic Texts in the Humanities (CETH, at Rutgers University in New Brunswick, New Jersey) estimates that 8,000 series of source texts within the humanities have already been converted to machine-readable form worldwide.
Thesaurus Linguae Graecae is text-only, but text in abundant quantity: 57 million words. This database provides users in at least 32 different countries with classical Greek source-texts. Patrologia Latina, published by Chadwyck-Healey in Alexandria, Virginia, is a database containing the Latin part of Jacques Paul Migne's editions of the Church Fathers.
Pictures are also accessible through computers and modems. The Smithsonian Institution in Washington hosts a huge database with images of objects from its collection. In the great project American Memory, pursued by the Library of Congress, selected parts of the collections are transferred to digital media, for instance texts by the founding fathers. Apart from historical documents, there are also some 40,000 photographs that will be digitized and put on CD-ROM (the same kind of disk that the record industry now uses for audio). The pictures in question are, for instance, the famous Mathew Brady photographs from the Civil War, street scenes from New York at the turn of the century, etc.
Perseus is a database at Harvard (also available on CD-ROM), containing classical Greek texts as well as pictures of vases, urns, coins, sculptures and maps showing archeological sites.
You hear a lot today about the virtual library, which is not a place but an electronic representation. The collection that appears on the computer screen might actually be several collections, several databases situated on different continents. The keyword for the library of tomorrow is access, not storage.
There are experiments going on, for instance at the university library in Milan, Italy, to visualize book catalogues on the computer screen with pictures of shelves that you can browse through. When you find something interesting you can pick up the book from the virtual shelf, and in an instant you will be connected to a database where the work is stored in full.
Material that is not available in any full-text database may be scanned and digitized on demand and then transmitted via telephone and modem. Such experiments are being carried out in, for instance, Holland, where different libraries are responsible for different periodicals. They are obligated to deliver any article within 24 hours. North Carolina State University has developed a special tool with its own interface for transmission of articles scanned on request.
Pictures of objects do not need to be static. With a new filming technique you can get a three-dimensional effect. Here, the film does not convey motion but dimension. The film is viewed on the computer screen, and you can turn the object around and look at it from all sides. Such video films may be transmitted on the computer networks.
Among the new electronically published material are encyclopedias with sound and moving pictures. Both Grolier and Compton have a multimedia encyclopedia. In a small window you can see for instance the airship Hindenburg catching fire over Lakehurst, New Jersey, in 1937 or hear John F. Kennedy speak... There are also periodicals, which are published exclusively in digital form, and a lot of technical documentation is released on CD-ROM.
The prophecies made by the pioneers of information technology are now indeed coming true.
Back in 1937, H. G. Wells had a dream of a "Permanent World Encyclopedia," a synthesis of bibliography and documentation with the indexed archives of the world:
There is no practical obstacle whatever now to the creation of an efficient index to all human knowledge, ideas and achievements, to the creation, that is, of a complete planetary memory for all mankind. And not simply an index; the direct reproduction of the thing itself can be summoned to any properly prepared spot. A microfilm, coloured where necessary, occupying an inch or so of space and weighing little more than a letter, can be duplicated from the records and sent anywhere, and thrown enlarged upon the screen so that the student may study it in every detail.
Just change microfilm here to computer files! Wells also claimed that facts and information, however multitudinous, can be managed and recalled once they have been put in place in a well-ordered scheme of reference and reproduction. This global memory need not be concentrated in any single place and thus it would not be vulnerable:
It can be reproduced exactly and fully, in Peru, China, Iceland, Central Africa, or wherever else seems to afford an insurance against danger and interruption. It can have at once, the concentration of a craniate animal and the diffused vitality of an amoeba.
A few years later (1945), Roosevelt's scientific adviser Vannevar Bush (who appointed Robert Oppenheimer to direct the nuclear experiments in New Mexico) envisioned a machine, Memex, which would store the accomplishments of human thought:
[...] the intricacy of the trails, the detail of mental pictures is awe inspiring [...] Man cannot hope to fully duplicate this mental process artificially, but he certainly ought to be able to learn from it.
The intricate trails of human thought also inspired author and interactive software designer Ted Nelson. In 1965 he coined the term hypertext, a notion that is now realized as "intelligent links" between words or objects in one or a number of databases.
Hypertext exists solely on computers. Through common word processing we are now accustomed to text that can be formed and re-formed like a piece of clay, time after time. With hypertext you can in addition turn a word (or a paragraph or a headline) into a sort of button, which you can press or click on (with the "mouse"). Then a footnote appears, or maybe a bibliographic reference; perhaps you will be able to read the whole work that was just quoted in the main text, or you will see a picture, a diagram or hear or watch a short audio or video recording. These texts and pictures might then be linked further.
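The linking idea described above can be sketched as a small data structure. This is only an illustration, not how any particular hypertext system is built: the node names, the sample texts and the `follow` function are all invented for the example.

```python
# Hypothetical in-memory "database" of text nodes; all names are illustrative.
documents = {
    "main": "Wells imagined a Permanent World Encyclopedia.",
    "note-1": "Footnote: the idea dates from 1937.",
    "source": "Full text of the quoted work would be stored here.",
}

# A link turns a word in one node into a "button" that points at another node.
links = {
    ("main", "Encyclopedia"): "note-1",
    ("note-1", "1937"): "source",
}


def follow(node_id, word):
    """Return the text reached by clicking `word` in node `node_id`, or None."""
    target = links.get((node_id, word))
    return documents.get(target) if target is not None else None


print(follow("main", "Encyclopedia"))  # the footnote
print(follow("note-1", "1937"))        # linked further, to the quoted work
```

Because the footnote itself carries a link, following two clicks in a row leads from the main text to the quoted work, which is the "linked further" behaviour mentioned above.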
"Writing is sequential, since it grew out of speech," Ted Nelson writes in his book Dream Machines. "But the structure of ideas is not sequential. They tie together every whichway."
Electronic publications are often cheaper to produce than printed ones. The collection "The Papers of George Washington" contains some 135,000 documents, which are being published in a very expensive 17-volume book edition as well as on CD-ROM. The publishers claim that the CD version will be affordable even for smaller libraries, colleges and schools. Updating technical documentation is much cheaper when it is stored on electronic media, since you don't need to reprint perhaps hundreds of books.
One great advantage of source text stored in large databases is the ability to run very fast and flexible searches. Suppose you are writing an essay on stomach diseases through the ages. Usually you search through encyclopedias and book catalogues for keywords such as 'stomach', 'metabolism', 'ventricle', etc. But if you search with the computer in a database containing the same encyclopedia and the same book catalogue, and the search routines allow searching not only among the entries but in the whole body of text, you would of course find the word 'stomach' in contexts you never could have imagined otherwise. Researchers and scientists may find source material very fast, and it can be analyzed in different ways with the computer: the structure of Rembrandt's brush strokes, the occurrences of a certain word together with a certain other word in Milton.
Many people - Elli Mylonas, managing editor for the Perseus project for one - claim that the reason for digitizing source material is not only for research purposes, but to preserve knowledge for the future. This could be a better method than microfilming or photocopying old brittle books, now deteriorating due to the acidic paper on which they were printed. Such paper was manufactured from around 1850 and for about a hundred years afterward. Right now printed matter from a whole century is being silently destroyed on library shelves worldwide, a loss that could be far worse than the burning of the library in Alexandria.
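The difference between searching the entries and searching the whole body of text can be sketched as follows. The miniature encyclopedia and both function names are made up for the illustration, and the matching is a plain substring test, far simpler than a real database's search routines.

```python
# Hypothetical miniature encyclopedia: entry title -> body text.
encyclopedia = {
    "Stomach": "The stomach is an organ of digestion.",
    "Metabolism": "Metabolism covers the chemical processes of the body.",
    "Seafaring": "Sailors on long voyages often complained of stomach pains.",
}


def search_entries(term):
    """Match the term against entry titles only, as in a printed index."""
    return [title for title in encyclopedia if term.lower() in title.lower()]


def search_full_text(term):
    """Match the term against titles and the whole body of text."""
    return [
        title
        for title, body in encyclopedia.items()
        if term.lower() in title.lower() or term.lower() in body.lower()
    ]


print(search_entries("stomach"))    # titles only
print(search_full_text("stomach"))  # titles and bodies
```

Here the title search finds only the entry "Stomach", while the full-text search also turns up "Seafaring", an entry nobody would have thought to look under.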
So, let us discuss vulnerability. H. G. Wells was aware of the importance of not putting all one's eggs in one basket. But is it sufficient that this huge amount of global digitized information is divided among thousands of computers all over the world? Most people know that computer technology develops very fast. Approximately every five years the industry presents a totally new generation of computers, with new processors, which need new operating systems and new software. Dare we hope that all this digital information that is supposed to save us all from oblivion will still be readable 20, 50 or 100 years from now?
We already have great difficulty with other media. Suppose somebody gives you a Betamax video from the 1970s. Where can you find a player for such a tape today? Many of us also have audio tapes on large 7-inch reels at home, but we only have cassette decks in our stereo systems.
Not many national archives and libraries have a strategy for dealing with electronically published works. The Library of Congress in Washington has arranged some very interesting seminars about digitizing older material but they have no guidelines yet for new materials. In Europe most libraries are awaiting technical developments, although some (Sweden for instance) have set up committees for investigating this matter. Norway, however, has a unique law that regulates legal deposits of electronic documents.
National archives storing old government files often find themselves on the edge of a technological time-out. Many still have access to old hardware and spare parts, but for how long? At regular intervals old tapes must be recopied to prevent the magnetic field from deteriorating. And like winery workers who turn champagne bottles, archive staff have to rotate the computer reels on their shelves 90 degrees every six months - for protection against the earth's magnetic field.
"Several decades of U.S. census information are reportedly stored by an electronic data processing technology that is now so obsolete as to make all such records completely inaccessible," says Steven Newcomb, who works on standardization issues at TechnoTeacher Inc. in Tallahassee, Florida.
NASA has hundreds of thousands of obsolete computer tapes with earth satellite information that is already unreadable.
Many archives have routines for recopying and some plan to copy the information to some more resistant medium, because this maintenance is very expensive.
But nobody actually knows today what medium to choose. The CD-ROM is a candidate, but some experts believe that oxygen might diffuse through the plastic surface and then affect the metal foil where the information is engraved. And even if the disk lasts, the problem of maintaining the hardware remains.
We hear a lot about how much space is saved by storing on optical disks compared to keeping piles of paper and rows of books. But what if the libraries of the future have to be museums too - with hundreds of antique computers set up in hundreds of different configurations?
The difficulty is great enough when we deal with simple documents containing pure alphanumerical information. But when it comes to multimedia publications, with text, pictures, video and sound, the complexity is almost incalculable. Today this mostly concerns dictionaries and other reference works, but in the years to come this will be an art form we come across every day, as we do now with ordinary books or videotapes.
The James Joyce and Orson Welles of the future might express themselves through some form of multimedia technology. Suppose a groundbreaking work is created around the year 2010. Will anybody in the year 2050 be able to read/listen to/watch this work as the artist intended?
Many people within the computer industry and the library community think that it will be possible in the future to restore older files, at least to get a rough idea of what an old masterpiece is all about. What they have in mind is rescuing some text and maybe a picture or two, not making them run together according to the artist's original intentions. Imagine if today, instead of watching the movie Citizen Kane, we had to be content with reading a short summary in plain text, because no projectors could run the film format that Orson Welles used.
Is something being done to prevent this gloomy future? Can anything be done? In the next article in this series some representatives from the computer industry and the library community will comment on this.
Copyright Karl-Erik Tallmo 1993, 1994.