Abstract

On my computer I have a plain text file of Charles Dickens's novel Great Expectations, an M4V video file of the 1998 film adaptation starring Ethan Hawke and Gwyneth Paltrow, and an MP3 copy of the film's soundtrack. The digital film takes up 1,133,953,365 bytes on my hard drive, the soundtrack 158,060,500 bytes, and the full text of the novel only 1,013,777 bytes. Why should a novel, which fills 476 pages in the edition on my bookshelf, occupy only 0.6% of the hard drive space taken up by a one-hour soundtrack? Why should an average-length film – it runs for one hour and forty-six minutes – take up more than one thousand times the space of a long novel? The answer is simple: it is much easier to convert text into digital form than it is to convert any other modality.

Although digital video, music, and photography are now ubiquitous, whereas electronic books still have not entirely dislodged our affection for the printed book, text was in fact the first modality to “go digital.” This is because text – whose basic unit is the letter, of which there are relatively few – is easily translated into binary code, the native language of computers. The first widely accepted standard for turning text into binary code was the American Standard Code for Information Interchange, or ASCII, which was formalized in the early 1960s and is still in wide use. Using seven bits (seven ones or zeros) for each character, ASCII includes codes for lowercase letters a–z, uppercase letters A–Z, digits 0–9, punctuation, and some special control commands. In 7-bit ASCII, the phrase “Search me” becomes 101001111001011100001111001011000111101000010000011011011100101, where

1010011 = S
1100101 = e
1100001 = a
1110010 = r
1100011 = c
1101000 = h
0100000 = (space)
1101101 = m
1100101 = e

Although confusing to a human eye, this relatively simple translation of human written language into the language of machines makes text a natural fit for computing.
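The “Search me” translation can be reproduced in a few lines of Python. This is only a minimal sketch: the function names `to_ascii_bits` and `from_ascii_bits` are invented here for illustration, and the code assumes the input contains nothing but ASCII characters.

```python
def to_ascii_bits(text: str) -> str:
    """Return the 7-bit ASCII code of each character, concatenated."""
    # str.encode("ascii") raises UnicodeEncodeError on non-ASCII input,
    # so every byte here is guaranteed to fit in seven bits.
    return "".join(format(byte, "07b") for byte in text.encode("ascii"))


def from_ascii_bits(bits: str) -> str:
    """Reverse the translation by reading the string seven bits at a time."""
    if len(bits) % 7 != 0:
        raise ValueError("bit string length must be a multiple of 7")
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, len(bits), 7))


encoded = to_ascii_bits("Search me")
print(encoded)                   # nine characters -> 63 ones and zeros
print(from_ascii_bits(encoded))  # Search me
```

In practice, text files usually store each ASCII character in an eight-bit byte with a leading zero, which is why a plain text file occupies roughly one byte per character.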
Not only are computer text files very small but they are also very easy to search: simply by seeking out a particular string of ones and zeros – child's play for even the slowest of machines – a computer can instantly identify any combination of words in a digital text.
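Searching a digital text amounts to looking for one sequence of bytes inside another. The sketch below illustrates the idea using Python's built-in `bytes.find`; the helper name `find_all` and the sample sentence are assumptions made for this example, not part of any particular search tool.

```python
from typing import List


def find_all(haystack: bytes, needle: bytes) -> List[int]:
    """Return every offset at which `needle` occurs in `haystack`."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Resume one byte later so overlapping matches are also reported.
        start = haystack.find(needle, start + 1)
    return positions


# Hypothetical sample text, encoded to bytes as it would be stored on disk.
sample = "Search me, and search me again.".encode("ascii")
print(find_all(sample, b"earch me"))
```

Because the match is made on raw byte values, the search is case-sensitive: “Search” and “search” begin with different codes, which is why the example looks for the shared fragment “earch me”.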
