University of Edinburgh

From Transcriber to multimedia producer: New Documents, New Labours

Kevin Carey 1997

The evolution of documents has been very slow: first came line drawings, then linear script and then, in the period we now think of as ancient history, the two-columned inventory or financial record. Later came the illumination of linear text and, finally, printing, which potentially broadened the scope of publishing. Printing did not make very much difference to the essence of the documents published, but it did, of course, affect the variety of subjects and the numbers in circulation. Indeed, right up to the end of the 17th Century most documents - excluding poetry and drama - were monologues or dialogues in linear script and all knowledge was classified according to a hierarchy or tree structure. This meant that any kind of knowledge could only be one twig on one branch of a tree; it was either this or that.

It was the great philosopher, Leibniz (1646-1716), who developed the concept that certain ideas and even objects could be analysed from different points of view and therefore might be included in more than one classification. In a real sense - and we will come back to this before very long - he invented the idea of hypertext. What this meant then was the development of cross-referenced encyclopaedias and the notion that all knowledge could be integrated and organised in a variety of ways depending upon the required objective. About the same time, incidentally, a political theorist, Vico (1668-1744), hit upon the idea that words - and therefore concepts - changed their meaning through time, and these two developments together put an end to the monopoly of tree-like classification.

It was a long time, largely because of technology, before documents caught up; encyclopaedias were, of course, cross-referenced and Roget's Thesaurus is actually a piece of hypertext forced by technology into a linear form with cross-referencing. What computers have made possible in the area of document presentation is real hypertext.

Until computers we only experienced hypertext in the clumsy form of the cross-reference; you would look up an historic period and each major figure would be cross-referenced, as might be armaments, costume, husbandry, pottery. You could keep track of all of these things with an array of book-marks but your access would be immediately limited by the content of the book you were reading, in no matter how many volumes. Without references to other books or your own knowledge of them you would be stuck with what you had.

Whereas the physical attributes of books limit your exploration - you have to keep marking them, picking them up and putting them down and nobody has all of them - the potential of linked computers to supply information is practically unlimited. The main service of this sort at the moment is provided by what we call the World Wide Web. Here, typically, you start off with an address, key it into your computer and are presented with what is called a "home page"; this is like a table of contents and reminds us of the good old tree structure; click the item you want and you are taken there. But any Web location, or "site" as it is called, is only as good as the number of links that are provided. So, if you are visiting a site on a particular painter you would expect to find links to the Web pages of art galleries where his pictures are hanging or, alternatively, if you are reading the pages of a particular gallery you would expect to find links to biographies of its painters. That is the ultra-formal side of it; but you might also find links to the fan clubs of various painters, complete lists of their works, theories about their technique, journal articles devoted to them, newspaper clipping references to thefts and forgeries, and simple opinion and gossip without any sort of organisation or peer review. Imagine, then, how many more links you would find to something more popular and topical such as rock music; and then imagine that Web sites dealing with rock music also provided samples of their recorded music and a concert ticketing service; and imagine that sites about film stars included clips of their movies. Now you are getting close to understanding the World Wide Web.

Let us leave the Web for a moment and go back to old-fashioned printing. The invention of photography, of images on film, brought about the possibility of imposing one image on top of another and the huge range of possibilities presented by colour photography and colour printing made superimposition yet more viable. You will all have seen children's books, advertising copy and, of course, the end credits of films, where text is superimposed on a picture. It was in children's books, too, and then in magazines that the idea of uniform blocks of linear type was first challenged. Now we would be rather dismissive of any book other than a novel which simply gave us one block of uniform linear print per page. The main point I want to make here is not to confuse the carrier with the message; it isn't the computer or the World Wide Web that made the presentation of information less uniform, the process was already well under way in the print industry; it is just that the computer has presented us with the phenomenon in a more acute, immediate form. What the Web has done to documents is to make them more varied and more discrete but all that computers are making us do is face up to problems we have until now mostly avoided.

Let me add two further complexities, both referring to text. We have become accustomed, with our computers, to thinking of text as quite distinct from pictures: you have ASCII text which you edit on a word processor, each character being given an electronic value, and you have graphics packages which allow you to draw pictures. But what if you draw letters of the alphabet with your graphics package? We then have the modern equivalent of a painting by an old master where the subject is reading a bible and the text can clearly be seen in the oil painting. The second complexity is letters of the alphabet being used in a dynamic form, like a moving picture, so that the letters blink, or expand and contract. These phenomena have two consequences for transcribers: first, there is the technical problem of rendering the letters into braille through a translation programme. Because I think the problem is exaggerated, and that a solution can be found, I will leave this to one side; but the second, describing exactly what the phenomenon is, is part of the wider problem of describing objects which are other than static, linear text.

Let me now gather together all these points by describing a possible document with which we will be confronted. Let me invent a CD-ROM of the life of Henry VIII. And let us assume that we are not, as transcribers, trying to prepare a braille book with some tactile diagrams; let us instead think of ourselves as multi-media producers enabling a blind person to access the information; we will prepare an electronic file accessible on a modern computer and reserve the option to accompany this with add-ons such as tactile diagrams sent through the post.

  • First, we have the text. It is displayed in four different fonts: one which is normal; the second which indicates key points to be remembered; the third is the trigger for links to other documents; the fourth signifies nothing but is merely a decorative device. The first three are easy for a braille transcriber to cope with: standard is standard; key points can be represented by italics; links can be shown using a special code for what is called Hypertext Mark-up Language, or HTML; but what do we do about the gratuitously fancy text? Do we ignore the differentiation because it signifies nothing and is merely decorative; or do we devise special symbols and, if these are not recognised by a legislating braille authority, how far are we entitled to do this?
  • Secondly, if the text is distributed in boxes, in what order should we arrange it? This is a problem similar to that we face for interspersing tables into a text; the reader may be asked to oscillate between two or even three blocks of text; how best should we order these?
  • Thirdly, there are headings sub-dividing the text. These are far greater in variety than the range offered by a transcriber's manual. Do we simply follow the hierarchy in the manual or do we tell our reader what the print headings are, their size and the degree of inking?
  • Fourthly, there are Henry's accounts which we know how to set out in paragraph form representing a table; but our CD-ROM also shows graphs with a pointer moving along the curve; is this simply a helpful device or is it an integral part of the text?
  • Fifthly, there is a series of line drawings of various complexity; which can we represent with tactile drawings and which should we describe; and should we allow considerations of relevance to influence our judgment? If the pictures are not an integral part of the work, should we bother with them?
  • Then, of course, there are pictures of Henry. Are we satisfied with simply saying "Picture of Henry VIII by Holbein, 1535, National Portrait Gallery" or would we care to remark upon the state of his temper or his health? And would we say, for that matter, whether Anne of Cleves really does look like the Mare of Flanders or do we find her rather fetching?
  • To add to our misery, or excitement, depending on which way you look at it, there is a clip of Charles Laughton without the sound; and a clip of Keith Michell as Henry which requires an AUDETEL input.
  • Finally, of course, the text is full of links to other Web sites. Here we have the essence of the problem. With hypertext there's no frame that you can put around a document; at some point you have to decide that that is as far as you are going to go.
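The decision about where to draw the frame can be made explicit. What follows is only a minimal sketch in Python, assuming the producer has already recorded which page links to which; the function name `frame_document`, the link map and the page names are all invented for illustration and describe no real CD-ROM or Web site. It follows links outward from a starting page and stops at a chosen number of steps.

```python
from collections import deque


def frame_document(links, start, max_depth):
    """Return the pages within max_depth link-steps of start, nearest first."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # the frame: links from this page are not followed
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical link map for the Henry VIII disc; every name is invented.
links = {
    "henry": ["wives", "holbein"],
    "wives": ["anne-of-cleves"],
    "holbein": ["national-portrait-gallery"],
    "anne-of-cleves": ["flanders"],
}

one_step = frame_document(links, "henry", 1)
two_steps = frame_document(links, "henry", 2)
```

With a limit of one step the reader receives Henry's own page, his wives and Holbein; with two, Anne of Cleves and the gallery come in as well. The limit itself remains an editorial judgment, not a technical one.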

Coming at the problem from a completely different angle, the authors of the language used by the Web, HTML 4.0, are trying to build into that system functions which will allow blind people to access text; and they are trying to persuade those who use pictures to include descriptions; but you can see that there is still a huge need for an intermediary multi-media producer.
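To make the point about descriptions concrete, here is a rough sketch, using Python's standard `html.parser` module, of how a page might be flattened into linear text ready for a braille translation programme. It assumes that picture descriptions are carried in the `alt` attribute of the `img` tag, which is how HTML provides for them; the class name `Lineariser`, the bracketed wording and the sample page are my own illustrative assumptions, not part of any standard or product.

```python
from html.parser import HTMLParser


class Lineariser(HTMLParser):
    """Flatten HTML into linear text, keeping picture descriptions and links."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.open_links = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            alt = attrs.get("alt")
            if alt is None:
                # No description at all: flag the gap for a human producer
                self.parts.append("[Picture: no description supplied]")
            elif alt.strip():
                self.parts.append(f"[Picture: {alt.strip()}]")
            # An empty alt is taken to mean "purely decorative" and is skipped
        elif tag == "a" and attrs.get("href"):
            self.open_links += 1
            self.parts.append(f"[Link to {attrs['href']}:")

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links:
            self.open_links -= 1
            self.parts.append("]")

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text:
            self.parts.append(text)


def linearise(html):
    """Return the page as one line of text with bracketed annotations."""
    parser = Lineariser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical page from the Henry VIII disc
page = (
    '<p>Henry VIII <img src="henry.gif" alt="Portrait of Henry VIII by Holbein"> '
    'married <a href="wives.html">six times</a>.</p>'
    '<img src="border.gif" alt=""><img src="graph.gif">'
)
print(linearise(page))
```

Where the author has supplied a description the programme can pass it on; where there is none, all it can do is flag the gap, and that gap is exactly where the intermediary producer is still needed.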

That is where we are now. It is fondly imagined that what I have just described is a long way in the future; it isn't and the consequences for blind and visually impaired people of what is happening are enormous.

First, the good news. Nobody will be able to argue any longer that it is not worth learning braille because it's a scarce commodity. We have already seen the development of computer-produced braille with the kind of translation software you will see today; what has now happened is that there is enough electronic text available to supply the most voracious and the most specialist reader. Good electronic translation and judicious editing will give braille readers more text than they could possibly have imagined even five years ago, so to that extent we are at the beginning of a new era for braille readers as well as for those who prepare braille.

The bad news, as we all know in our daily lives, is that the amount of information dependent on images, as opposed to pure text, is ever increasing. English may be becoming the international language of news and entertainment, business and science, but a global information system, particularly one embracing such massive countries as China, Indonesia, Russia and Brazil, will require an ever-increasing pictorial content; what happens at international airports today, and what has crept in through computer icons, will be ever more prevalent in our information society.

Some people say that the dilemma I am presenting only applies to a few people; that the vast majority of people, including blind and visually impaired people, are not going to be bothered with the World Wide Web or even with computers. That, of course, is literally true but not actually very important. In spite of the pronouncements of the intelligentsia, it isn't computers that have brought about the information revolution in the second half of this century, it is television; and that is how the revolution will continue. What we think of as the telephone, the television and the computer are going to merge into one instrument: the set-top box or the 'thing in the corner'. Within a decade almost all our domestic televisions will be digital; we will be using them for all kinds of transactions: for banking, shopping, enjoying films and sport when we want them instead of being bound by schedules; we will be able to listen to our talking books and look at today's pictures of our grandson in Australia. The opportunities are incredible but you can see that there is also a potential information gap between those who can absorb visual information and those who can't; what radio gave to blind people with one hand, television took away with the other; and what ASCII gave with one hand, the graphics file is in danger of taking away with the other. What are we to do?

The first and obvious answer is not to get overwhelmed by all of this. We all have to face the fact that none of us will ever access all the information available in the world; it's being created at an enormous rate and its availability outside deposit libraries makes the growth look even greater still. We must also recognise that those who can't see pictures will still benefit enormously from the information technology revolution. What we have to do, as multi-media producers, as I would describe us, is to grasp the essence of whatever document we have before us, to comprehend the author's intention and to render it as best we can into information that a blind person can access. One of the great benefits of braille translation software is that it can free us from the simple, tedious task of transcribing code and allow us to tackle these much more challenging aspects of helping visually impaired people to understand the world we live in.

There was a time, not very long ago, when traditional transcribers with their Perkins Braillers thought that people like Pia would be doing them out of a job; what has happened is that they are being freed to do a much more difficult job. The more successful the technicians are, the more opportunity we will have to exercise our creative faculties; that is what I call progress.