Below is a paper, “Early Histories of OCR (Optical Character Recognition): Mary Jameson and Reading Optophones,” which I wrote and delivered with Jentery Sayers for the 2017 Modern Language Association (MLA) Convention in Philadelphia. The talk was part of the “Histories of Digital Labor” panel convened by the MLA Committee on Information Technology and organized by Shawna Ross. Thank you, Shawna!
Today, the term “technological innovation” is typically understood as a teleological process by which ideas become prototypes and then viable products. Such a definition privileges invention, seamlessness, and ease of use over the people and labour involved in developing, maintaining, using, and repurposing technologies over time. It also encodes biases about who innovators are and how innovation spreads. To counter these biases, we might consider innovators who have previously been ignored because of their gender, race, or ability. We might also consider how such identities and relationships are constructed and negotiated both within and through technologies over time. This prompts us to think about “sustainability” differently: that is, not as a product, technology, or standard built to last in as stable a form for as long as possible. Instead, we can more closely attend to the people, labour, and conditions that made technologies possible and also maintained them throughout history.
Consider, for example, the reading optophone (see Figure 1): an aid for the blind that converted text into sound during the twentieth century, from the 1910s until at least the 1960s. Although it never existed in a stable or fixed form—and, to our knowledge, no stable, working version survives today—one common configuration involved operators placing books and other print materials on its curved glass top.
They then used a handle to move a reading head (called a “tracer”) located below the glass, sliding it back and forth horizontally to scan pages. The tracer used an element called selenium to detect contrasts between white page and black type and convert this pattern of contrast into a stream of sound or silence. To listen, operators wore telephone receivers over their ears like headphones and interpreted these patterns of sound as characters or words. Operators could also tune the optophone with a knob, as well as physically control the pace and location of reading. Here are a few example tones, which were recorded by Patrick Nye in the 1960s, likely for Haskins Laboratories, and later acquired and digitized by Mara Mills at New York University.
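The conversion of contrast into sound described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reconstruction of any historical optophone: the five frequencies in `TONES_HZ`, the toy bitmap `GLYPH_L`, and the function name `scan_columns` are all hypothetical, and the sketch assumes that black type sounds while white page stays silent.

```python
# Hypothetical tone assignment: one frequency (Hz) per row of the scan,
# top row first. These values are illustrative assumptions only.
TONES_HZ = [784, 659, 587, 523, 392]

# A toy 5-row bitmap of the letter "L"; "#" marks black type, "." white page.
GLYPH_L = [
    "#...",
    "#...",
    "#...",
    "#...",
    "####",
]


def scan_columns(glyph, tones=TONES_HZ):
    """Return, for each column left to right, the tones sounded by black type."""
    if len(glyph) != len(tones):
        raise ValueError("expected one tone per bitmap row")
    width = len(glyph[0])
    chords = []
    for x in range(width):
        # A row contributes its tone only where the column crosses ink;
        # an all-white column yields an empty list, which stands for silence.
        chords.append([tones[y] for y, row in enumerate(glyph) if row[x] == "#"])
    return chords


if __name__ == "__main__":
    for x, chord in enumerate(scan_columns(GLYPH_L)):
        print(x, chord or "silence")
```

In this toy case, the vertical stroke of the “L” sounds all five tones at once, and the horizontal foot sounds only the lowest tone for the remaining columns. An operator would have had to learn to hear such changing combinations as letters.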
Today, the reading optophone could be considered a precursor to Optical Character Recognition (OCR) technologies, which convert vast amounts of page images into machine-readable and searchable text for digitization projects such as Google Books. Additionally, the optophone’s use of selenium to detect contrasts between light and dark areas could be understood as a precursor to various computer vision technologies, such as facial recognition algorithms in software like iPhoto. However, histories of the optophone ignore these correspondences between old and new media, typically treating the device as a playful, avant-garde experiment by Raoul Hausmann and Dada during the early 1920s. While it’s true that Hausmann experimented with the optophone for art installations, his optophone work consisted exclusively of illustrations and schematics. To our knowledge, he never built or installed an optophone; he may have patented one, though. More important, attention to Hausmann eclipses the histories of disability entwined with optophones and selenium. Meanwhile, other popular histories of optophonics are anchored in commonplace narratives of the lone male inventor, attributing the reading optophone to E. E. Fournier d’Albe, who patented it in 1919. Consequently, with the exception of research by Mara Mills, no scholarly attention has been given to the development and maintenance of reading optophones from the 1920s through the 1960s. Yet this labour point of view is crucial, both practically and conceptually, to their relevance today.
Pictured in Figure 2, Mary Jameson played a central role in optophone development, even though historical documentation only describes her as a “user” or demonstrator. Such descriptions diminish her role in media history by failing to recognize her significant contributions to the optophone project as the device changed from one version to the next. At the University of Victoria, we are currently remaking the optophone with a combination of physical computing and other fabrication techniques (see Figure 3). Remaking the optophone with Jameson in mind offers us a way to stress maintenance, development, and incremental change in contrast to masculine, “make or break” narratives of innovation and hyperbole. It also helps us better understand how Jameson “helped the machine,” without assuming we can ever inhabit her embodied position in time or empathize with how she experienced the world during the 1900s. Instead of collapsing distinctions between the present and past, the remaking process actually emphasizes them, making them more explicit. In this sense, remaking is not about perfectly recreating objects or reclaiming or fetishizing embodied experiences. Rather, it turns our gaze outwards to scrutinize how categories of identity materialize and congeal between and within our technologies, histories, and environments, and to reflect on our own embodied relationships to them.
For example, in the case of the optophone, historical sources such as Fournier d’Albe’s The Moon Element may gloss over the labour involved in learning to interpret tones and the instability of reading speed as a reliable standard. Jameson’s reading speed varied throughout her lifetime and across contexts. After her first two hundred hours of practice, Jameson was able to read at a public demonstration at a rate of one word per minute. Later in her life, however, she was said to read at speeds of up to sixty words per minute, while most users achieved speeds of twenty to thirty words per minute. In addition to teaching and advocacy work for blind people, Jameson also displayed remarkable technical knowledge in her writing and articulated the advantages and disadvantages of different versions of the optophone to a community of scientific and technology researchers and developers. In this way, her work, along with the work of other users who also circulated within these communities, may have fed forward into the design or development of optophones.
Furthermore, optophone users such as Harvey Lauer adapted the optophone for uses that other developers, such as Fournier d’Albe, may never have anticipated, such as reading packaged goods and relabelling them in braille, or checking if a pen or pencil is working. Recognizing different types of labour and contributions involved in optophonics destabilizes easy binaries between developers and users, creators and audience. Technological development requires negotiation and collaboration among people with different abilities while recognizing that categories themselves are fluid or elastic. The writing of optophone users, informed by contexts of everyday use rather than those of laboratory settings, also allows us to understand how different technologies or communication systems, such as optophonics and braille, may complement rather than strictly compete with or simply replace one another.
Looking more broadly, asserting the optophone as part of the longer history of OCR and computer vision technologies counters the assumption that assistive technologies develop separately from normative ones. We might assume that innovation trickles down from a supposedly general populace to the specialized domain of assistive technology. To echo Sara Hendren (2013) and Graham Pullin (2009), innovation is in fact more diffuse. It spreads in all directions, and design for disability is a rich source of both technical and aesthetic innovation. To ignore these avenues not only constrains possibilities for future design; it ignores the labour and people who were crucial to developing the technologies we know and use today.
We might also assume that innovation works teleologically, and that later technologies are naturally better: that they are more efficient and less error-prone, and that they neatly replace their predecessors and the labour involved without any remainders. In remaking the optophone with present-day OCR technology, we find instead that OCR cannot represent some of the behaviours and interactions that existed with past optophones.
Because OCR scans discretely, character by character, rather than continuously, our version of the optophone cannot account for the fine-grained feedback and control that a past optophone would have provided. Because OCR is purpose-built for machine-readable text, it also cannot account for the more creative or counter-intuitive uses that we mentioned earlier. Ironically, the newer technology seems inefficient and inadequate. Automation, then, does not by default make technology easier or less problematic. That is not to say that difficulty or manual labour is desirable or that we should be nostalgic for pre-digital living. But we should recognize that fetishizing efficiency and ease risks obscuring past labour and the people who contributed work. When studying media history, we can seek out difficulty and absence rather than see these instances as things that need to be resolved or eliminated. Rather than focus on grand narratives of invention or disruption, we might also attend to everyday negotiations and actors, and track historical change in terms of smaller gains or losses over time.
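The difference between discrete and continuous scanning can be made concrete with a small Python sketch. It is a toy comparison under stated assumptions and does not describe our remake or any OCR library: the bitmap `LINE` and the functions `continuous_scan` and `discrete_scan` are hypothetical names introduced only for illustration.

```python
# Toy line of print as a bitmap: "#" is black type, "." is white page.
# The five rows spell "HI" in a crude 8-column grid (illustrative only).
LINE = [
    "#..#.###",
    "#..#..#.",
    "####..#.",
    "#..#..#.",
    "#..#.###",
]


def continuous_scan(line, positions):
    """Sample the column under the tracer at each operator-chosen position.

    Positions may repeat, reverse, or skip, so the operator controls the
    pace and location of reading, even partway through a single letter.
    """
    return ["".join(row[x] for row in line) for x in positions]


def discrete_scan(text):
    """Emit one event per recognized character, as a character-based
    pipeline would; positions inside a character are not available."""
    return list(text)


if __name__ == "__main__":
    # Lingering on column 1, then backing up to column 0.
    print(continuous_scan(LINE, [0, 1, 1, 0]))
    print(discrete_scan("HI"))
```

In the continuous case, the operator can pause on the crossbar of the “H” or return to its left stroke, and the output changes with every movement. In the discrete case, the smallest unit is the whole character, so that kind of moment-to-moment feedback has nowhere to appear.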
Following our discussion of difficulty, we might then ask what an appropriate technical goal or endpoint for recreating an optophone might be, especially if we do not privilege ease of use. One impulse might be to design and recreate an optophone where tones would be as easy to distinguish and interpret as possible. But to remake an ideal optophone—one that follows commercial attitudes requiring technology to be rigorously optimized—would be to create an optophone that never existed.
Throughout the first half of the twentieth century, sighted people exaggerated the accessibility of reading optophones. In publications such as The Moon Element, Fournier d’Albe stresses the listening process over the demanding physical procedures of setting up, moving, and navigating books on an optophone. For instance, an operator would have had to plug the optophone in, find and line up the first line of the print material with the tracer, and calibrate the optophone to the size of the type, all before reading a word of the print at hand. Not only were these procedures difficult for new users to learn; operators also had to physically navigate the optophone from moment to moment throughout the reading process in a way that is difficult to capture with two-dimensional images or text alone. Remaking the optophone in three dimensions reminds us that the auditory experience of listening to its tones was entwined with the tactile experience of moving the handle to adjust the speed of reading or shifting the print material to read subsequent lines (see Figure 4).
Several aspects of the optophone’s design, such as differently sized plugs, or dials with clicking notches, suggest users would have learned to navigate the optophone without help. Furthermore, although the optophone’s frame was open and transparent, its headphones created a private acoustic sphere for listening. Combined with its physical features, this space emphasizes optophonic reading as a private, individual experience. In this sense, the added labour required of users, built into the optophone’s very design, may have accompanied small gains in independence and control. Similarly, we might also consider how the optophone, in contrast to other contemporary reading aids like braille or talking books, allowed users private access to the same materials intended for and read by sighted readers, circumventing the costs associated with creating parallel reading materials. This could be read as either a move towards integration or an attempt to conceal signs of difference or disability.
Although the optophone has few, if any, present-day users or technical applications, situating it within a broader history of machine reading or computer vision technology leads us to recognize how previously siloed areas of knowledge or innovation actually enrich each other, while simultaneously writing people such as Mary Jameson, and their contributions, back into the archive. The unrecognized labour of optophone users such as Mary Jameson can never be completely recovered or represented, but remaking can offer us glimpses of what and who has been forgotten, ignored, or lost in typical historical narratives. Making gradual development, maintenance, and labour explicit is essential to how we negotiate history now and how we develop technologies in and for the future.
Thanks to Katherine Goertz, Evan Locke, Danielle Morgan, and Victoria Murawski for collaborating with us on this research, which is supported by SSHRC, CFI, and BCKDF in Canada. Many thanks to Robert Baker (Blind Veterans UK), Mara Mills (NYU), and Matthew Rubery (Queen Mary University of London) for their support and feedback. Visit the repository of files we created to remake a reading optophone.