Monthly Archives: February 2009

why is text on screens so ugly?

There has been a raft of reviews of the new Kindle and the various iPhone reading applications lately. In general, reviewers are more positive about the experience of reading from a screen than they have been in the past. However, I’ve noticed that one enormous factor in reading tends to get passed by; maybe it’s not something that people notice if they don’t think about book design. See if you can identify it from these screenshots, which you can click to enlarge:
The new Kindle 2:

[image: kindle2SMALL.jpg]

The latest Sony Reader:

[image: sonyreaderSMALL.jpg]

Stanza, a popular iPhone ebook app:

[image: iphonestanzaBIG.jpg]

eReader, another popular iPhone ebook app:

[image: iphoneereaderBIG.png]

All of these screen-reading environments fully justify their paragraphs of text: there’s not a ragged right margin. This is what we tend to expect books to look like: typically, a book page has an even rectangle of text on it, a tradition that extends back to Gutenberg’s 42-line Bible:

[image: gutenbergbibleSMALL.jpg]

One might notice here, however, that Gutenberg’s page has something that the screen-reading environments do not: hyphenation. When Gutenberg’s words didn’t fit in a line (see, for example, the third line down in the right column) he broke them with a hyphen, starting a tradition in book design that has made its way to the present moment. The reason for hyphenation is apparent if you look at the shots of the screen-reading devices: if words aren’t split, often the spacing between words must be increased, making it harder for the eye to follow. This is more apparent when the width of the text column (called the measure) is narrow, as is the case on iPhone apps: notice how spaced-out the penultimate line, “necessary to effectiveness in an”, is in the eReader screenshot. The Kindle and the Sony Reader look a little bit better because there aren’t such glaring white spaces in the text, although weirdly both appear to have lines in the middle of paragraphs that aren’t fully justified.
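To put rough numbers on that, here’s a toy sketch of my own (not the layout logic of any of these devices) that treats every character as one unit wide and asks how wide each word space has to become when a line is justified. The 36-character measure, the stretched_gap helper, and the “or-” syllable are all invented for illustration.

```python
# Toy model of a justified line: every character is one unit wide, and
# whatever the words don't cover has to be absorbed by the word spaces.

def stretched_gap(line_words, measure, char_width=1.0):
    """Width of each word space when line_words are justified to the measure."""
    text_width = sum(len(w) for w in line_words) * char_width
    gaps = len(line_words) - 1
    return (measure - text_width) / gaps

line = "necessary to effectiveness in an".split()
print(stretched_gap(line, 36))            # 2.0 -- every gap is twice its normal width
print(stretched_gap(line + ["or-"], 36))  # 1.0 -- pulling up one syllable restores normal spacing
```

The narrower the measure, the fewer gaps there are to absorb the slack, which is why the iPhone apps suffer the most.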
Why don’t these reading devices hyphenate their lines if they fully justify them? This isn’t, for what it’s worth, a problem confined to these devices; plenty of text on the web is fully justified and has no hyphenation. The problem is that hyphenation is trickier than it might initially appear. To hyphenate a paragraph properly, the hyphenator needs to understand at least something about how the language the text is written in works. Here’s how Robert Bringhurst outlines what he calls the “etiquette of hyphenation and pagination” as rules for compositors in his authoritative Elements of Typographic Style:

2.4.1. At hyphenated line-ends, leave at least two characters behind and take at least three forward.
2.4.2. Avoid leaving the stub-end of a hyphenated word, or any word shorter than four letters, as the last line of a paragraph.
2.4.3. Avoid more than three consecutive hyphenated lines.
2.4.4. Hyphenate proper names only as a last resort unless they occur with the frequency of common nouns.
2.4.5. Hyphenate according to the conventions of the language.
2.4.6. Link short numerical and mathematical expressions with hard spaces.
2.4.7. Avoid beginning more than two consecutive lines with the same word.
2.4.8. Never begin a page with the last line of a multi-line paragraph.
2.4.9. Balance facing pages by moving single lines.
2.4.10. Avoid hyphenated breaks where the text is interrupted.
2.4.11. Abandon any and all rules of hyphenation and pagination that fail to serve the needs of the text.

Rule 2.4.5 might be worth quoting in full:

In English we hyphenate cab-ri-o-let but in French ca-brio-let. The old German rule which hyphenated Glockenspiel as Glok-kenspiel was changed by law in 1998, but when össze is broken in Hungarian, it still turns into ösz-sze. In Spanish the double consonants ll and rr are never divided. (The only permissible hyphenation in the phrase arroz con pollo is thus arroz con po-llo.) The conventions of each language are part of its typographic heritage and should normally be followed, even when setting single foreign words or brief quotations.

Can a computer hyphenate texts? Sure: if these rules can be made comprehensible to a computer, it can sensibly hyphenate a text. Donald Knuth’s TeX typesetting program, for example, contains hyphenation dictionaries: lists of words in which the various points at which they can be hyphenated are marked. Hyphenation points are arranged by “badness”: it’s worse to use hy-phenation than hyphen-ation, for example, but it would be even worse not to break the word and leave a gap of white space in the line. The TeX engine tries to find the least bad way to set a line; it usually does a reasonable job. Not all hyphenation is equal, however: Adobe InDesign, for example, will do a much better job of hyphenating a paragraph than Microsoft Word will.
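A drastically simplified sketch of the idea might look like the following. The word list, the penalty numbers, and the best_break helper are all invented for illustration; real TeX works from hyphenation patterns and weighs the breaks of a whole paragraph at once, rather than settling one line at a time.

```python
# Each word carries its legal break points, each tagged with a penalty
# (lower is better); the setter picks the least-bad option that still fits.

# Hypothetical toy "dictionary": break position -> penalty.
BREAKS = {
    "hyphenation": {6: 10,   # hyphen-ation (preferred)
                    2: 50},  # hy-phenation (legal but worse)
}
NO_BREAK_PENALTY = 200       # leaving a gap of white space is worst of all

def best_break(word, room):
    """Return (prefix, penalty) for the least-bad way to fit the word
    into `room` remaining characters (the hyphen itself needs one)."""
    candidates = [(word[:pos] + "-", pen)
                  for pos, pen in BREAKS.get(word, {}).items()
                  if pos + 1 <= room]
    candidates.append(("", NO_BREAK_PENALTY))   # push the whole word down
    return min(candidates, key=lambda c: c[1])

print(best_break("hyphenation", 8))   # ('hyphen-', 10)
print(best_break("hyphenation", 4))   # ('hy-', 50)
print(best_break("hyphenation", 2))   # ('', 200) -- no break fits
```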
And, as rule 2.4.5 suggests, if a computer is going to hyphenate something, it needs to know what language the text is in. This is a job for metadata: electronic books could carry an indicator of the language they’re written in, and the reader application could hyphenate accordingly. But that won’t always help: in the text on the Kindle screen, for example, der Depperte isn’t English, and an English hyphenation dictionary wouldn’t know what to do with it. A human compositor could catch that; a computer couldn’t guess, and would have to default to not breaking it. The same problem arises with proper names.
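The metadata idea is easy to sketch, again with made-up word lists standing in for real hyphenation resources: pick a dictionary from the book’s declared language, and when a word (or the language itself) isn’t recognized, fall back to leaving the word unbroken.

```python
# Hypothetical per-language dictionaries: word -> legal break offsets.
DICTS = {
    "en": {"hyphenation": [2, 6], "necessary": [3, 6]},
    "de": {"Glockenspiel": [3, 7]},
}

def break_points(word, lang):
    """Legal break offsets for the word, or [] if the word (or the
    language) is unknown and we must fall back to not breaking it."""
    return DICTS.get(lang, {}).get(word, [])

print(break_points("hyphenation", "en"))  # [2, 6]
print(break_points("Depperte", "en"))     # [] -- unknown word: leave it whole
print(break_points("Depperte", "fr"))     # [] -- unknown language: same fallback
```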
There aren’t really easy solutions to this problem. A smarter ebook reading device (and smarter ebooks) might hyphenate automatically; if so, the reader would need to rehyphenate whenever the user changed the font or the font size. (There are some possibilities in HTML, sketched at the end of this post, but they require a lot of work on the part of the author or designer; some day this might work better.) It’s not a problem with PDFs, of course, but PDFs don’t allow text to reflow. There’s no shame in using a ragged right margin; at least then one isn’t subject to Bringhurst’s opprobrium, in The Elements of Typographic Style, toward the poorly justified:

A typewriter (or a computer-driven printer of similar quality) that justifies its lines in imitation of typesetting is a presumptuous machine, mimicking the outer form instead of the inner truth of typography.
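As for the HTML possibilities mentioned above: the most obvious one is the soft hyphen, &shy; (Unicode U+00AD), which stays invisible unless the browser actually breaks the word at that point. The catch is exactly the work I alluded to: someone, or some script, has to decide where the break points go and put the entities in. Here’s a hand-rolled sketch, with a made-up insert_soft_hyphens helper and break points simply supplied rather than computed:

```python
SOFT_HYPHEN = "&shy;"

def insert_soft_hyphens(word, positions):
    """Insert the &shy; entity into `word` at each offset in `positions`."""
    out, prev = [], 0
    for pos in sorted(positions):
        out.append(word[prev:pos])   # text up to the break point
        out.append(SOFT_HYPHEN)      # invisible break opportunity
        prev = pos
    out.append(word[prev:])
    return "".join(out)

print(insert_soft_hyphens("effectiveness", [5, 9]))
# effec&shy;tive&shy;ness -- the browser may now break the word when justifying
```

Browser support for the soft hyphen has historically been uneven, which is part of why this still counts as a lot of work rather than a solution.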

briefly noted: iphones & o’reilly

  • Ars Technica has a review of an interesting-sounding iPhone application called Papers, designed to make it easy to carry around a library of scientific papers on your iPhone. It works with a desktop app also called Papers; it also interfaces with various scientific search engines so you can download more papers on the go. It’s not free, and it’s not for everyone, but it’s nice to see software that seems to understand that different kinds of reading need to be done differently.
  • Thematically related: Adam Hodgkin argues that dedicated e-book devices generally lack an awareness of the place of the network in the task of reading, an awareness that comes more naturally to devices like the iPhone.
  • Jason Epstein’s keynote from O’Reilly’s Tools of Change conference is now online. There’s not much in here that’s particularly surprising to anyone who’s been paying attention to the field for the past few years – the Espresso Book Machine is still his hope for the future of publishing.
  • And James Long, over at the digitalist, has a wrap-up of Tools of Change.
  • Michael Cairns points out that the trouble with e-books is that publishers still think of them only as an electronic version of the print book.
  • Ted Nelson, whom we mention here from time to time, has a new, self-published book out, entitled Geeks Bearing Gifts, his own deeply idiosyncratic take on the history of computers and how we use them, starting from the invention of the alphabet and explaining exactly where things went wrong along the way. Ted Nelson is, of course, the inventor of hypertext, among other things; I hope to have an interview with him up here soon.
  • And there’s a new issue of Triple Canopy out; not all the content is up yet, but Ed Halter’s piece on Jeff Krulik and public-access TV – something of a YouTube before YouTube – and Bidisha Banerjee & George Collins’s memoir/video game combo are worth inspection.

announcement: itin film on sunday

Alex Itin, the Institute’s artist-in-residence, currently has a show up in Frost Space in Williamsburg, Brooklyn. If you’re around this Sunday afternoon, he’s screening his films and giving an artist’s talk. I’m not sure exactly what he’ll be up to – Alex is nothing if not unpredictable – but it will certainly be interesting and entertaining.

[images: ITINFilm.jpg, itinfilm2.jpg, itinfilm5.jpg]

wikipedia before wikipedia

I’ve been reading Tom McArthur’s Worlds of Reference: lexicography, learning and language from the clay tablet to the computer, a history of dictionaries, encyclopedias and reference materials published in 1986. The last section, titled “Tomorrow’s World,” is interesting in hindsight: having looked at the major shifts that have occurred in how cultures have used lexicography, McArthur is aware that things change in unimaginable ways. He shies away from making detailed predictions about how the computer will change the dictionary or the encyclopedia, but he does find an interesting model for how the collaborative creation of knowledge might work in the future. Because I haven’t seen this mentioned online, I’ll quote it at length:

. . . I am considering something much more radically interesting: turning students on occasion into once-in-a-lifetime Sam Johnsons and Noah Websters.

At least one remarkable precedent exists for this idea: a project undertaken between 1978 and 1982 on the lower East Side of New York City which produced a “Trictionary” without a single really-truly lexicographer being involved.

The Trictionary is a 400-page trilingual English/Spanish/Chinese wordbook, covering a base vocabulary of some 3,000 items per language. Much of the basic cost was covered by a grant from the National Endowment for the Humanities, and the bulk of the work was done on the premises of the Chatham Square branch of the New York Public Library, on East Broadway. The librarian there, Virginia Swift, was glad to provide the accommodation, while the original idea was developed by Jane Shapiro, a teacher of English as a Second Language at Junior High School 65, helped through all the stages of the work by Mary Scherbatoskoy of ARTS (Art Resources for Teachers and Students). They organized the work, but they were not the compilers as such.

The compilation was done, as The New Yorker reports (10 May 1982) “by the spare-time energy of some 150 young people from the neighborhood”, aged between 10 and 15, two afternoons a week over three years. New York is the multilingual city par excellence, in which, as the report points out, “some of its citizens live in a kind of linguistic isolation, islanded in their languages”. The Trictionary was an effort to do something about that kind of isolation and separateness. One method used in the project was getting together a group of youngsters variously skilled in English, Spanish and Chinese and “brainstorming” over, say, the word ANIMALS written on an otherwise empty blackboard. They would think of animals and considered how they were labelled in each language, putting their triples on the board and arguing about the legitimacy of particular terms. Another method was the review session, a more sophisticated activity where a stack of blue cards with English words on them was used to create equivalent stacks in pink for Spanish and yellow for Chinese. It was out of this kind of interactive effort that the Trictionary developed, until in its final form it had a blue section with English first, a second section that was yellow and Chinese, and a third section that was pink and Spanish. Each part had three columns per page, with each language appropriately presented. In all three sections, the material was punctuated here and there by line drawings done by the youngsters themselves.

Jane Shapiro told The New Yorker that when she first went to work in the area she had had no idea what the language situation was like. The neighbourhood is about 80% Chinese and 20% Spanish-speaking, and in class she had often found herself in the position of comparing all three languages. Out of that “small United Nations” came the idea for the book, because she had often wished for such a book, but of course no right-minded publisher had ever thought of that particular combination as commercially viable or academically interesting. Additionally, and damningly, Shapiro felt that what dictionaries were available “were either too stiff or out of date or written on a linguistic level far different from that of the students”. In other words, because formal lexicography had nothing to offer, grass-roots lexicography had to serve instead.

As with Vaugelas, neighbourhood usage was the authority, and as work progressed the women and their charges actually kept away from published dictionaries so as to be sure that the words came from the youngsters themselves. Reality also prevailed, in that there is a fair quantity of legal and medical terms in the book. These were included because the children often served as interpreters between their elders and lawyers and doctors. Motivation was high, despite a shifting population of helpers, evidently because the children could see the practical utility of what they were doing. One youngster engaged in the work was Iris Chu, born in Venezuela of Chinese parents and brought to New York about five years earlier. She told The New Yorker that she made a lot of friends while working on the Trictionary (the opposite to what often happens to lexicographers), adding: “It’s funny to see it as a book now – before, it was just something we did every week. I’m really sorry it’s over. For us, it was a whole lot of fun.”

It was also a prototype for a whole new kind of educational lexicography (with or without the additional advantage, where available, of electronic and other aids). The ancient Sumerians and the medieval Scholastics would have understood the general idea of the Trictionary very well, and Comenius would certainly have approved of it. I approve of it whole-heartedly because it simply broke the mould of conventional thinking. Additionally, we can see in the brain-storming sessions and the use of the cards the two modes of lexicography creatively at work side by side: themes and word relationships on one side, alphabetic order on the other. The women of the lower East Side certainly discovered a formula for getting the taxonomic urge working in ways that are just as spectacular as any instrument that beeps, blinks and hums.

(pp. 181–183.) There doesn’t seem to be much trace of the Trictionary online; a cursory search finds the New Yorker article that McArthur refers to (behind their paywall). The New York Center for Urban Folk Culture seems to be selling copies for $5; their page for the project gives it a minimal description and has images of some of the pages.

using the back and forth of a wikipedia article to get closer to the truth

When Jaron Lanier disparaged the Wikipedia in his 2006 essay on “the hazards of the new online collectivism,” I wrote an impassioned defense, including our oft-mentioned point that the most interesting thing about Wikipedia articles, especially controversial ones, is not necessarily what’s on the surface but the back and forth underneath:

Jaron misunderstands the Wikipedia. In a traditional encyclopedia, experts write articles that are permanently encased in authoritative editions. The writing and editing goes on behind the scenes, effectively hiding the process that produces the published article. The standalone nature of print encyclopedias also means that any discussion about articles is essentially private and hidden from collective view. The Wikipedia is a quite different sort of publication, which frankly needs to be read in a new way. Jaron focuses on the “finished piece”, ie. the latest version of a Wikipedia article. In fact what is most illuminative is the back-and-forth that occurs between a topic’s many author/editors. I think there is a lot to be learned by studying the points of dissent; indeed the “truth” is likely to be found in the interstices, where different points of view collide. Network-authored works need to be read in a new way that allows one to focus on the process as well as the end product.

Recently a group of researchers at the Palo Alto Research Center (formerly Xerox PARC) announced that they have created a prototype of a tool called “wikidashboard,” which they hope will reveal the back and forth beneath wiki articles in a way that helps readers get closer to the truth of a matter. In their own words:

“Because the information [the back and forth history of a Wikipedia article] is out there for anyone to examine and to question, incorrect information can be fixed and two disputed points of view can be listed side-by-side. In fact, this is precisely the academic process for ascertaining the truth. Scholars publish papers so that theories can be put forth and debated, facts can be examined, and ideas challenged. Without publication and without social transparency of attribution of ideas and facts to individual researchers, there would be no scientific progress.”

judging a book by its contents

There’s a post at the Harper Studio blog about Stephen King’s recent denigration of Stephenie Meyer’s talents as a writer. Meyer is, of course, the author of the Twilight books, a chaste vampire saga. The post asks:

Can a book be deemed “good” or “bad” based solely on the quality of its writing?

I haven’t read the Twilight books so I can’t weigh in on King’s assessment. But it seems to me that Stephenie Meyer has activated something profound in people – mostly teenage girls – and the ability to do that may be as rare as the literary gifts of a writer like… Stephen King. Put another way: In terms of literary merit, Twilight may not be “good,” but that doesn’t mean it’s not great.

I have not read these books, though people whose taste in writing I trust more than Stephen King’s have assured me that the writing is abysmal. I have been repeatedly entertained by having what goes on in these books described to me; I have also seen the movie based upon the first of them, which I found quite thoroughly astonishing. From my perspective, it seems clear that these books are a Jesse Helms-level assault on American morality. It’s tempting to pull out Theodor Adorno, bête noire of the blogosphere: should you need a fix, his miniature essay “Morality and Style”, from Minima Moralia, will do the trick nicely.
But I’m interested not so much in Twilight’s merit as in the attitude toward books that’s on display in this post. Books can be many things, but by this argument they stand mostly as commodities: Twilight is culturally valuable not because of anything it might be saying – or the manner in which it’s said – but because it’s reached a lot of people. By this reductionist perspective, Twilight might as well be a movie or a videogame as a book. And I think it’s this sort of thinking that is causing the downfall of publishing: for big publishers, a great book is simply one that sells a lot of copies. This is an attitude that makes sense to the people in charge of the numbers at a big publishing house, but I’m not sure it plays so well with consumers. I can’t imagine that anyone – besides its employees – would be particularly upset if Hachette (the publisher of Twilight) went under. If a book is just a vehicle for getting content to the consumer as quickly as possible, another vehicle can easily be found.

on john updike

If:book certainly isn’t an obvious venue for a John Updike remembrance. In 2006, his “The End of Authorship” vehemently misconstrued the ideals of digital publishing. At remix culture, he bristled; at collaborative reading, he balked; at the notion of books on screens, he cringed, seeking the refuge of his conventional library and its dusty tomes. In a single, hair-pullingly obtuse sentence, Updike pegged his era’s headstrong mentality: “Books traditionally have edges.”
At the time, if:book responded, to much less fanfare, with a scorched-earth rebuke in which Updike’s entire oeuvre was reduced to “juvenilia,” his brain purportedly “addled” by decades upon decades of “hero worship.”
And yet.
Updike, who died last week at 76, came to me on the recommendation of my high school English teacher, shortly after I’d realized that reading was not an altogether painful pastime. I fawned over the glittery prose of his early fiction and promptly tackled the Rabbit tetralogy; soon enough, I was writing the requisite rip-off stories and mimicking his vow “to give the mundane its beautiful due.” By now an unseemly number of Updike imitators have weaseled their way into print, but without his delicate touch, the mundane only yields the saccharine.
No detail was too minute for Updike — he was at his best when he pursued the microcosmic, finding analogs for the Big Questions in the small ones. Atomically, his sentences were as expansive and accommodating as any I’ve read. At Slate.com, Sven Birkerts eloquently elected him “the sentence guru; he showed me just what lyric accuracy a string of words could accomplish.”
Lyric accuracy, indeed. Updike’s brand of prose, however stylized, rarely sullied the acuity of his observations. The people and places that he conjured felt alive to me in a way that few had, prior to then. Those worlds were immediate. Puzzlingly, their immediacy materialized from the calm, considered measure of the prose. The perspective of an Updike piece is always enveloping. I’ll outsource my thoughts again to Birkerts: “Harry Angstrom working the remains of a caramel from his molar is a straight shunt to the living human now.”
Nowadays, I find those tribulations of suburbia moderately less gripping, but some of the passages I marveled over have retained their luster. Though it may be damning it with faint praise, I regard Updike’s work as a kind of gateway drug. Certainly it whetted my appetite for capital-L Literature, for words that faced the thrum of contemporary life without further obfuscating it. What he generated in his finest work is the crackle of a full-fledged consciousness, a voice: the sentences have a cadence, the cadence has a tone, and the tone, somehow, becomes human.
His death is, in a sense, another nail in the coffin of a kind of literary vanguard. I can understand why this blog’s readership might relish, openly or in private, the extinction of these writers, particularly given the old school’s knee-jerk aversion to new methodologies and shifting boundaries. By 2006, as the sensationally-titled “The End of Authorship” attests, it seemed that Updike opposed progress in the humanities more than he furthered it. The voguish sentiment, for better or worse, was disdain for his belletristic ways.
Still, I’m saddened by his passing. Updike and his ilk presided over fiction when more Americans read it, debated it, engaged with it. He took his writing seriously, yes, without proffering it as panacean. By the time I picked him up, to be sure, his heyday had come and gone. Legend has it, though, that the phrase “man of letters” was in those days uttered without an ironic smirk, and that one could reasonably endeavor to devote his or her life to words without appearing highfalutin or deranged. Writers could even expect to see their work in mass-market paperback editions. Imagine that.
I hope that, through the very transformation that Updike disparaged, literature in any and all forms will see an era of renewed relevance, and soon. Even he, after all, regarded books as “an encounter, in silence, of two minds.” There’s plenty to cavil about, sure — the degree of silence, the number of minds, the mode of the encounter — but at an elemental level, his assessment rings true to me. To millions of readers, Updike demonstrated the solicitous vitality of that truth, of those encounters. He will be missed.

a defense of the webcomics business model

Syndicated comics artists, who are seeing their livelihoods disappear as the newspapers their work appears in shrink from sight, are starting to look with more interest at the world of webcomics. Unfortunately, they seem to misunderstand what they see, or are just too quick to disapprove. Jeph Jacques, one of the early webcomics pioneers, posted a wonderful description and defense of the webcomics business model. Read it here.