Category Archives: dictionary

a dictionary in transition

James Gleick had a fascinating piece in the Times Sunday magazine on how the Oxford English Dictionary is reinventing itself in the digital age. The O.E.D. has always had to keep up with a rapidly evolving English language. It took over 60 years and two major supplements to arrive at a second edition in 1989, around the same time Tim Berners-Lee and others at the CERN particle physics lab in Switzerland were coming up with the World Wide Web. Ever since then, the O.E.D. has been hard at work on a third edition, but under radically different conditions. Now not only the language but the forms in which the language is transmitted are in an extreme state of flux:

In its early days, the O.E.D. found words almost exclusively in books; it was a record of the formal written language. No longer. The language upon which the lexicographers eavesdrop is larger, wilder and more amorphous; it is a great, swirling, expanding cloud of messaging and speech: newspapers, magazines, pamphlets; menus and business memos; Internet news groups and chat-room conversations; and television and radio broadcasts.

Crucial to this massive language research program is a vast alphabet soup known as the Oxford English Corpus, a growing database of more than a billion words, culled mostly from the web, which O.E.D. lexicographers analyze through various programs that compare and contrast contemporary word usages in contexts ranging from novels and academic papers to teen chat rooms and fan sites. Together this data comprises what the O.E.D. calls “the fullest, most accurate picture of the language today” (I’m curious to know how broadly they survey the world’s general adoption of English. I’m under the impression that it’s still largely an Anglo-American affair).
Marshall McLuhan famously summarized the shift from oral tradition to the written word as “an eye for an ear”: a general migration of thought and expression away from the folkloric soundscapes of tribal society toward encounters by individuals with visual symbols on a page, a movement that climaxed in the age of print, and which McLuhan saw at last reversed in the global village of electronic mass media. The curious thing that McLuhan did not live long enough to witness was the fusion of eye-ear cultures in the fast-moving textual traditions of cell phones and the Internet. Written language has acquired an immediacy and a malleability almost matching oral speech, and the effect is a disorienting blurring of boundaries where writing is almost the same as speaking, reading more like overhearing.
So what is a dictionary to do? Or be? Such fundamental change in the process of maintaining “the definitive record of the English language” must have an effect on the product. Might the third “edition” be its final, never-ending one? Gleick again:

No one can say for sure whether O.E.D.3 will ever be published in paper and ink. By the point of decision, not before 20 years or so, it will have doubled in size yet again. In the meantime, it is materializing before the world’s eyes, bit by bit, online. It is a thoroughgoing revision of the entire text. Whereas the second edition just added new words and new usages to the original entries, the current project is researching and revising from scratch — preserving the history but aiming at a more coherent whole.

They’ve even experimented with bringing readers into the process, working with the BBC earlier this year to solicit public aid in locating first usages for a list of particularly hard-to-trace words. One wonders how far they’d go in this direction. It’s one thing to let people contribute at the edges — the 50 words in that list are all from the 20th century — but to open the full source code is quite another. It seems the dictionary’s challenge is to remain a sturdy ark for the English language during this period of flood, and to proceed under the assumption that we may have seen the last of the land.
(image by Kenneth Moyle)

the next dictionary

I found this Hartford Courant article on Slashdot.
Martin Benjamin heads up an eleven-year-old project to create an online Swahili dictionary called the Kamusi Project. Although Swahili has 80 million speakers, the current Swahili dictionary is over 30 years old. What sets this project apart from other online dictionaries is that its entries are created not only by academics but also by volunteers, ranging from former Peace Corps workers to African linguistic hobbyists. The site also includes a discussion board for the community of users and developers.
It is also important to mention that, like Wikipedia, this collaborative project is supported by donations and volunteers. Unlike Wikipedia, it lacks a broad audience and wide publicity, which makes funding a continual issue.