describing humanity in data sets

Yahoo’s recently released commemorative microsite, “Yahoo Netrospective: 10 years, 100 moments,” is a selection of one hundred significant moments in the history of the web (1995-2005). The format for the site was inspired by the work of information architect Jonathan Harris. Harris created 10 x 10, a piece visually identical to, but considerably more interesting than, the Yahoo birthday card (whose content leans quite heavily toward self-promotion: there are 20 mentions of Yahoo products and no mention at all of Google). By contrast, Harris’ 10 x 10 builds its fascinating content from RSS feeds. The piece selects the most frequently used words from the major news networks to assemble an hourly “portrait” of our world. “What interests me is trying to find descriptions of humanity in very large data sets, creating programs that tell us something about ourselves,” Harris told Wired News. “We set them free and they come back and tell us what we are like.”
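For the curious, the counting idea behind a piece like 10 x 10 can be sketched in a few lines of Python. This is only a rough illustration, not Harris’ actual code: the headlines are invented, the stopword list is a hypothetical stand-in, and a real version would read live RSS feeds rather than a hard-coded list.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would need a much longer one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is", "at", "as", "after"}

def top_words(headlines, n=100):
    """Return the n most frequent non-stopwords across all headlines."""
    words = []
    for headline in headlines:
        # Lowercase the text and pull out runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", headline.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return Counter(words).most_common(n)

# Invented headlines standing in for one hour of news feed items.
sample_headlines = [
    "Storm closes airports in the north",
    "Airports reopen after storm",
    "Election results delayed by storm",
]

print(top_words(sample_headlines, n=2))
```

Run once an hour against fresh headlines, a list like this would be the raw material for each hourly “portrait.”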
What makes Harris’ work interesting is the self-discipline he exercises in designing these objective systems. By resisting the urge to edit (except, perhaps, when Yahoo is involved), he allows an authentic “picture” of current events, of human behavior online, and of the fluid exchange of words and images to emerge. His linguistic self-portrait, WordCount, harvests data from the British National Corpus. WordCount displays the 86,800 most commonly used words in the English language in order of their commonness. Harris contends that “observing closely ranked words tells us a great deal about our culture. For instance, ‘God’ is one word from ‘began’, two words from ‘start’, and six words from ‘war’.” I tried WordCount and was instantly addicted. To read WordCount or 10 x 10, you have to interact with it and bring meaning to it. Or, put another way, you have to be willing to bring meaning to it. This is quite different from the way we experience traditional narratives, whose structure and meaning are crafted by the writer and handed down to the reader. I am eagerly anticipating his next project, which, he told Wired, “involves looking at human feelings on a large scale from the web.”
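The “one word from” idea in WordCount is simply a distance between positions in a frequency-ranked list. Here is a minimal sketch of that measurement, assuming the ranking is already available as a list ordered from most to least common. The function name and the toy ranking are made up for illustration and are not drawn from the British National Corpus.

```python
def rank_distance(ranked_words, first, second):
    """Return how many positions apart two words sit in a ranked list."""
    # Map each word to its position (0 = most common).
    positions = {word: i for i, word in enumerate(ranked_words)}
    return abs(positions[first] - positions[second])

# Made-up ordering for illustration only; not real corpus data.
toy_ranking = ["alpha", "beta", "gamma", "delta"]

print(rank_distance(toy_ranking, "alpha", "delta"))
```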