Category Archives: wikipedia

nicholson baker on the charms of wikipedia

I finally got around to reading Nicholson Baker’s essay in the New York Review of Books, “The Charms of Wikipedia,” and it’s… charming. Baker has a flair for idiosyncratic detail, which makes him a particularly perceptive and entertaining guide through the social and procedural byways of the Wikipedia molehill. Of particular interest are his delvings into the early Wikipedia’s reliance on public domain reference works, most notably the famous 1911 Encyclopedia Britannica: “The fragments from original sources persist like those stony bits of classical buildings incorporated in a medieval wall.”
Baker also has some smart things to say on the subject of vandalism:

Wikipedians see vandalism as a problem, and it certainly can be, but a Diogenes-minded observer would submit that Wikipedia would never have been the prodigious success it has been without its demons.
This is a reference book that can suddenly go nasty on you. Who knows whether, when you look up Harvard’s one-time warrior-president, James Bryant Conant, you’re going to get a bland, evenhanded article about him, or whether the whole page will read (as it did for seventeen minutes on April 26, 2006): “HES A BIG STUPID HEAD.” James Conant was, after all, in some important ways, a big stupid head. He was studiously anti-Semitic, a strong believer in wonder-weapons – a man who was quite as happy figuring out new ways to kill people as he was administering a great university. Without the kooks and the insulters and the spray-can taggers, Wikipedia would just be the most useful encyclopedia ever made. Instead it’s a fast-paced game of paintball.
Not only does Wikipedia need its vandals – up to a point – the vandals need an orderly Wikipedia, too. Without order, their culture-jamming lacks a context. If Wikipedia were rendered entirely chaotic and obscene, there would be no joy in, for example, replacing some of the article on Archimedes with this:
Archimedes is dead.
He died.
Other people will also die.
All hail chickens.
The Power Rangers say “Hi”
The End.
Even the interesting article on culture jamming has been hit a few times: “Culture jamming,” it said in May 2007, “is the act of jamming tons of cultures into 1 extremely hot room.”

the future of the sustainable book

On New Year’s Eve, I got lost in Yonkers trying to take my son’s gently used toys to the Salvation Army. The Yonkers store was the only one I could find willing to take them. The guy on the phone hesitated. “Are they in good condition?” he asked, clearly unhappy about my impending donation. I assured him they were, and he sighed and told me to come on over.
On principle, I try (really hard) to give away anything that is not completely worn out. But it is getting harder and harder to do. Nobody wants my old furniture or clothes or books. And they especially don’t want used children’s toys. My attempt to give them away was ill-fated. A police barricade stopped me at Nepperhan Avenue (a construction site disaster). Then I drove around for forty minutes until I found an alternate route but was thwarted at Ashburton Avenue (building on fire, streets blocked). I gave up and went home. With a stomach full of guilt, I put the plastic toys in the dumpster. My son didn’t mind because he had a brand new pile of toys in his playroom, Christmas gifts from relatives and friends who couldn’t be dissuaded.
Point is, it seems increasingly difficult to opt out of the cycle of waste-creation. Plastic kids’ toys are just one example. I’m also guilty of consuming and transforming lots of other things into waste: clothes, computers, cell phones, magazines, all sorts of complicatedly packaged food and beverage items, etc. So yesterday, when I contemplated how best to spend 2008, I decided to focus on figuring out how to create a more sustainable lifestyle. And since I work in book publishing, job one is to figure out what it means to create a sustainable book. Lots of models come to mind: good ones like Wikipedia (device-neutral and always in the latest, free edition) and bad ones like the Kindle (which tries to create a market for an ebook reader with designed obsolescence).
Anyway, I thought it might be useful to weave the sustainability discussion into if:book’s ongoing consideration of networked ebooks, because at this stage in their development, networked books could be shaped with sustainability in mind. So, I’m hoping to stir up some interesting discussion and serious contemplation of the perfectly sustainable book: one that is constantly revised, but never needs to be reprinted (or repurchased); one that is lean and simple and doesn’t require a small server farm or a special device; one that makes an enormous impact, but leaves a teeny tiny carbon footprint; one we can live with for ever and ever without getting bored or satiated.

a few rough notes on knols

Think you’ve got an authoritative take on a subject? Write up an article, or “knol,” and see how the Web judgeth. If it’s any good, you might even make a buck.
Google’s new encyclopedia will go head to head with Wikipedia in the search rankings, though in format it more resembles other ad-supported, single-author info sources like About.com or Squidoo. The knol-verse (how the hell do we speak of these things as a whole?) will be a Darwinian writers’ market where the fittest knols rise to the top. Anyone can write one. Google will host it for free. Multiple knols can compete on a single topic. Readers can respond to and evaluate knols through simple community rating tools. Content belongs solely to the author, who can license it in any way he/she chooses (all rights reserved, Creative Commons, etc.). Authors have the option of having contextual ads run to the side, revenues from which are shared with Google. There is no vetting or editorial input from Google whatsoever.
Except… Might not the ads exert their own subtle editorial influence? In this entrepreneurial writers’ fray, will authors craft their knols for AdSense optimization? Will they become, consciously or not, shills for the companies that place the ads (I’m thinking especially of high impact topic areas like health and medicine)? Whatever you may think of Wikipedia, it has a certain integrity in being ad-free. The mission is clear and direct: to build a comprehensive free encyclopedia for the Web. The range of content has no correlation to marketability or revenue potential. It’s simply a big compendium of stuff, the only mention of money being a frank electronic tip jar at the top of each page. The Googlepedia, in contrast, is fundamentally an advertising platform. What will such an encyclopedia look like?
In the official knol announcement, Udi Manber, a VP for engineering at Google, explains the genesis of the project: “The challenge posed to us by Larry, Sergey and Eric was to find a way to help people share their knowledge. This is our main goal.” You can see embedded in this statement all the trademarks of Google’s rhetoric: a certain false humility, the pose of incorruptible geek integrity and, above all, a boundless confidence that every problem, no matter how gray and human, has a technological fix. I’m not saying it’s wrong to build a business, nor that Google is lying whenever it talks about anything idealistic; it’s just that time and again Google displays an astonishing lack of self-awareness in the way it frames its services – a lack that becomes especially obvious whenever the company edges into content creation and hosting. They tend to talk as though they’re building the Library of Alexandria or the great Encyclopédie, but really they’re describing an advanced advertising network of Google-exclusive content. We shouldn’t allow these very different things to become as muddled in our heads as they are in theirs. You get a worrisome sense that, like the Bushies, the cheerful software engineers who promote Google’s products on the company’s various blogs truly believe the things they’re saying. That if we can just get the algorithm right, the world can bask in the light of universal knowledge.
The blogosphere has been alive with commentary about the knol situation throughout the weekend. By far the most provocative thing I’ve read so far is by Anil Dash, VP of Six Apart, the company that makes the Movable Type software that runs this blog. Dash calls out this Google self-awareness gap, or as he puts it, its lack of a “theory of mind”:

Theory of mind is that thing that a two-year-old lacks, which makes her think that covering her eyes means you can’t see her. It’s the thing a chimpanzee has, which makes him hide a banana behind his back, only taking bites when the other chimps aren’t looking.
Theory of mind is the awareness that others are aware, and its absence is the weakness that Google doesn’t know it has. This shortcoming exists at a deep cultural level within the organization, and it keeps manifesting itself in the decisions that the company makes about its products and services. The flaw is one that is perpetuated by insularity, and will only be remedied by becoming more open to outside ideas and more aware of how people outside the company think, work and live.

He gives some examples:

Connecting PageRank to economic systems such as AdWords and AdSense corrupted the meaning and value of links by turning them into an economic exchange. Through the turn of the millennium, hyperlinking on the web was a social, aesthetic, and expressive editorial action. When Google introduced its advertising systems at the same time as it began to dominate the economy around search on the web, it transformed a basic form of online communication, without the permission of the web’s users, and without explaining that choice or offering an option to those users.

He compares the knol enterprise with Google Book Search:

Knol shares with Google Book Search the problem of being both indexed by Google and hosted by Google. This presents inherent conflicts in the ranking of content, as well as disincentives for content creators to control the environment in which their content is published. This necessarily disadvantages competing search engines, but more importantly eliminates the ability for content creators to innovate in the area of content presentation or enhancement. Anything that is written in Knol cannot be presented any better than the best thing in Knol. [his emphasis]

And lastly concludes:

An awareness of the fact that Google has never displayed an ability to create the best tools for sharing knowledge would reveal that it is hubris for Google to think they should be a definitive source for hosting that knowledge. If the desire is to increase knowledge sharing, and the methods of compensation that Google controls include traffic/attention and money/advertising, then a more effective system than Knol would be to algorithmically determine the most valuable and well-presented sources of knowledge, identify the identity of authorities using the same journalistic techniques that the Google News team will have to learn, and then reward those sources with increased traffic, attention and/or monetary compensation.

For a long time Google’s goal was to help direct your attention outward. Increasingly we find that they want to hold onto it. Everyone knows that Wikipedia articles place highly in Google search results. Makes sense then that they want to capture some of those clicks and plug them directly into the Google ad network. But already the Web is dominated by a handful of mega sites. I get nervous at the thought that www.google.com could gradually become an internal directory, that Google could become the alpha and omega, not only the start page of the Internet but all the destinations.
It will be interesting to see just how and to what extent knols start creeping up the search results. Presumably, they will be ranked according to the same secret metrics that measure all pages in Google’s index, but given the opacity of Google’s operations, who’s to say that subtle or unconscious rigging won’t occur? Will community ratings factor into search rankings? That would seem to present a huge conflict of interest. Perhaps top-rated knols will be displayed in the sponsored links area at the top of results pages. Or knols could be listed in order of community ranking on a dedicated knol search portal, providing something analogous to the experience of searching within Wikipedia as opposed to finding articles through external search engines. Returning to the theory of mind question, will Google develop enough awareness of how it is perceived and felt by its users to strike the right balance?
One last thing worth considering about the knol – apart from its being possibly the worst Internet neologism in recent memory – is its author-centric nature. It’s interesting that in order to compete with Wikipedia, Google has consciously not adopted Wikipedia’s model. The basic unit of authorial action in Wikipedia is the edit. Edits by multiple contributors are combined, through a complicated consensus process, into a single amalgamated product. On Google’s encyclopedia the basic unit is the knol. For each knol (god, it’s hard to keep writing that word) there is a one-to-one correspondence with an individual, identifiable voice. There may be multiple competing knols, and by extension competing voices (you have this on Wikipedia too, but it’s relegated to the discussion pages).
Viewed in this way, Googlepedia is perhaps a more direct rival to Larry Sanger’s Citizendium, which aims to build a more authoritative Wikipedia-type resource under the supervision of vetted experts. Citizendium is a strange, conflicted experiment, a weird cocktail of Internet populism and ivory tower elitism -? and by the look of it, not going anywhere terribly fast. If knols take off, could they be the final nail in the coffin of Sanger’s awkward dream? Bryan Alexander wonders along similar lines.
While not explicitly employing Sanger’s rhetoric of “expert” review, Google seems to be banking on its commitment to attributed solo authorship and its ad-based incentive system to lure good, knowledgeable authors onto the Web, and to build trust among readers through the brand-name credibility of authorial bylines and brandished credentials. Whether this will work remains to be seen. I wonder… whether this system will really produce quality. Whether there are enough checks and balances. Whether the community rating mechanisms will be meaningful and confidence-inspiring. Whether self-appointed experts will seem authoritative in this context or shabby, second-rate and opportunistic. Whether this will have the feeling of an enlightened knowledge project or of sleazy intellectual link farming (or something perfectly useful in between).
The feel of a site – the values it exudes – is an important factor, though. This is why I like, and in an odd way trust, Wikipedia. Trust not always to be correct, but to be transparent, to wear its flaws on its sleeve, and to be working for a higher aim. Google will probably never inspire that kind of trust in me, certainly not while it persists in its dangerous self-delusions.
A lot of unknowns here. Thoughts?

wikipedia’s growing pains

Insularity, editorial abuses, jargon, anonymity, power… some of the difficulties that beset the great public knowledge experiment of our day. Our friend Karen Schneider has a smart piece on Wikipedia’s “awkward adolescence.” Worth a read.

Like a startup maturing into a real business, Wikipedia’s corporate culture seems, at times, conflicted between its role as a harmless nouveau-digital experiment and its broader ambitions.
…The quieter rumblings about Wikipedia have less to do with vanity edits or poor maintenance of content than they do with the organization’s increasingly arbitrary editorial overrides and deletions and rapidly thickening in-group culture.
…Sock puppets, spy-versus-spy hijinks, and super-secret-vocabularies may be fine for a short-term experiment in information management; but Wikipedia positions itself not as a free encyclopedia, but the free encyclopedia. A FAQ claims, “We want Wikipedia to be around at least a hundred years from now, if it does not turn into something even more significant,” and Wikipedia’s fundraising page asks potential donors to “Imagine a world in which every single person can share freely in the sum of human knowledge.”

the open library

A little while back I was musing on the possibility of a People’s Card Catalog, a public access clearinghouse of information on all the world’s books to rival Google’s gated preserve. Well, thanks to the Internet Archive and its offshoot the Open Content Alliance, it looks like we might now have it – or at least the initial building blocks. On Monday they launched a demo version of the Open Library, a grand project that aims to build a universally accessible and publicly editable directory of all books: one wiki page per book, integrating publisher and library catalogs, metadata, reader reviews, links to retailers and relevant Web content, and a menu of editions in multiple formats, both digital and print.

Imagine a library that collected all the world’s information about all the world’s books and made it available for everyone to view and update. We’re building that library.

The official opening of Open Library isn’t scheduled till October, but they’ve put out the demo now to prove this is more than vaporware and to solicit feedback and rally support. If all goes well, it’s conceivable that this could become the main destination on the Web for people looking for information in and about books: a Wikipedia for libraries. On presentation of public domain texts, they already have Google beat, even with recent upgrades to the GBS system, including a plain text viewing option. The Open Library provides TXT, PDF, DjVu (a high-resolution format for scanned documents), and its own custom-built Book Viewer tool, a digital page-flip interface that presents scanned public domain books in facing pages that the reader can leaf through, search and (eventually) magnify.
Page-turning interfaces have been something of a fad recently, appearing first in the British Library’s Turning the Pages manuscript preservation program (specifically cited as inspiration for the OL Book Viewer) and later proliferating across all manner of digital magazines, comics and brochures (often through companies that you can pay to convert a PDF into a sexy virtual object, complete with draggable page corners that writhe when tickled with a mouse and a paper-like rustling sound every time a page is turned).
This sort of reenactment of paper functionality is perhaps too literal, opting for imitation rather than innovation, but it does offer some advantages. Having a fixed frame for reading is a relief in the constantly scrolling space of the Web browser, and there are some decent navigation tools that gesture toward the ways we browse paper. To either side of the open area of a book are thin vertical lines denoting the edges of the surrounding pages. Dragging the mouse over the edges brings up scrolling page numbers in a small pop-up. Clicking on any of these takes you quickly and directly to that part of the book. Searching is also neat. Type a query and the book is suddenly interleaved with yellow tabs, with keywords highlighted on the page, like so:
[screenshot: an Open Library Book Viewer search, the book interleaved with yellow tabs and keywords highlighted on the page]
But nice as this looks, functionality is sacrificed for the sake of fetishism. Sticky tabs are certainly a cool feature, but not when they’re at the expense of a straightforward list of search returns showing keywords in their sentence context. These sorts of references to the feel and functionality of the paper book are no doubt comforting to readers stepping tentatively into the digital library, but there’s something that feels disjointed about reading this way: that this is a representation of a book but not a book itself. It is a book avatar. I’ve never understood the appeal of those Second Life libraries where you must guide your virtual self to a virtual shelf, take hold of the virtual book, and then open it up on a virtual table. This strikes me as a failure of imagination, not to mention tedious. Each action is in a sense done twice: you operate a browser within which you operate a book; you move the hand that moves the hand that moves the page. Is this perhaps one too many layers of mediation to actually be able to process the book’s contents? Don’t get me wrong, the Book Viewer and everything the Open Library is doing is a laudable start (cause for celebration in fact), but in the long run we need interfaces that deal with texts as native digital objects while respecting the originals.
What may be more interesting than any of the technology previews is a longish development document outlining ambitious plans for building the Open Library user interface. This covers everything from metadata standards and wiki templates to tagging and OCR proofreading to search and browsing strategies, plus a well thought-out list of user scenarios. Clearly, they’re thinking very hard about every conceivable element of this project, including the sorts of things we frequently focus on here such as the networked aspects of texts. Acolytes of Ted Nelson will be excited to learn that a transclusion feature is in the works: a tool for embedding passages from texts into other texts that automatically track back to the source (hypertext copy-and-pasting). They’re also thinking about collaborative filtering tools like shared annotations, bookmarking and user-defined collections. All very very good, but it will take time.
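The development document doesn’t say how transclusion would be built, but the mechanics are easy to sketch: store a pointer into the source text rather than a copy, and resolve it at display time so the excerpt always tracks back to its origin. A minimal sketch in Python; the names, fields, and toy library below are my own assumptions, not anything the Open Library team has specified.

```python
from dataclasses import dataclass

# Hypothetical structures for illustration only; the Open Library
# design document does not specify an implementation.

@dataclass
class Transclusion:
    source_id: str  # stable identifier of the source text
    start: int      # character offset where the quoted span begins
    end: int        # character offset where it ends

LIBRARY = {
    "archimedes-bio": "Archimedes of Syracuse was a Greek mathematician...",
}

def render(t: Transclusion) -> str:
    """Resolve a transclusion at display time: the excerpt is looked up
    in the live source, so it tracks back to the original rather than
    being a dead copy-and-paste."""
    excerpt = LIBRARY[t.source_id][t.start:t.end]
    return f'"{excerpt}" (from {t.source_id})'

print(render(Transclusion("archimedes-bio", 0, 22)))
# "Archimedes of Syracuse" (from archimedes-bio)
```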
Building an open source library catalog is a mammoth undertaking and will rely on millions of hours of volunteer labor, and like Wikipedia it has its fair share of built-in contradictions. Jessamyn West of librarian.net put it succinctly:

It’s a weird juxtaposition, the idea of authority and the idea of a collaborative project that anyone can work on and modify.

But the only realistic alternative may well be the library that Google is building: a proprietary database full of low-quality digital copies; a semi-accessible public domain prohibitively difficult to use or repurpose outside the Google reading room; a balkanized landscape of partner libraries and institutions left in its wake, each clutching their small slice of the digitized pie while the whole belongs only to Google; all of it geared ultimately not to readers, researchers and citizens but to consumers. Construed more broadly to include not just books but web pages, videos, images, maps, etc., the Google library is a place built by us but not owned by us. We create and upload much of the content, we hand-make the links and run the search queries that program the Google brain. But all of this is captured and funneled into Google dollars and AdSense. If passive labor can build something so powerful, what might active, voluntary labor be able to achieve? Open Library aims to find out.

the people’s card catalog (a thought)

New partners and new features. Google has been busy lately building up Book Search. On the institutional end, Ghent, Lausanne and Mysore are among the most recent universities to hitch their wagons to the Google library project. On the user end, the GBS feature set continues to expand, with new discovery tools and more extensive “about” pages gathering a range of contextual resources for each individual volume.
Recently, they extended this coverage to books that haven’t yet been digitized, substantially increasing the findability, if not yet the searchability, of thousands of new titles. The about pages are similar to Amazon’s, which supply book browsers with things like concordances, “statistically improbable phrases” (tags generated automatically from distinct phrasings in a text), textual statistics, and, best of all, hot-linked lists of references to and from other titles in the catalog: a rich bibliographic network of interconnected texts (Bob wrote about this fairly recently). Google’s pages do much the same thing but add other valuable links to retailers, library catalogs, reviews, blogs, scholarly resources, Wikipedia entries, and other relevant sites around the net (an example). Again, many of these books are not yet full-text searchable, but collecting these resources in one place is highly useful.
It makes me think, though, how sorely an open source alternative to this is needed. Wikipedia already has reasonably extensive articles about various works of literature. LibraryThing has built a terrific social architecture for sharing books. And there are a great number of other freely accessible resources around the web: scholarly database projects, public domain e-libraries, CC-licensed collections, library catalogs.
Could this be stitched together into a public, non-proprietary book directory, a People’s Card Catalog? A web page for every book, perhaps in wiki format, with detailed bibliographic profiles, history, links, citation indices, social tools, visualizations, and ideally a smart graphical interface for browsing it. In a network of books, each title ought to have a stable node to which resources can be attached and from which discussions can branch. So far Google is leading the way in building this modern bibliographic system, and stands to turn the card catalogue of the future into a major advertising cash nexus. Let them do it. But couldn’t we build something better?
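To make the “stable node” idea concrete, here is a toy sketch in Python: one record per book, keyed by a permanent identifier, with links and citations accumulating on it over time. Every name, field, and identifier below is invented for illustration; no real catalog schema is implied.

```python
from dataclasses import dataclass, field

@dataclass
class BookNode:
    stable_id: str  # a permanent identifier, e.g. an ISBN or catalog ID (hypothetical scheme)
    title: str
    links: list[str] = field(default_factory=list)      # reviews, retailers, relevant web pages
    citations: list[str] = field(default_factory=list)  # stable_ids of works this book cites

CATALOG: dict[str, BookNode] = {}

def get_or_create(stable_id: str, title: str = "") -> BookNode:
    """One node per book: resources attach to it and discussions branch
    from it, instead of scattering across the web."""
    if stable_id not in CATALOG:
        CATALOG[stable_id] = BookNode(stable_id, title)
    return CATALOG[stable_id]

# Usage: everything about a title hangs off its one stable node.
node = get_or_create("example:mezzanine", "The Mezzanine")
node.links.append("https://en.wikipedia.org/wiki/The_Mezzanine")
```

The design choice that matters is the stable key: as long as every contributor resolves a book to the same node, the profile, links and discussions compound instead of fragmenting.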

a new face(book) for wikipedia?

It was pointed out to me the other day that Facebook has started its own free classified ad service, Facebook Marketplace (must be logged in), a sort of in-network Craigslist poised to capitalize on a built-in userbase of over 20 million. Commenting on this, Nicholas Carr had another thought:

But if Craigslist is a big draw for Facebook members, my guess is that Wikipedia is an even bigger draw. I’m too lazy to look for the stats, but Wikipedia must be at or near the top of the list of sites that Facebookers go to when they leave Facebook. To generalize: Facebook is the dorm; Wikipedia is the library; and Craigslist is the mall. One’s for socializing; one’s for studying; one’s for trading.
Which brings me to my suggestion for Zuckerberg [Facebook founder]: He should capitalize on Wikipedia’s open license and create an in-network edition of the encyclopedia. It would be a cinch: Suck in Wikipedia’s contents, incorporate a Wikipedia search engine into Facebook (Wikipedia’s own search engine stinks, so it should be easy to build a better one), serve up Wikipedia’s pages in a new, better-designed Facebook format, and, yes, incorporate some advertising. There may also be some social-networking tools that could be added for blending Wikipedia content with Facebook content.
Suddenly, all those Wikipedia page views become Facebook page views – and additional ad revenues. And, of course, all the content is free for the taking. I continue to be amazed that more sites aren’t using Wikipedia content in creative ways. Of all the sites that could capitalize on that opportunity, Facebook probably has the most to gain.

I’ve often thought this — not the Facebook idea specifically, but simply that five to ten years (or less) down the road, we’ll probably look back bemusedly on the days when we read Wikipedia on the actual Wikipedia site. We’ve grown accustomed to it, even fond of it, but Wikipedia does still look and feel very much like a place of production, the goods displayed on the factory floor amid clattering machinery and welding sparks (not to mention periodic labor disputes). This blurring of the line between making and consuming is, of course, what makes Wikipedia such a fascinating project. But it does seem like it’s just a matter of time before the encyclopedia starts getting spun off into glossier, ad-supported packages. What other captive audience networks might be able to pull off their own brand of Wikipedia?
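Carr’s “free for the taking” is literal: Wikipedia’s text is openly licensed and retrievable programmatically. As a minimal sketch of the “suck in Wikipedia’s contents” step, here is a Python fragment using the present-day MediaWiki API and its TextExtracts property, my choice of method rather than anything Carr specified:

```python
import requests

API = "https://en.wikipedia.org/w/api.php"

def fetch_intro(title: str) -> str:
    """Fetch the plain-text lead section of a Wikipedia article."""
    params = {
        "action": "query",
        "prop": "extracts",   # TextExtracts property, available on Wikipedia
        "titles": title,
        "exintro": 1,         # lead section only
        "explaintext": 1,     # plain text rather than HTML
        "format": "json",
    }
    data = requests.get(API, params=params).json()
    page = next(iter(data["query"]["pages"].values()))
    return page["extract"]

print(fetch_intro("Wikipedia")[:300])
```

Reskinning those pages in a Facebook (or any other) template would then be purely a matter of presentation; the license, not the technology, is what makes the whole scheme possible.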

chromograms: visualizing an individual’s editing history in wikipedia

The field of information visualization is cluttered with works that claim to illuminate but in fact obscure. These are what Brad Paley calls “write-only” visualizations. If you put information in but don’t get any out, says Paley, the visualization has failed, no matter how much it dazzles. Brad discusses these matters with the zeal of a spiritual seeker. Just this Monday, he gave a master class in visualization on two laptops, four easels, and four wall screens at the Institute’s second “Monkeybook” evening at our favorite video venue in Brooklyn, Monkeytown. It was a scintillating performance that left the audience in a collective state of synaptic arrest.
Jesse took some photos:
[photos: Brad Paley’s visualization master class at Monkeytown]
We stand at a crucial juncture, Brad says, where we must marshal knowledge from the relevant disciplines — design, the arts, cognitive science, engineering — in order to build tools and interfaces that will help us make sense of the huge masses of information that have been dumped upon us with the advent of computer networks. All the shallow efforts passing as meaning, each pretty piece of infoporn that obfuscates as it titillates, are a drag on this purpose, and a muddying of the principles of “cognitive engineering” that must be honed and mastered if we are to keep a grip on the world.
With this eloquent gospel still echoing in my brain, I turned my gaze the next day to a new project out of IBM’s Visual Communication Lab that analyzes individuals’ editing histories in Wikipedia. This was produced by the same team of researchers (including the brilliant Fernanda Viegas) that built the well-known History Flow, an elegant technique for visualizing the revision histories of Wikipedia articles — a program which, I think it’s fair to say, would rate favorably on the Paley scale of readability and illumination. Their latest effort, called “Chromograms,” homes in on the activities of individual Wikipedia editors.
The IBM team is interested generally in understanding the dynamics of peer-to-peer labor on the internet. They’ve focused on Wikipedia in particular because it provides such rich and transparent records of its production — each individual edit logged, many of them discussed and contextualized through contributors’ commentary. This is a juicy heap of data that, if placed under the right set of lenses, might help make sense of the massively peer-produced palimpsest that is the world’s largest encyclopedia, and, in turn, reveal something about other related endeavors.
Their question was simple: how do the most dedicated Wikipedia contributors divvy up their labor? In other words, when someone says, “I edit Wikipedia,” what precisely do they mean? Are they writing actual copy? Fact checking? Fixing typos and syntactical errors? Categorizing? Adding images? Adding internal links? External ones? Bringing pages into line with Wikipedia style and citation standards? Reverting vandalism?
All of the above, of course. But how it breaks down across contributors, and how those contributors organize and pace their work, is still largely a mystery. Chromograms shed a bit of light.
For their study, the IBM team took the edit histories of Wikipedia administrators: users to whom the community has granted access to the technical backend and who have special privileges to protect and delete pages, and to block unruly users. Admins are among the most active contributors to Wikipedia, some averaging as many as 100 edits per day, and are responsible more than any other single group for the site’s day-to-day maintenance and governance.
What the researchers essentially did was run through the edit histories with a fine-toothed, color-coded comb. A chromogram consists of multiple rows of colored tiles, each tile representing a single edit. The color of the tile corresponds to the first letter of the text in the edit or, in the case of “comment chromograms,” the first letter of the user’s description of their edit. Colors run through the alphabet: numbers 1-10 come first, in hues of gray, and the letters then run through the ROYGBIV spectrum, from A (red) to Z (violet).
[legend: the character-to-color mapping, grays for numbers and a red-to-violet spectrum for A through Z]
It’s a simple system, and one that seems arbitrary at first, but it accomplishes the important task of visually separating editorial actions, and making evident certain patterns in editors’ workflow.
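As a rough sketch of that mapping in Python (the exact hues are my assumption; the description above fixes only grays for numbers and a red-to-violet spectrum for the letters):

```python
import colorsys

def tile_color(edit_text: str) -> tuple[float, float, float]:
    """Map an edit (or edit comment) to an RGB tile color: digits get
    evenly spaced grays, letters get hues from red (a) to violet (z)."""
    c = edit_text.strip().lower()[:1]
    if c.isdigit():
        level = int(c) / 9                        # grays from black to white
        return (level, level, level)
    if c.isalpha() and c.isascii():
        hue = (ord(c) - ord("a")) / 25 * 0.78     # 0.0 = red, ~0.78 = violet
        return colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (1.0, 1.0, 1.0)                        # fallback for other characters (my assumption)

# A revert spree reads as a run of same-colored tiles, since every
# comment starts with the same letter:
for comment in ["revert vandalism", "reverted again", "typo", "added category"]:
    print(comment, "->", tile_color(comment))
```

Runs of identical color mean repetitive work; a rainbow means an alphabetized task list — exactly the patterns discussed below.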
Much was gleaned about the way admins divide their time. Activity often occurs in bursts, they found, either in response to specific events such as vandalism, or in steady, methodical tackling of nitpicky, often repetitive tasks — catching typos, fixing wiki syntax, labeling images, etc. Here’s a detail of a chromogram depicting an administrator’s repeated entry of birth and death information on a year page:
[chromogram detail: a long, regular run of tiles from repeated birth-and-death edits on a year page]
The team found that this sort of systematic labor was often guided by lists, either to-do lists in WikiProjects, or lists of information in articles (a list of naval ships, say). Other times, an editing spree simply works progressively through the alphabet. The way to tell? Look for rainbows. Since the color spectrum runs A to Z, rainbow-patterned chromograms depict these sorts of alphabetically ordered tasks. As in here:
[chromogram: a rainbow pattern produced by an alphabetically ordered editing run]
This next pair of images is almost moving. The top one shows one administrator’s crusade against a bout of vandalism. Appropriately, he’s got the blues, blue corresponding with “r” for “revert.” The bottom image shows the same edit history but by article title. The result? A rainbow. Vandalism from A to Z.
[chromograms: the same edit history shown twice, a run of blue revert tiles above and, keyed by article title, a rainbow below]
Chromograms are just one tool, shedding light on a particular sort of editing activity in Wikipedia — the fussy, tedious labors of love that keep the vast engine running smoothly. Visualizing these histories goes some distance toward explaining how the distributed method of Wikipedia editing turns out to be so efficient (for a far more detailed account of what the IBM team learned, it’s worth reading this PDF). The chromogram technique is probably too crude to reveal much about the sorts of editing that more directly impact the substance of Wikipedia articles, but it might be a good stepping stone.
Learning how to read all the layers of Wikipedia is necessarily a mammoth undertaking that will require many tools, visualizations being just one of them. High-quality, detailed ethnographies are another thing that could greatly increase our understanding. Does anyone know of anything good in this area?

cathy davidson of duke on the value of wikipedia

Cathy Davidson at Duke continues to impress me with her willingness to publicly take on complicated issues. Here’s a link to an article she wrote for this week’s Chronicle of Higher Education (re-blogged on the Hastac site) in which she takes one of the most progressive and positive stances in relation to Wikipedia that I’ve seen from a senior and highly regarded scholar. [And here’s a link to a piece I wrote a few months back which takes on Jaron Lanier’s critique of Wikipedia.]