
six blind men and an elephant

Thomas Mann, author of The Oxford Guide to Library Research, has published an interesting paper (pdf available) examining the shortcomings of search engines and the continued necessity of librarians as guides for scholarly research. It revolves around the case of a graduate student investigating tribute payments and the Peloponnesian War. A Google search turns up nearly 80,000 web pages and 700 books: an overwhelming retrieval with little in the way of conceptual organization and only the crudest of tools for measuring relevance. But with the help of the LC Catalog and an electronic reference encyclopedia database, Mann manages to guide the student toward a manageable batch of about a dozen highly germane titles.
Summing up the problem, he recalls a charming old fable from India:

Most researchers – at any level, whether undergraduate or professional – who are moving into any new subject area experience the problem of the fabled Six Blind Men of India who were asked to describe an elephant: one grasped a leg and said “the elephant is like a tree”; one felt the side and said “the elephant is like a wall”; one grasped the tail and said “the elephant is like a rope”; and so on with the tusk (“like a spear”), the trunk (“a hose”) and the ear (“a fan”). Each of them discovered something immediately, but none perceived either the existence or the extent of the other important parts – or how they fit together.
Finding “something quickly,” in each case, proved to be seriously misleading to their overall comprehension of the subject.
In a very similar way, Google searching leaves remote scholars, outside the research library, in just the situation of the Blind Men of India: it hides the existence and the extent of relevant sources on most topics (by overlooking many relevant sources to begin with, and also by burying the good sources that it does find within massive and incomprehensible retrievals). It also does nothing to show the interconnections of the important parts (assuming that the important can be distinguished, to begin with, from the unimportant).

Mann believes that books will usually yield the highest quality returns in scholarly research. A search through a well-tended library catalog (controlled vocabularies, strong conceptual categorization) will necessarily produce a smaller, and therefore less overwhelming, quantity of returns than a search engine (books do not proliferate at the same rate as web pages). And those returns, pound for pound, are more likely to be relevant to the topic:

Each of these books is substantially about the tribute payments – i.e., these are not just works that happen to have the keywords “tribute” and “Peloponnesian” somewhere near each other, as in the Google retrieval. They are essentially whole books on the desired topic, because cataloging works on the assumption of “scope-match” coverage – that is, the assigned LC headings strive to indicate the contents of the book as a whole….In focusing on these books immediately, there is no need to wade through hundreds of irrelevant sources that simply mention the desired keywords in passing, or in undesired contexts. The works retrieved under the LC subject heading are thus structural parts of “the elephant” – not insignificant toenails or individual hairs.

If nothing else, this is a good illustration of how libraries, if used properly, can still be much more powerful than search engines. But it’s also interesting as a librarian’s perspective on what makes the book uniquely suited for advanced research. That is: a book is substantial enough to be a “structural part” of a body of knowledge, and “whole books” serve as rungs on a ladder toward knowing something. Books are a kind of conceptual architecture that, until recently, has been distinctly absent on the Web (though from the beginning certain people and services have endeavored to organize the Web meaningfully). Mann’s study captures the anxiety felt at the prospect of the book’s decline (the great coming blindness), and also the librarian’s understandable dread at having to totally reorganize his or her way of organizing things.
It’s possible, however, to agree with the diagnosis and not the prescription. True, librarians have gotten very good at organizing books over time, but that’s not necessarily how scholarship will be produced in the future. David Weinberger ponders this:

As an argument for maintaining human expertise in manually assembling information into meaningful relationships, this paper is convincing. But it rests on supposing that books will continue to be the locus of worthwhile scholarly information. Suppose more and more scholars move onto the Web and do their thinking in public, in conversation with other scholars? Suppose the Web enables scholarship to outstrip the librarians? Manual assemblages of knowledge would retain their value, but they would no longer provide the authoritative guide. Then we will have either of two results: We will have to rely on “‘lowest common denominator’ and ‘one search box/one size fits all’ searching that positively undermines the requirements of scholarly research”… or we will have to innovate to address the distinct needs of scholars…. My money is on the latter.

As I think is mine, although I would not rule out the possibility of scholars actually participating in the manual assemblage of knowledge. Communities like MediaCommons could to some extent become their own libraries, vetting and tagging a wide array of electronic resources and developing their own customized search frameworks.
There’s much more in this paper than I’ve discussed, including a lengthy treatment of folksonomies (Mann sees them as a valuable supplement to, but not a substitute for, controlled taxonomies). Generally speaking, his articulation of the big challenges facing scholarly search and librarianship in the digital age is well worth the read, although I would argue with some of the conclusions.