Category Archives: research

fail again fail better have fun

A new research paper by Bruce Mason and Sue Thomas on A Million Penguins, the controversial wiki novel created last year by Penguin Books, makes fascinating reading.
It includes, amongst other delights, an analysis of the activities of the contributor known as YellowBanana and of whether s/he was a vandal, genius or troll. The report concludes:
“The final product itself, now frozen in time, is more akin to something produced by the wild, untrammelled creativity of the folk imagination. The contributors to A Million Penguins, like the
ordinary folk of Bakhtin’s carnivals, have produced something excessive. It is rude, chaotic, grotesque, sporadically brilliant, anti-authoritarian and, in places, devastatingly funny. As a cultural text it is unique, and it demonstrates the tremendous potential of this form to provide a stimulating social setting for writing, editing and publishing. The contributors may not have written one single novel but they did create something quite remarkable, an outstanding body of work that can be found both in the main sections as well as through the dramas and conversations lacing the backstage pages. And they had a damned good time while doing so.
As the user Crtrue writes:
“Hi hi hi hi hi!
Seriously. This is going to fail horribly. It’s still fun.””
Read more at:

digital livings

Alongside our research for Arts Council England, I’m also looking at how new media writers earn their livings and make their way in the world.

The Online MA in Creative Writing and New Media at De Montfort University is so innovative that there isn’t an obvious career path for its graduates, nor an established group of successful role models in the UK for students to look to for inspiration. The Digital Livings project is finding out how writers are carving out professional careers, starting with a survey of UK writers and expanding worldwide later in the year.
Which skills do new media writers possess? Where do they sell their work? What advice do they have to offer those wishing to follow in their footsteps? Is the market for digital fiction growing or not? I’ll report back on our findings.

ace research news in the uk

The Institute for the Future of the Book has been appointed by Arts Council England to undertake research into digital developments in literature. This is exciting news for us, not least because it marks the official launch of our London office.
Over the next few months Chris Meade and Sebastian Mary Harrington will be talking to a wide range of organisations including Arts Council England literature clients and others whose work could provide useful models to the sector.
We’ll be looking at book publishing and magazines; reader development; writers, including collaborative and new media authors, and the blurring of distinctions between amateur and professional; live literature and festivals; plus other web activity that could provide inspiration to agencies working to spread the word about the word. We’ll be posting questions and comments on the ifbook blog as we go along.

Sebastian Mary Harrington’s scarf captured live under construction at the Institute’s London HQ, skillfully knitted in the colours of The Institute for the Future of the Book – and The School of Everything – to celebrate the start of our new research project.

six blind men and an elephant

Thomas Mann, author of The Oxford Guide to Library Research, has published an interesting paper (pdf available) examining the shortcomings of search engines and the continued necessity of librarians as guides for scholarly research. It revolves around the case of a graduate student investigating tribute payments and the Peloponnesian War. A Google search turns up nearly 80,000 web pages and 700 books: an overwhelming retrieval, with little in the way of conceptual organization and only the crudest of tools for measuring relevance. But, with the help of the LC Catalog and an electronic reference encyclopedia database, Mann manages to guide the student toward a manageable batch of about a dozen highly germane titles.
Summing up the problem, he recalls a charming old fable from India:

Most researchers – at any level, whether undergraduate or professional – who are moving into any new subject area experience the problem of the fabled Six Blind Men of India who were asked to describe an elephant: one grasped a leg and said “the elephant is like a tree”; one felt the side and said “the elephant is like a wall”; one grasped the tail and said “the elephant is like a rope”; and so on with the tusk (“like a spear”), the trunk (“a hose”) and the ear (“a fan”). Each of them discovered something immediately, but none perceived either the existence or the extent of the other important parts – or how they fit together.
Finding “something quickly,” in each case, proved to be seriously misleading to their overall comprehension of the subject.
In a very similar way, Google searching leaves remote scholars, outside the research library, in just the situation of the Blind Men of India: it hides the existence and the extent of relevant sources on most topics (by overlooking many relevant sources to begin with, and also by burying the good sources that it does find within massive and incomprehensible retrievals). It also does nothing to show the interconnections of the important parts (assuming that the important can be distinguished, to begin with, from the unimportant).

Mann believes that books will usually yield the highest quality returns in scholarly research. A search through a well-tended library catalog (controlled vocabularies, strong conceptual categorization) will necessarily produce a smaller, and therefore less overwhelming, quantity of returns than a search engine (books do not proliferate at the same rate as web pages). And those returns, pound for pound, are more likely to be of relevance to the topic:

Each of these books is substantially about the tribute payments – i.e., these are not just works that happen to have the keywords “tribute” and “Peloponnesian” somewhere near each other, as in the Google retrieval. They are essentially whole books on the desired topic, because cataloging works on the assumption of “scope-match” coverage – that is, the assigned LC headings strive to indicate the contents of the book as a whole….In focusing on these books immediately, there is no need to wade through hundreds of irrelevant sources that simply mention the desired keywords in passing, or in undesired contexts. The works retrieved under the LC subject heading are thus structural parts of “the elephant” – not insignificant toenails or individual hairs.

If nothing else, this is a good illustration of how libraries, if used properly, can still be much more powerful than search engines. But it’s also interesting as a librarian’s perspective on what makes the book uniquely suited for advanced research. That is: a book is substantial enough to be a “structural part” of a body of knowledge. This idea of “whole books” as rungs on a ladder toward knowing something. Books are a kind of conceptual architecture that, until recently, has been distinctly absent on the Web (though from the beginning certain people and services have endeavored to organize the Web meaningfully). Mann’s study captures the anxiety felt at the prospect of the book’s decline (the great coming blindness), and also the librarian’s understandable dread at having to totally reorganize his/her way of organizing things.
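Mann’s “scope-match” point can be made concrete with a toy sketch. Everything below is invented for illustration (the records, the subject headings, the text snippets); it is not drawn from any actual catalog, but it shows the difference between a keyword co-occurrence match and a cataloger-assigned subject heading:

```python
# Toy contrast between keyword search and controlled-vocabulary search.
# Both records and their subject headings are invented examples.
catalog = [
    {"title": "Tribute Lists of the Athenian Empire",
     "subjects": ["Finance, Public--Greece--Athens", "Peloponnesian War"],
     "text": "A study of the tribute quota lists of the Peloponnesian War period."},
    {"title": "Daily Life in Ancient Greece",
     "subjects": ["Greece--Social life and customs"],
     "text": "Includes a tribute to scholarship on the Peloponnesian War era."},
]

def keyword_search(records, *words):
    """Google-style: any record whose text merely mentions all the words."""
    return [r for r in records
            if all(w.lower() in r["text"].lower() for w in words)]

def subject_search(records, heading):
    """Catalog-style: only records a cataloger judged to be *about* the heading."""
    return [r for r in records if heading in r["subjects"]]

# Keyword search returns both records, including the passing mention;
# subject search returns only the book substantially about the topic.
print([r["title"] for r in keyword_search(catalog, "tribute", "Peloponnesian")])
print([r["title"] for r in subject_search(catalog, "Peloponnesian War")])
```

The second result set is smaller precisely because a human judgment about the book as a whole, not a string match, put the record there.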
It’s possible, however, to agree with the diagnosis but not the prescription. True, librarians have gotten very good at organizing books over time, but that’s not necessarily how scholarship will be produced in the future. David Weinberger ponders this:

As an argument for maintaining human expertise in manually assembling information into meaningful relationships, this paper is convincing. But it rests on supposing that books will continue to be the locus of worthwhile scholarly information. Suppose more and more scholars move onto the Web and do their thinking in public, in conversation with other scholars? Suppose the Web enables scholarship to outstrip the librarians? Manual assemblages of knowledge would retain their value, but they would no longer provide the authoritative guide. Then we will have either of two results: We will have to rely on “lowest common denominator” and “one search box/one size fits all” searching that positively undermines the requirements of scholarly research…or we will have to innovate to address the distinct needs of scholars….My money is on the latter.

As is mine, I think. Although I would not rule out the possibility of scholars actually participating in the manual assemblage of knowledge. Communities like MediaCommons could to some extent become their own libraries, vetting and tagging a wide array of electronic resources and developing their own customized search frameworks.
There’s much more in this paper than I’ve discussed, including a lengthy treatment of folksonomies (Mann sees them as a valuable supplement to, but not a substitute for, controlled taxonomies). Generally speaking, his articulation of the big challenges facing scholarly search and librarianship in the digital age is well worth the read, although I would argue with some of the conclusions.

report on scholarly cyberinfrastructure

The American Council of Learned Societies has just issued a report, “Our Cultural Commonwealth,” assessing the current state of scholarly cyberinfrastructure in the humanities and social sciences and making a series of recommendations on how it can be strengthened, enlarged and maintained in the future.
The definition of cyberinfrastructure they’re working with:

“the layer of information, expertise, standards, policies, tools, and services that are shared broadly across communities of inquiry but developed for specific scholarly purposes: cyberinfrastructure is something more specific than the network itself, but it is something more general than a tool or a resource developed for a particular project, a range of projects, or, even more broadly, for a particular discipline.”

I’ve only had time to skim through it so far, but it all seems pretty solid.
John Holbo pointed me to the link in some musings on scholarly publishing in Crooked Timber, where he also mentions our Holy of Holies networked paper prototype as just one possible form that could come into play in a truly modern cyberinfrastructure. We’ve been getting some nice notices from others active in this area such as Cathy Davidson at HASTAC. There’s obviously a hunger for this stuff.

call for papers: what to do with a million books

The Humanities Division at the University of Chicago and the College of Science and Letters at the Illinois Institute of Technology are hosting an intriguing colloquium on the future of research in the humanities in response to the rapid growth of digital archives. They are currently accepting paper proposals, which are due at the end of August.
Here is the call for papers:
What to Do with a Million Books: Chicago Colloquium on Digital Humanities and Computer Science
Sponsored by the Humanities Division at the University of Chicago and the College of Science and Letters at the Illinois Institute of Technology.
Chicago, November 5th & 6th, 2006
Submission Deadline: August 31st, 2006
The goal of this colloquium is to bring together researchers and scholars in the Humanities and Computer Sciences to examine the current state of Digital Humanities as a field of intellectual inquiry and to identify and explore new directions and perspectives for future research.
In the wake of recent large-scale digitization projects aimed at providing universal access to the world’s vast textual repositories, humanities scholars, librarians and computer scientists find themselves newly challenged to make such resources functional and meaningful.
As Gregory Crane recently pointed out (1), digital access to “a million books” confronts us with the need to provide viable solutions to a range of difficult problems: analog to digital conversion, machine translation, information retrieval and data mining, to name a few. Moreover, mass digitization leads not just to problems of scale: new goals can also be envisioned, for example, catalyzing the development of new computational tools for context-sensitive analysis. If we are to build systems to interrogate usefully massive text collections for meaning, we will need to draw not only on the technical expertise of computer scientists but also learn from the traditions of self-reflective, inter-disciplinary inquiry practiced by humanist scholars.
The book as the locus of much of our knowledge has long been at the center of discussions in digital humanities. But as mass digitization efforts accelerate a change in focus from a print-culture to a networked, digital-culture, it will become necessary to pay more attention to how the notion of a text itself is being re-constituted. We are increasingly able to interact with texts in novel ways, as linguistic, visual, and statistical processing provide us with new modes of reading, representation, and understanding. This shift makes evident the necessity for humanities scholars to enter into a dialogue with librarians and computer scientists to understand the new language of open standards, search queries, visualization and social networks.
Digitizing “a million books” thus poses far more than just technical challenges. Tomorrow, a million scholars will have to re-evaluate their notions of archive, textuality and materiality in the wake of these developments. How will humanities scholars, librarians and computer scientists find ways to collaborate in the “Age of Google?”
Colloquium Website
November 5th & 6th, 2006
The University of Chicago
Ida Noyes Hall
1212 East 59th Street
Chicago, IL 60637
Keynote Speakers
Greg Crane (Professor of Classics, Tufts University) has been engaged since 1985 in planning and development of the Perseus Project, which he directs as the Editor-in-Chief. Besides supervising the Perseus Project as a whole, he has been primarily responsible for the development of the morphological analysis system which provides many of the links within the Perseus database.
Ben Shneiderman is Professor in the Department of Computer Science, founding Director (1983-2000) of the Human-Computer Interaction Laboratory, and Member of the Institute for Advanced Computer Studies and the Institute for Systems Research, all at the University of Maryland. He is a leading expert in human-computer interaction and information visualization and has published extensively in these and related fields.
John Unsworth is Dean of the Graduate School of Library and Information Science and Professor of English at the University of Illinois at Urbana-Champaign. Prior to that, he was on the faculty at the University of Virginia where he also led the Institute for Advanced Technology in the Humanities. He has published widely in the field of Digital Humanities and was the recipient last year of the Lyman Award for scholarship in technology and humanities.
Program Committee
Prof. Helma Dik, Department of Classics, University of Chicago
Dr. Catherine Mardikes, Bibliographer for Classics, the Ancient Near East, and General Humanities, University of Chicago
Prof. Martin Mueller, Department of English and Classics, Northwestern University
Dr. Mark Olsen, Associate Director, The ARTFL Project, University of Chicago
Prof. Shlomo Argamon, Computer Science Department, Illinois Institute of Technology
Prof. Wai Gen Yee, Computer Science Department, Illinois Institute of Technology
Call for Participation
Participation in the colloquium is open to all. We welcome submissions for:
1. Paper presentations (20 minute maximum)
2. Poster sessions
3. Software demonstrations
Suggested submission topics
* Representing text genealogies and variance
* Automatic extraction and analysis of natural language style elements
* Visualization of large corpus search results
* The materiality of the digital text
* Interpreting symbols: textual exegesis and game playing
* Mashup: APIs for integrating discrete information resources
* Intelligent Documents
* Community based tagging / folksonomies
* Massively scalable text search and summaries
* Distributed editing & annotation tools
* Polyglot Machines: Computerized translation
* Seeing not reading: visual representations of literary texts
* Schemas for scholars: field and period specific ontologies for the humanities
* Context sensitive text search
* Towards a digital hermeneutics: data mining and pattern finding
Submission Format
Please submit a (2 page maximum) abstract in either PDF or MS Word format to
Important Dates
Deadline for Submissions: August 31st
Notification of Acceptance: September 15th
Full Program Announcement: September 15th
Contact Info
General Inquiries:
Organizational Committee
Mark Olsen, Associate Director, ARTFL Project, University of Chicago.
Catherine Mardikes, Bibliographer for Classics, the Ancient Near East, and General Humanities, University of Chicago.
Arno Bosse, Director of Technology, Humanities Division, University of Chicago.
Shlomo Argamon, Department of Computer Science, Illinois Institute of Technology.

post-doc fellowships available for work with the institute

The Institute for the Future of the Book is based at the Annenberg Center for Communication at USC. Jonathan Aronson, the executive director of the center, has just sent out a call for eight post-docs and one visiting scholar for next year. If you know of anyone who would like to apply, particularly people who would like to work with us at the institute, please pass this on. The institute’s activities at the center are described as follows:
Shifting Forms of Intellectual Discourse in a Networked Culture
For the past several hundred years intellectual discourse has been shaped by the rhythms and hierarchies inherent in the nature of print. As discourse shifts from page to screen, and more significantly to a networked environment, the old definitions and relations are undergoing unimagined changes. The shift in our world view from individual to network holds the promise of a radical reconfiguration in culture. Notions of authority are being challenged. The roles of author and reader are morphing and blurring. Publishing, methods of distribution, peer review and copyright — every crucial aspect of the way we move ideas around — is up for grabs. The new digital technologies afford vastly different outcomes ranging from oppressive to liberating. How we make this shift has critical long term implications for human society.
Research interests include: how reading and writing change in a networked culture; the changing role of copyright and fair use; the form and economics of open-source content; and the shifting relationship of medium to message (or form to content).
if you have any questions, please feel free to email bob stein

questions about blog search and time

Does anyone know of a good way to search for old blog entries on the web? I’ve just been looking at some of the available blog search resources and few of them appear to provide any serious advanced search options. The couple of major ones I’ve found that do (after an admittedly cursory look) are Google and Ice Rocket. Both, however, appear to be broken, at least when it comes to dates. I’ve tried them on three different browsers, on Mac and PC, and in each case the date menus seem to be frozen. It’s very weird. They give you the option of entering a specific time range but won’t accept the actual dates. Maybe I’m just having a bad tech day, but it’s as if there’s some conceptual glitch across the web vis-à-vis blogs and time.
Most blog search engines are geared toward searching the current blogosphere, but there should be a way to research older content. My first thought was that blog search engines crawl RSS feeds, most of which do not transmit the entirety of a blog’s content, just its most recent entries. That would pose a problem for archival search.
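That hunch about feed-based crawling can be made concrete with a short sketch. The feed XML below is invented, but its shape is typical: an RSS file carries only a handful of the newest posts, so a crawler that indexes feeds never even sees the archive.

```python
# Why feed-based blog search skews recent: a typical RSS feed exposes
# only the newest items. The feed below is an invented example.
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

FEED = """<rss version="2.0"><channel>
  <title>Example Blog</title>
  <item><title>Post 100</title><pubDate>Mon, 04 Dec 2006 10:00:00 GMT</pubDate></item>
  <item><title>Post 99</title><pubDate>Fri, 01 Dec 2006 09:00:00 GMT</pubDate></item>
</channel></rss>"""

items = ET.fromstring(FEED).findall("./channel/item")
dates = [parsedate_to_datetime(i.findtext("pubDate")) for i in items]

# Only the latest two posts are visible through the feed; an entry from,
# say, September 2005 simply isn't there for a feed crawler to index.
print(len(items), min(dates).date())
```

An archival blog search would therefore need either the blog’s own full-history feed or a conventional crawl of its archive pages, not just the current RSS file.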
Does anyone know what would be the best way to go about finding, say, old blog entries containing the keywords “new orleans superdome” from late August to late September 2005? Is it best to just stick with general web search and painstakingly comb through for blogs? If we agree that blogs have become an important kind of cultural document, then surely there should be a way to find them more than a month after they’ve been written.

thinking about google books: tonight at 7 on radio open source

While visiting the Experimental Television Center in upstate New York this past weekend, Lisa found a wonderful relic in a used book shop in Owego, NY — a small, leatherbound volume from 1962 entitled “Computers,” which IBM used to give out as a complimentary item. An introductory note on the opening page reads:

The machines do not think — but they are one of the greatest aids to the men who do think ever invented! Calculations which would take men thousands of hours — sometimes thousands of years — to perform can be handled in moments, freeing scientists, technicians, engineers, businessmen, and strategists to think about using the results.

This echoes Vannevar Bush’s seminal 1945 essay on computing and networked knowledge, “As We May Think”, which more or less prefigured the internet, web search, and now, the migration of print libraries to the world wide web. Google Book Search opens up fantastic possibilities for research and accessibility, enabling readers to find in seconds what before might have taken them hours, days or weeks. Yet it also promises to transform the very way we conceive of books and libraries, shaking the foundations of major institutions. Will making books searchable online give us more time to think about the results of our research, or will it change the entire way we think? By putting whole books online do we begin the steady process of disintegrating the idea of the book as a bounded whole and not just a sequence of text in a massive database?
The debate thus far has focused too much on the legal ramifications — helped in part by a couple of high-profile lawsuits from authors and publishers — failing to take into consideration the larger cognitive, cultural and institutional questions. Those questions will hopefully be given ample air time tonight on Radio Open Source.
Tune in at 7pm ET on local public radio or stream live over the web. The show will also be available later in the week as a podcast.

more on wikipedia

As summarized by a Dec. 5 article in CNET, last week was a tough one for Wikipedia — on Wednesday, a USA Today editorial by John Seigenthaler called Wikipedia “irresponsible” for not catching significant mistakes in his biography, and on Thursday, the Wikipedia community got up in arms after discovering that former MTV VJ and longtime podcaster Adam Curry had edited out references to other podcasters in an article about the medium.
In response to the hullabaloo, Wikipedia founder Jimmy Wales now plans to bar anonymous users from creating new articles. The change, which went into effect today, could prevent a repeat of the Seigenthaler debacle: now that Wikipedia has a record of who posted what, people will presumably be less likely to post potentially libelous material. According to Wales, almost all users who post to Wikipedia are already registered, so this won’t represent a major change to Wikipedia in practice. Whether or not this is the beginning of a series of changes that push Wikipedia away from its “hive mind” origins remains to be seen.
I’ve been surprised at the amount of Wikipedia-bashing that’s occurred over the past few days. In a historical moment when there’s so much distortion of “official” information, there’s something peculiar about this sudden outrage over the unreliability of an open-source information system. Mostly, the conversation seems to have shifted how people think about Wikipedia. Once an information resource developed by and for “us,” it’s now an unreliable threat to the idea of truth, imposed on us by an unholy alliance between “volunteer vandals” (Seigenthaler’s phrase) and the outlaw Jimmy Wales. This shift is exemplified by the post that begins a discussion of Wikipedia that took place over the past several days on the Association of Internet Researchers listserv. The scholar who posted suggested that researchers boycott Wikipedia and prohibit their students from using the site as well until Wikipedia develops “an appropriate way to monitor contributions.” In response, another poster noted that rather than boycotting Wikipedia, it might be better to monitor the site — or better still, write for it.
Another comment worthy of consideration from that same discussion: in a post to the same AoIR listserv, Paul Jones notes that in the 1960s World Book Encyclopedia, RCA employees wrote the entry on television — scarcely mentioning television pioneer Philo Farnsworth, longtime nemesis of RCA. “Wikipedia’s failings are part of a public debate,” Jones writes. “Such was not the case with World Book to my knowledge.” In this regard, the flak over Wikipedia might be considered a good thing: at least it gives those concerned with the construction of facts the opportunity to engage with the issue. I’m just not sure that making Wikipedia the enemy contributes that much to the debate.