Tag Archives: archive

google, digitization and archives: despatches from if:book

In discussing with other Institute folks how to go about reviewing four years’ worth of blog posts, I’ve felt torn at times. Should I cherry-pick ‘thinky’ posts that discuss a particular topic in depth, or draw out narratives from strings of posts each of which is not, in itself, a literary gem but which cumulatively form the bedrock of the blog? But I thought about it, and realised that you can’t really have one without the other.
Fair use, digitization, public domain, archiving, the role of libraries and cultural heritage are intricately interconnected. But the name that connects all these issues over the last few years has been Google. The Institute has covered Google’s incursions into digitization of libraries (amongst other things) in a way that has explored many of these issues – and raised questions that are as urgent as ever. Is it okay to privatize vast swathes of our common cultural heritage? What are the privacy issues around technology that tracks online reading? Where now for copyright, fair use and scholarly research?
In-depth coverage of Google and digitization has helped to draw out many of the issues central to this blog. Thus, to draw forth the narrative of if:book’s Google coverage is, by extension, to watch a political and cultural stance emerging. So in this post I’ve tried to have my cake and eat it – to trace a story, and to give a sense of the depth of thought going into that story’s discussion.
In order to keep things manageable, I’ve kept this post to a largely Google-centric focus. Further reviews covering copyright-related posts, and general discussion of libraries and technology will follow.
2004-5: Google rampages through libraries, annoys Europe, gains rivals
In December 2004, if:book’s first post about Google’s digitization of libraries gave the numbers for the University of Michigan project.
In February 2005, the head of France’s national libraries raised a battle cry against the Anglo-centricity implicit in Google’s plans to digitize libraries. The company’s seemingly relentless advance brought Europe out in force to find ways of forming non-Google coalitions for digitization.
In August, Google halted book scans for a few months to appease publishers angry at encroachments on their copyright. But this was clearly not enough, as in October 2005, Google was sued (again) by a string of publishers for massive copyright infringement. However, undeterred either by European hostility or legal challenges, the same month the company made moves to expand Google Print into Europe. Also in October 2005, Yahoo! launched the Open Content Alliance, which was joined by Microsoft around the same time. Later the same month, a Wired article put the case for authors in favor of Google’s searchable online archive.
In November 2005 Google announced that from here on in Google Print would be known as Google Book Search, as the ‘Print’ reference perhaps struck too close to home for publishers. The same month, Ben savaged Google Print’s ‘public domain’ efforts – then recanted (a little) later that month.
In December 2005 Google’s digitization was still hot news – the Institute did a radio show/podcast with Open Source on the topic, and covered the Google Book Search debate at the American Bar Association. (In fact, most of that month’s posts are dedicated to Google and digitization and are too numerous to do justice to here).
2006: Digitization spreads
By 2006, digitization and digital archives – with attendant debates – are spreading. From January through March, three posts – ‘The book is reading you’ parts 1, 2 and 3 – looked at privacy, networked books, fair use, downloading and copyright around Google Book Search. Also in March, a further post discussed Google and Amazon’s incursions into publishing.
In April, the Smithsonian cut a deal with Showtime making the media company a preferential media partner for documentaries using Smithsonian resources. Jesse analyzed the implications for open research.
In June, the Library of Congress and partners launched a project to make vintage newspapers available online. Google Book Search, meanwhile, was tweaked to reassure publishers that the new dedicated search page was not, in fact, a library. The same month, Ben responded thoughtfully to a French book attacking Google, and by extension America, for cultural imperialism. The debate continued with a follow-up post in July.
In August, Google announced downloadable PDF versions of many of its public-domain books. Also in August, the publication of Google’s contract with UCAL’s library prompted some debate. In October we reported on Microsoft’s growing book digitization list, and some criticism of the same from Brewster Kahle. The same month, we reported that the Dutch government is pouring millions into a vast public digitization program.
In December, Microsoft launched its (clunkier) version of Google Books, Microsoft Live Book Search.

2007: Google is the environment

In January, former Netscape player Rich Skrenta crowned Google king of the ‘third age of computing’: ‘Google is the environment’, he declared. Meanwhile, having seemingly forgotten 2005’s tussles, the company hosted a publishing conference at the New York Public Library. In February the company signed another digitization deal, this time with Princeton; in August, this institution was joined by Cornell, and the Economist compared Google’s databases to the banking system of the information age. The following month, Siva’s first Monday podcast discussed the Googlization of libraries.
By now, while Google remains a theme, commercial digitization of public-domain archives is a far broader issue. In January, the US National Archives cut a digitization deal with Footnote, effectively paywalling digital access to a slew of public-domain documents; in August, a deal followed with Amazon for commercial distribution of its film archive. The same month, two major audiovisual archiving projects launched.
In May, Ben speculated about whether some ‘People’s Card Catalog’ could be devised to rival Google’s gated archive. The Open Archive launched in July, to mixed reviews – the same month that the ongoing back-and-forth between the Institute and academic Siva Vaidhyanathan bore fruit. Siva’s networked writing project, The Googlization Of Everything, was announced (this would be launched in September). Then, in August, we covered an excellent piece by Paul Duguid discussing the shortcomings of Google’s digitization efforts.
In October, several major American libraries refused digitization deals with Google. By November, Google and digitization had found their way into the New Yorker; the same month the Library of Congress put out a call for e-literature links to be archived.

2008: All quiet?

In January we reported that LibraryThing interfaces with the British Library, and in March on the launch of an API for Google Books. Siva’s book found a print publisher the same month.
But if Google coverage has been slighter this year, that’s not to suggest a happy ending to the story. Microsoft abandoned its book scanning project in mid-May of this year, raising questions about the viability of the Open Content Alliance. It would seem as though Skrenta was right. The Googlization of Everything continues, less challenged than ever.

if:book review 1: game culture

I’ve chosen ‘game culture’ as the theme for this first review post, for all that many of these posts could just as easily be tagged another handful of ways. But games have always hovered at the fringes of debates about the future of the book.
The themes include consideration of serious video games; repurposing of existing games to create machinima, and cultural activities arising out of machinima; and discussion of more overtly cross-platform activities: pervasive gaming, ARGs and their multiple spawn in terms of commercialization, interactivity, resistance to ‘didactic’ co-optation and more. There’s a lot here; as per my first post on this subject, I’d welcome comments and thoughts.
In February 2005, Sol Gaitan wrote a thoughtful piece about the prevalence of video games in children’s lives, and questioned whether such games might be used more for didactic purposes. In April 2005 Ben picked up an excerpt from Steven Johnson’s Everything Bad Is Good For You, which pointed to further reading on video games in education. In August 2005, four British secondary schools experimented with educational games; someone died after playing video games for 50 hours straight without stopping to eat; and Sol pondered whether the future of the book was in fact a video game.
Between February and May 2006 the Institute worked on providing a public space for McKenzie Wark’s Gamer Theory – not strictly a game, but a networked meta-discussion of game culture. Discussion of ‘serious’ games continued in an April analysis of why some games should be publicly funded. In August 2006, Sino-Japanese relations became tense in the MMORPG Fantasy Westward Journey; later the same month, Gamasutra wondered why there weren’t any highbrow video games, prompting a thoughtful piece from Ray Cha on whether ‘high’ and ‘low’ art definitions have any meaning in that context.
Machinima and its relations have appeared at intervals. In July 2005 Bob Stein was interviewed in Halo, followed later the same month by Peggy Ahwesh in Halo-based talk show This Spartan Life. Ben wrote about the new wave of machinima and its relatives in December 2005, following this up with a Grand Text Auto call for scholarly papers in January 2006, and a vitriolic denunciation of the intersection between machinima, video gaming, and the virtualization of war (May 2006). In September 2006 McKenzie Wark was interviewed about Gamer Theory in Halo. Then, in October 2007, Chris mentioned the first machinima conference to be held in Europe.
Pervasive gaming makes its first appearance in a September 2006 mention of the first Come Out And Play festival (the 2008 one just wrapped up in NY last weekend). It’s interesting to note how the field has evolved since 2006: where pervasive gaming felt relatively indie in 2006, this year ARG superstar Jane McGonigal brought The Lost Game, part of The Lost Ring, her McDonald’s-sponsored Olympic Games ARG.
Earlier, overlap between pervasive gaming, ARGs and hoaxes was foreshadowed by an August 2005 story about a BBC employee writing a Wikipedia obituary for a fictional pop star – and then denying that they were gaming the encyclopedia. I wrote my first post about ARGs and commercialization in January 2007, following this with another about ARGs and player interaction in March. The same month, Ben and I got excited about the launch of McGonigal’s World Without Oil, which looked to bring together themes of ‘serious’ and pervasive gaming – but turned out, as Ben’s and my conversation (posted May 2006) suggested, to be rather pious and lacking in narrative.
Since then, both marketing and educational breeds of ARG have spread, as attested by Penguin’s WeTellStories (trailed February, launched March 2008), and the announcement of UK public service broadcaster Channel 4 Education’s move of its £6m commissioning budget into cross-platform projects.
I’m not going to attempt a summary of the above, except to say that everything and nothing has changed: cross-platform entertainment has edged towards the mainstream, didactic games continue to plow their furrow at the margins of the vast gaming industry, and commercialization is still a contentious topic. It’s not clear whether gaming has come closer to being accepted alongside cinema as a significant art form, but its vocabularies have – as McKenzie Wark’s book suggested – increasingly bled into many aspects of contemporary culture, and will no doubt continue to do so.


It’s been pretty quiet on the blog for the last few weeks. This is partly because there’s a lot of work going on backstage. But it’s also symptomatic of the fact that the research, writing and blogging element of the Institute for the Future of the Book is in the process of serious self-examination.
My first encounter with the Institute for the Future of the Book was via if:book. I posted a comment, received an email from Bob, wrote back, and found myself having tea with him at the Royal Court Theatre in London a week or so later. In my naivete, I hadn’t fully taken on board that it was the output of a think tank, a dedicated group of people whose full-time job it was to think about these things. Because most of the online creative work I was involved in at the time was part-time, voluntary and unpaid, I assumed that if:book worked the same way and asked how I could go about acquiring posting rights.
But the Institute has always been very open-sided. I got my posting rights. Then, shortly after making a first post, I was invited out to NYC to hang out with the team. What had begun as a playful, remote interaction of ideas suddenly took on form and force.
While the Web can often seem more divisive than social – a culture of mouse potatoes unable to interact with other humans save through keyboard and avatar – there are times when it can throw extraordinary, life-changing things your way. The Institute has been one of those for me.
But a lot has changed since I appeared on the scene a year and a half ago, both within the Institute and across the worlds of technology, digital arts and academia in whose cross-fire the Institute found its groove. With Penguin running ARGs, e-readers in the news every second week, and Web2.0 less a buzzword than an enabling condition of contemporary life, thought, debate and activity around discourse and the networked screen have exploded in all directions.
For a blog that explores these things, this poses a challenge. How to keep up with it all? Should it be curated? Should we commission content, generate content, or simply aggregate it and moderate discussion around this? And central to this are still deeper questions. What is such a space for? Who reads if:book? And, more profoundly yet, what will – or should – the Institute be in times to come?
From conversations with Institute members who’ve seen – as I did not – the space evolve from a blank canvas to a phalanx of ideas, an influential position and a series of projects, it’s my understanding that the mood and mode have always been exploratory. One thing might lead to the next, a chance meeting to a new project; a throwaway remark to a runaway success. But it’s not enough to say it’s been an exploration, and that the time for exploration is over.
We’re currently seeing the first shoots of an extraordinary flowering of digital culture. As the Web mainstreams, creators of all kinds – and not just the technologically adept – are finding a voice in the digital space. Let’s say this is no longer the future of the book but its present – a world where print and digital texts interact, interweave, are taggable in Twitter or rendered in digital ink.
One might say that the research, thinking and writing that’s taken place on if:book since late 2004 has helped plow the ground for this. Let’s ask then: when the question is less one of whether books or screens will win, but of (say) best practice in collaborative authorship or the best way to render multimedia authoring programs indexable in search engines, does this world need a think and do tank to lead the way? And if so, what does it think, and what does it do?
We don’t have answers to these questions. But they’re at the core of my task over the coming weeks, which is to delve into the archives of if:book and, from my Johnny-come-lately position of relative naivete, review the story so far. And, hopefully, gain some sense of where it might go next.
A year and a half on, I’m out in NY hanging out with the team again. Over the course of my stay I’ll be exploring the back catalogs, and talking to people in and around the Institute. When I did my first collaborative writing work, I learned that the best way to filter text down to bare bones for Web reading was to send it to a friend and then ask that friend to tell you what they remembered of it without looking at it again. I want to know which of if:book’s posts stuck in that way: which acted as turning points, which inspired some new event or project, which sparked debate or – as in my case – brought new contributors to the team.
Clearly, also, this cannot be confined to if:book personnel past or present. The blog has had a dedicated readership over the last years, occasional guests, and a wide community of support. We welcome suggestions – whether one-liners or paragraphs long – of ideas or articles that have been particularly memorable, fruitful, inspiring – or the reverse. For me, this exercise will be a chance to educate myself about a significant body of work that’s helped shape the conversation around writing and the Web; and hopefully to begin a conversation, review and summary process that can help take that body of work towards its own future.
Comments on the blog are welcome, as always – or if you’d prefer, send them to smary [at] futureofthebook.org and I’ll add them as guest posts.


From time to time, the Institute returns to thorny and intractable thought experiments. One that’s been kicking around for a long time is what we’ve called the “Communist Manifesto problem”: the problem of representing a book and the conversations it engenders over time, conversations which may grow to include other books. (The Communist Manifesto would be a particularly knotty text to render because it’s had so many cultural repercussions. See here and here for past references on this blog.) It’s a good thought experiment because it’s too big to be easily solved, but aspects of it come up on a fairly frequent basis. This past week, I found myself thinking about a particular facet of the Communist Manifesto problem: how we think about re-enactment in the age of the archive.
On Wednesday night, I went to see the Wooster Group’s production of Hamlet. I’m not especially qualified as a theater critic (I’m sure others here can say more intelligent things than I), but the central thrust of this production is simple enough: the actors performing Hamlet perform it in front of video of the 1964 filmed version of the play starring Richard Burton. The Burton version is a filmed play, a form intended to bring theater to the theaterless masses that never quite caught on; the Wooster Group’s actors expertly mime the 1964 actors, and sets are moved balletically to match changes in camera angles in the film. Often the original actors are digitally edited out of the film in whole or in part. It’s a clever idea. Hamlet is as familiar to us as any play can be. Even if you’ve never seen another dramatic or filmic production of the play, the language can’t be escaped: in some stretches, every line has been borrowed as a title for something else. It’s lousy with resonances. We can’t watch Hamlet as a self-contained work of art any more than we can look at the Mona Lisa. The Wooster Group’s production makes this explicit: when we watch Hamlet, we’re watching it against the army of other Hamlets we’ve seen.
scott shepherd and richard burton are both hamlet
This has always been an issue with certain well-known works: Hamlet’s been omnipresent for a long time. The availability of a digital archive, however, has foregrounded this. Before film, theatergoers would be measuring productions against memories of previous productions they’d seen. Now we don’t need to rely on memory: a dozen filmed versions of Hamlet can be queued on Netflix without any trouble, to say nothing of the 4,700 videos that are the results of a YouTube search.
Another cover version: on Thursday night, I went over to Anthology Film Archives to see Raiders of the Lost Ark: The Adaptation. From 1982 to 1989, a group of teens in Mississippi filmed their own scene-for-scene version of Spielberg’s Indiana Jones movie, corralling their friends to play Egyptians, family dogs to play monkeys, and laboriously recreating all but one of the original stunts: they decided there was no way to film a Nazi decapitated by an airplane’s propeller without it looking cheesy, so they left that out. The film & sound quality is muddy, to say the least, but one can’t help but be impressed by what they managed to do. It’s clear that an astonishing amount of work went into the film, still more when you realize that they didn’t have a copy of the original on video to work from. And spending seven years on the project: my youth appears pale and lazy by comparison. Strangely, the makers of the film only bothered to show it once before its rediscovery four years ago.
chris strompolos is a new and improved indiana jones
Once you start thinking about the idea of re-enactment, you start seeing it everywhere. Maybe the argument could be made that we’re in a cultural moment devoted to re-enactment. Much of what we write off as novelty can be put into this category. The Internet recently was excited about old people re-enacting iconic photos of the twentieth century; see also choirs of old people performing Sonic Youth’s “Schizophrenia”. Or choirs of small children doing much the same. But less ironic presentations abound: off the top of my head, Japancakes just released a note-for-note country-inflected cover of Loveless, My Bloody Valentine’s seminal drone-rock record. Going further, German new music ensemble Zeitkratzer has played and recorded Lou Reed’s Metal Machine Music. Tom McCarthy’s excellent recent novel Remainder concerns a wealthy man who maniacally reenacts scenes; McCarthy springs from the art world, which has been interested in re-enactment for a while. Examples spiral on ad infinitum. But there seems to be something in us that wants to see or hear what we’ve seen or heard before again.
These are quickly composed thoughts, and I’m ignoring a great deal; parsing the difference between re-enactment and adaptation could be fiendishly complicated, as might be the role of copyright in all of this, etc. I’ll simply tie this back to the Communist Manifesto problem. I think it’s become apparent that we’re no longer reading texts in isolation: now when we read Hamlet, digital media has made it possible to read any number of possible versions at the same time. The archive presents us with an embarrassment of riches, though I suspect that we still lack the tools to let us make sense of the pile: both to make sense of the growing number of versions of texts and to usefully compare versions. The Wooster Group’s Hamlet can be seen as a close reading of the 1964 Hamlet. But such a one-to-one reading might just be the tip of the iceberg.