Tag Archives: Google

do you remember the first time?

Siva Vaidhyanathan, the Institute’s fellow, is busy writing a book about Google, to be titled The Googlization of Everything. He’s working in public, and right now, he’s interested in hearing stories about how people – that means you! – began to use Google:

Do you remember the first time you used Google? When was it? How did you hear about Google? What was your first impression?
Please use the comments over on The Googlization of Everything to tell me stories.
As Mudbone (Richard Pryor’s character) used to say, “you only remember two times, your first and your last.”

There are a lot of interesting comments there already . . .

kerfluffle at britannica.com

I got a note from someone at Britannica online telling me about a discussion prompted by Clay Shirky’s riposte to Nicholas Carr’s Atlantic article, “Is Google Making Us Stupid?”
The conversation on the Britannica site, and the related posts on John Brockman’s EDGE, remind me as much as anything of the conversational swordplay typical of TV pundits, who are so enamored of their own words that they can barely be bothered to listen to or read each other’s ideas, much less respond sincerely.
(Can it possibly be a coincidence that all the players in this drama are male? Get a grip guys! This is not about scoring points. You’re dealing with issues central to the future of the species and the planet.)
And as long as we’re dealing with missing persons, i was stunned to realize that not one of these media gurus references McLuhan, who as far as i’m concerned, not only asked more profound questions about the effect of media on humans and their society, but provided first-pass answers which we would still do well to heed.
Of the myriad posts and pages that now comprise the Britannica Carr/Shirky discussion, three posts are of particular interest.
The first is from the critic Sven Birkerts, whom many people consider conservative. I don’t. Rather, I see Birkerts as the most eloquent voice on behalf of what we are losing as we shed the culture of the Gutenberg age. Birkerts doesn’t entreat us to stop time or throw wrenches in the wheels of change. He’s just asking us to be conscious of what’s good about the present.
Another is from George Dyson who writes in a way that in my worst nightmares i fear is prescient:

Nicholas Carr asks a question that all of us should be asking ourselves:
“What if the cost of machines that think is people who don’t?”
It’s a risk. “The ancestors of oysters and barnacles had heads. Snakes have lost their limbs and ostriches and penguins their power of flight. Man may just as easily lose his intelligence,” warned J. B. S. Haldane in 1928.

The third is by Blair Boland, and appears as a comment on Nicholas Carr’s response to Shirky. Not only does Boland provide a taut history lesson, setting the record straight on the Luddites, but he states a fundamental issue of our time more clearly than anyone else: “who controls technology and for what ends?”

What both critiques share in common and take for granted is a smugly false and typically misleading disparagement of so-called Luddism. The original, much maligned Luddites are commonly dismissed as cranks, or worse still, “murderous thugs” and the “essential fact” of Luddite “complaint” twisted to serve the ends of propagandists for capital. Ned Ludd and his followers were not necessarily opposed to technological ‘change’ or ‘progress’ per se but the social context in which it occurred and the economic consequences it presaged. As Ludd expressed it, “we will never lay down our arms…[’til]the House of Commons passes an act to put down all machinery Hurtful to Commonality”. They realized that these changes were being undertaken undemocratically for the benefit of a narrow class of economic elites. Luddite anxieties were well founded as was their understanding of the implications for the working class in general, even though they couldn’t have foreseen all of the consequences fully. Their protests and resistance was met with the most aggressive and “murderous” suppression by the British government of the day. Thousands of troops were dispatched to put down the rebellion, not only succeeding in ruthlessly exterminating the Luddite uprising but also serving notice to workers in general of the close bonds between the state and industrialists; and the means that could be employed to discipline intractable workers. The dire conditions of the working class in the new “industrial age” that ensued proved Luddite premonitions largely prophetic. These conditions still exist in many parts of the world. So while it’s fine to fret over the impact of the net on the reading habits of the affluent, the concerns of the Luddites still haven’t gone away. The important principle then as now, is who controls technology and for what ends?
Taylor’s time/motion practices further tightened the hold of the owners of production technology over the wage serfs operating that technology, again in a very undemocratic and restrictive way, “hurtful to commonality”. These, as noted, are the same principles that guide much technological development today and are among the most worrisome aspects of its ultimate applications. “And now we’re facing a similar challenge”, to see that the latent democratizing abundance of the net is not “shaped” into the greatest expansion of social control and commercial concentration of power the world has ever known.

google, digitization and archives: despatches from if:book

In discussing with other Institute folks how to go about reviewing four years’ worth of blog posts, I’ve felt torn at times. Should I cherry-pick ‘thinky’ posts that discuss a particular topic in depth, or draw out narratives from strings of posts each of which is not, in itself, a literary gem but which cumulatively form the bedrock of the blog? But I thought about it, and realised that you can’t really have one without the other.
Fair use, digitization, public domain, archiving, the role of libraries and cultural heritage are intricately interconnected. But the name that connects all these issues over the last few years has been Google. The Institute has covered Google’s incursions into digitization of libraries (amongst other things) in a way that has explored many of these issues – and raised questions that are as urgent as ever. Is it okay to privatize vast swathes of our common cultural heritage? What are the privacy issues around technology that tracks online reading? Where now for copyright, fair use and scholarly research?
In-depth coverage of Google and digitization has helped to draw out many of the issues central to this blog. Thus, to draw forth the narrative of if:book’s Google coverage is, by extension, to watch a political and cultural stance emerging. So in this post I’ve tried to have my cake and eat it – to trace a story, and to give a sense of the depth of thought going into that story’s discussion.
In order to keep things manageable, I’ve kept this post to a largely Google-centric focus. Further reviews covering copyright-related posts, and general discussion of libraries and technology will follow.
2004-5: Google rampages through libraries, annoys Europe, gains rivals
In December 2004, if:book’s first post about Google’s digitization of libraries gave the numbers for the University of Michigan project.
In February 2005, the head of France’s national libraries raised a battle cry against the Anglo-centricity implicit in Google’s plans to digitize libraries. The company’s seemingly relentless advance brought Europe out in force to find ways of forming non-Google coalitions for digitization.
In August, Google halted book scans for a few months to appease publishers angry at encroachments on their copyright. But this was clearly not enough, as in October 2005, Google was sued (again) by a string of publishers for massive copyright infringement. However, undeterred either by European hostility or legal challenges, the same month the company made moves to expand Google Print into Europe. Also in October 2005, Yahoo! launched the Open Content Alliance, which was joined by Microsoft around the same time. Later the same month, a Wired article put the case for authors in favor of Google’s searchable online archive.
In November 2005 Google announced that from here on in Google Print would be known as Google Book Search, as the ‘Print’ reference perhaps struck too close to home for publishers. The same month, Ben savaged Google Print’s ‘public domain’ efforts – then recanted (a little) later that month.
In December 2005 Google’s digitization was still hot news – the Institute did a radio show/podcast with Open Source on the topic, and covered the Google Book Search debate at the American Bar Association. (In fact, most of that month’s posts are dedicated to Google and digitization and are too numerous to do justice to here).
2006: Digitization spreads
By 2006, digitization and digital archives – with attendant debates – are spreading. From January through March, three posts – ‘The book is reading you’ parts 1, 2 and 3 – looked at privacy, networked books, fair use, downloading and copyright around Google Book Search. Also in March, a further post discussed Google and Amazon’s incursions into publishing.
In April, the Smithsonian cut a deal with Showtime making the media company a preferential media partner for documentaries using Smithsonian resources. Jesse analyzed the implications for open research.
In June, the Library of Congress and partners launched a project to make vintage newspapers available online. Google Book Search, meanwhile, was tweaked to reassure publishers that the new dedicated search page was not, in fact, a library. The same month, Ben responded thoughtfully to a French book attacking Google, and by extension America, for cultural imperialism. The debate continued with a follow-up post in July.
In August, Google announced downloadable PDF versions of many of its public-domain books. Also in August, the publication of Google’s contract with UCAL’s library prompted some debate. In October we reported on Microsoft’s growing book digitization list, and some criticism of the same from Brewster Kahle. The same month, we reported that the Dutch government is pouring millions into a vast public digitization program.
In December, Microsoft launched its (clunkier) version of Google Books, Microsoft Live Book Search.

2007: Google is the environment
In January, former Netscape player Rich Skrenta crowned Google king of the ‘third age of computing’: ‘Google is the environment’, he declared. Meanwhile, having seemingly forgotten 2005’s tussles, the company hosted a publishing conference at the New York Public Library. In February the company signed another digitization deal, this time with Princeton; in August, this institution was joined by Cornell, and the Economist compared Google’s databases to the banking system of the information age. The following month, Siva’s first Monday podcast discussed the Googlization of libraries.
By now, while Google remains a theme, commercial digitization of public-domain archives is a far broader issue. In January, the US National Archives cut a digitization deal with Footnote, effectively paywalling digital access to a slew of public-domain documents; in August, a deal followed with Amazon for commercial distribution of its film archive. The same month, two major audiovisual archiving projects launched.
In May, Ben speculated about whether some ‘People’s Card Catalog’ could be devised to rival Google’s gated archive. The Open Archive launched in July, to mixed reviews – the same month that the ongoing back-and-forth between the Institute and academic Siva Vaidhyanathan bore fruit. Siva’s networked writing project, The Googlization Of Everything, was announced (this would be launched in September). Then, in August, we covered an excellent piece by Paul Duguid discussing the shortcomings of Google’s digitization efforts.
In October, several major American libraries refused digitization deals with Google. By November, Google and digitization had found their way into the New Yorker; the same month the Library of Congress put out a call for e-literature links to be archived.

2008: All quiet?
In January we reported that LibraryThing interfaces with the British Library, and in March on the launch of an API for Google Books. Siva’s book found a print publisher the same month.
But if Google coverage has been slighter this year, that’s not to suggest a happy ending to the story. Microsoft abandoned its book scanning project in mid-May of this year, raising questions about the viability of the Open Content Alliance. It would seem as though Skrenta was right. The Googlization of Everything continues, less challenged than ever.

why google and yahoo love wikipedia

From Dan Cohen’s excellent Digital Humanities Blog comes a discussion of the Wikipedia story that Cohen claims no one seems to be writing about: namely, the question of why Google and Yahoo give so much free server space and bandwidth to Wikipedia. Cohen points out that there’s more going on here than just the open source ethos of these tech companies: in fact, the two companies are becoming increasingly dependent on Wikipedia as a resource, both as something to repackage for commercial use (in sites such as Answers.com), and as a major component in the programming of search algorithms. Cohen writes:
Let me provide a brief example that I hope will show the value of having such a free resource when you are trying to scan, sort, and mine enormous corpora of text. Let’s say you have a billion unstructured, untagged, unsorted documents related to the American presidency in the last twenty years. How would you differentiate between documents that were about George H. W. Bush (Sr.) and George W. Bush (Jr.)? This is a tough information retrieval problem because both presidents are often referred to as just “George Bush” or “Bush.” Using data-mining algorithms such as Yahoo’s remarkable Term Extraction service, you could pull out of the Wikipedia entries for the two Bushes the most common words and phrases that were likely to show up in documents about each (e.g., “Berlin Wall” and “Barbara” vs. “September 11” and “Laura”). You would still run into some disambiguation problems (“Saddam Hussein,” “Iraq,” “Dick Cheney” would show up a lot for both), but this method is actually quite a powerful start to document categorization.
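To make Cohen's idea concrete, here is a minimal sketch in Python. It rests on several assumptions. It does not call Yahoo's Term Extraction service, and stands in a crude "words unique to one entry" rule for it. The two reference texts are short hypothetical summaries written for this illustration, not actual Wikipedia entries. The names `ENTRIES`, `distinctive_terms` and `classify` are invented here.

```python
import re
from collections import Counter

# Small hypothetical stopword list, just enough for this toy example.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "was", "his",
             "he", "with", "on", "when", "about"}

# Toy stand-ins for the two reference entries (NOT real Wikipedia text).
ENTRIES = {
    "George H. W. Bush": (
        "George Bush was president when the Berlin Wall fell. "
        "His wife was Barbara. Saddam Hussein, Iraq and Dick Cheney "
        "figured in his presidency."
    ),
    "George W. Bush": (
        "George Bush was president on September 11. "
        "His wife was Laura. Saddam Hussein, Iraq and Dick Cheney "
        "figured in his presidency."
    ),
}


def terms(text):
    """Return lowercase word tokens with stopwords removed."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def distinctive_terms(entries):
    """For each label, keep the terms that occur in no other entry."""
    vocab = {label: set(terms(text)) for label, text in entries.items()}
    profiles = {}
    for label, words in vocab.items():
        others = set().union(*(v for k, v in vocab.items() if k != label))
        profiles[label] = words - others
    return profiles


def classify(document, profiles):
    """Pick the label whose distinctive terms occur most often.

    Returns None when no distinctive term appears or the scores tie,
    which is the disambiguation problem Cohen mentions.
    """
    counts = Counter(terms(document))
    scores = {label: sum(counts[t] for t in profile)
              for label, profile in profiles.items()}
    best = max(scores.values())
    winners = [label for label, score in scores.items() if score == best]
    if best == 0 or len(winners) > 1:
        return None
    return winners[0]


if __name__ == "__main__":
    profiles = distinctive_terms(ENTRIES)
    for doc in ("Bush and Barbara visited Berlin",
                "Laura Bush recalled September 11",
                "Bush spoke about Iraq with Dick Cheney"):
        print(doc, "->", classify(doc, profiles))
```

Terms shared by both entries ("Iraq", "Cheney") drop out of both profiles, so a document that mentions only those stays unclassified. A real system would weight terms rather than discard them outright, but the shape of the method is the same.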
Cohen’s observation is a valuable reminder that all of the discussion of Wikipedia’s accuracy and usefulness as an academic tool is really only skimming the surface of how and why the open-source encyclopedia is reshaping the way knowledge is made and accessed. Ultimately, the question of whether or not Wikipedia should be used in the classroom might be less important than whether, or how, it is used in the boardroom, by companies whose function is to repackage, reorganize and return “the people’s knowledge” to the people at a tidy profit.

virtual libraries, real ones, empires

Last Tuesday, a Washington Post editorial written by Library of Congress librarian James Billington outlined the possible benefits of a World Digital Library, a proposed LOC endeavor discussed last week in a post by Ben Vershbow. Billington seemed to imagine the library as sort of a United Nations of information: claiming that “deep conflict between cultures is fired up rather than cooled down by this revolution in communications,” he argued that a US-sponsored, globally inclusive digital library could serve to promote harmony over conflict:
Libraries are inherently islands of freedom and antidotes to fanaticism. They are temples of pluralism where books that contradict one another stand peacefully side by side just as intellectual antagonists work peacefully next to each other in reading rooms. It is legitimate and in our nation’s interest that the new technology be used internationally, both by the private sector to promote economic enterprise and by the public sector to promote democratic institutions. But it is also necessary that America have a more inclusive foreign cultural policy — and not just to blunt charges that we are insensitive cultural imperialists. We have an opportunity and an obligation to form a private-public partnership to use this new technology to celebrate the cultural variety of the world.
What’s interesting about this quote (among other things) is that Billington seems to be suggesting that a World Digital Library would function in much the same manner as a real-world library, and yet he’s also arguing for the importance of actual physical proximity. He writes, after all, about books literally, not virtually, touching each other, and about researchers meeting up in a shared reading room. There seems to be a tension here, in other words, between Billington’s embrace of the idea of a world digital library, and a real anxiety about what a “library” becomes when it goes online.
I also feel like there’s some tension here — in Billington’s editorial and in the whole World Digital Library project — between “inclusiveness” and “imperialism.” Granted, if the United States provides Brazilians access to their own national literature online, this might be used by some as an argument against the idea that we are “insensitive cultural imperialists.” But there are many varieties of empire: indeed, as many have noted, the sun stopped setting on Google’s empire a while ago.
To be clear, I’m not attacking the idea of the World Digital Library. Having watched the Smithsonian invest in, and waffle on, some of their digital projects, I’m all for a sustained commitment to putting more material online. But there needs to be some careful consideration of the differences between real-world libraries and virtual ones, as well as a bit more discussion of just what a privately-funded digital library might eventually morph into.