Category Archives: brewster_kahle

emerging libraries at rice: day one

For the next few days, Bob and I will be at the De Lange “Emerging Libraries” conference hosted by Rice University in Houston, TX, coming to you live with occasional notes, observations and overheard nuggets of wisdom. Representatives from some of the world’s leading libraries are here: the Library of Congress, the British Library, the new Bibliotheca Alexandrina, as well as the architects of recent digital initiatives like the Internet Archive, arXiv.org and the Public Library of Science. A very exciting gathering indeed.
We’re here, at least in part, with our publisher hat on, thinking quite a lot these days about the convergence of scholarly publishing with digital research infrastructure (i.e. MediaCommons). It was fitting then that the morning kicked off with a presentation by Richard Baraniuk, founder of the open access educational publishing platform Connexions. Connexions, which last year merged with the digitally reborn Rice University Press, is an innovative repository of CC-licensed courses and modules, built on an open volunteer basis by educators and freely available to weave into curricula and custom-designed collections, or to remix and recombine into new forms.
Connexions is designed not only as a first-stop resource but as a foundational layer upon which richer and more focused forms of access can be built. Foremost among those layers, of course, is Rice University Press, which, apart from using the Connexions publishing framework, will still operate like a traditional peer review-driven university press. But other scholarly and educational communities are also encouraged to construct portals, or “lenses” as they call them, to specific areas of the Connexions corpus, possibly filtered through post-publication peer review. It will be interesting to see whether Connexions really will end up supporting these complex external warranting processes or if it will continue to serve more as a building block repository — an educational lumber yard for educators around the world.
Constructive crit: there’s no doubt that Connexions is one of the most important and path-breaking scholarly publishing projects out there, though it still feels to me more like backend infrastructure than a fully developed networked press. It has a flat, technical-feeling design and cookie-cutter templates that give off a homogeneous impression in spite of the great diversity of materials. The social architecture is also quite limited, and what little is there (ways to suggest edits and discussion forums attached to modules) is not well integrated with course materials. There’s an opportunity here to build more tightly knit communities around these offerings — lively feedback loops to improve and expand entries, areas to build pedagogical tutorials and to collect best practices, and generally more ways to build relationships that could lead to further collaboration. I got to chat with some of the Connexions folks and the head of the Rice press about some of these social questions and they were very receptive.

*     *     *     *     *

Michael A. Keller of Stanford spoke of emerging “cybraries” and went through some very interesting and very detailed elements of online library search that I’m too exhausted to summarize now. He capped off his talk with a charming tour through the Stanford library’s Second Life campus and the library complex on Information Island. Keller said he ultimately doesn’t believe that purely imitative virtual worlds will become the principal interface to libraries but that they are nonetheless a worthwhile area for experimentation.
Browsing during the talk, I came across an interesting and similarly skeptical comment by Howard Rheingold on a long-running thread on Many 2 Many about Second Life and education:

I’ve lectured in Second Life, complete with slides, and remarked that I didn’t really see the advantage of doing it in SL. Members of the audience pointed out that it enabled people from all over the world to participate and to chat with each other while listening to my voice and watching my slides; again, you don’t need an immersive graphical simulation world to do that. I think the real proof of SL as an educational medium with unique affordances would come into play if an architecture class was able to hold sessions within scale models of the buildings they are studying, if a biochemistry class could manipulate realistic scale-model simulations of protein molecules, or if any kind of lesson involving 3D objects or environments could effectively simulate the behaviors of those objects or the visual-auditory experience of navigating those environments. Just as the techniques of teleoperation that emerged from the first days of VR ended up as valuable components of laparoscopic surgery, we might see some surprise spinoffs in the educational arena. A problem there, of course, is that education systems suffer from a great deal more than a lack of immersive environments. I’m not ready to write off the educational potential of SL, although, as noted, the importance of that potential should be seen in context. In this regard, we’re still in the early days of the medium, similar to cinema in the days when filmmakers nailed a camera tripod to a stage and filmed a play; SL needs a D. W. Griffith to come along and invent the equivalent of close-ups, montage, etc.

Rice too has some sort of Second Life presence and apparently was beaming the conference into Linden land.

*     *     *     *     *

Next came a truly mind-blowing presentation by Noha Adly of the Bibliotheca Alexandrina in Egypt. Though only five years old, the BA casts itself quite self-consciously as the direct descendant of history’s most legendary library, the one so frequently referenced in contemporary utopian rhetoric about universal digital libraries. The new BA glories in this old-new paradigm, stressing continuity with its illustrious past and at the same time envisioning a breathtakingly modern 21st century institution unencumbered by the old thinking and constrictive legacies that have so many other institutions tripping over themselves into the digital age. Adly surveyed more fascinating-sounding initiatives, collections and research projects than I can possibly recount. I recommend investigating their website to get a sense of the breadth of activity that is going on there. I will, however, note that they are the only library in the world to house a complete copy of the Internet Archive: 1.5 petabytes of data on nearly 900 computers.
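A quick back-of-the-envelope calculation (my own, not from the talk) puts those figures in perspective: spread evenly, 1.5 petabytes across nearly 900 machines works out to a little under two terabytes per computer.

```python
# Back-of-the-envelope: storage per machine in the BA's Internet Archive copy.
# Figures from the presentation: 1.5 petabytes across nearly 900 computers.
TOTAL_PETABYTES = 1.5
MACHINES = 900

total_terabytes = TOTAL_PETABYTES * 1000  # 1 PB = 1000 TB (decimal units)
per_machine_tb = total_terabytes / MACHINES

print(f"{per_machine_tb:.2f} TB per machine")  # prints: 1.67 TB per machine
```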
(Speaking of the IA, Brewster Kahle is also here and is closing the conference Wednesday afternoon. He brought with him a test model of the hundred dollar laptop, which he showed off at dinner (pic to the right) in tablet mode sporting an e-book from the Open Content Alliance’s children’s literature collection (a scanned copy of The Owl and the Pussycat)).
And speaking of old thinking and constrictive legacies, following Adly was Deanna B. Marcum, an associate librarian at the Library of Congress. Marcum seemed well aware of the big picture but gave off a strong impression of having hands tied by a change-averse institution that has still not come to grips with the basic fact of the World Wide Web. It was a numbing hour and made one palpably feel the leadership vacuum left by the LOC in the past decade, which among other things has allowed Google to move in and set the agenda for library digitization.
Next came Lynne J. Brindley, Chief Executive of the British Library, which is like apples to the LOC’s oranges. Slick, publicly engaged and with pockets deep enough to really push the technological envelope, the British Library is making a very graceful and sometimes flashy (Turning the Pages) migration to the digital domain. Brindley had many keen insights to offer and described several BL experiments that really challenge the conventional wisdom on library search and exhibitions. I was particularly impressed by these “creative research” features: short, evocative portraits of a particular expert’s idiosyncratic path through the collections; a clever way of featuring slices of the catalogue through the eyes of impassioned researchers (e.g. here). The next step would be to open this up and allow the public to build their own search profiles.

*     *     *     *     *

That more or less covers today with the exception of a final keynote talk by John Seely Brown, which was quite inspiring and included a very kind mention of our work at MediaCommons. It’s been a long day, however, and I’m fading. So I’ll pick that up tomorrow.

brewster kahle on the google book search “nightmare”

“Pretty much Google is trying to set themselves up as the only place to get to these materials; the only library; the only access. The idea of having only one company control the library of human knowledge is a nightmare.”
From a video interview with Elektrischer Reporter.
(via Google Blogoscoped)

showtiming our libraries

Google’s contract with the University of California to digitize library holdings was made public today after pressure from The Chronicle of Higher Education and others. The Chronicle discusses some of the key points in the agreement, including the astonishing fact that Google plans to scan as many as 3,000 titles per day, and its commitment, at UC’s insistence, to always make public domain texts freely and wholly available through its web services.
But there are darker revelations as well, and Jeff Ubois, a TV-film archivist and research associate at Berkeley’s School of Information Management and Systems, homes in on some of these on his blog. Around the time that the Google-UC deal was first announced, Ubois compared it to Showtime’s now-infamous compact with the Smithsonian, which caused a ripple of outrage this past April. That deal, the details of which are secret, basically gives Showtime exclusive access to the Smithsonian’s film and video archive for the next 30 years.
The parallels to the Google library project are many. Four of the six partner libraries, like the Smithsonian, are publicly funded institutions. And all the agreements, with the exceptions of U. Michigan’s and now UC’s, are under non-disclosure. Brewster Kahle, leader of the rival Open Content Alliance, put the problem clearly and succinctly in a quote in today’s Chronicle piece:

We want a public library system in the digital age, but what we are getting is a private library system controlled by a single corporation.

He was referring specifically to sections of this latest contract that greatly limit UC’s use of the Google copies and would bar the university from pooling those copies in cooperative library systems. I vocalized these concerns rather forcefully in my post yesterday, and may have gotten a couple of details wrong, or slightly overstated the point about librarians ceding their authority to Google’s algorithms (some of the pushback in comments and on other blogs has been very helpful). But the basic points still stand, and the revelations today from the UC contract serve to underscore them. This ought to galvanize librarians, educators and the general public to ask tougher questions about what Google and its partners are doing. Of course, all these points could be rendered moot by one or two bad decisions from the courts.

u.c. offers up stacks to google

The APT BookScan 1200. Not what Google and OCA are using (their scanners are human-assisted), just a cool photo.

Less than two months after reaching a deal with Microsoft, the University of California has agreed to let Google scan its vast holdings (over 34 million volumes) into the Book Search database. Google will undoubtedly dig deeper into the holdings of the ten-campus system’s 100-plus libraries than Microsoft, which is a member of the more copyright-cautious Open Content Alliance, and will focus primarily on books unambiguously in the public domain. The Google-UC alliance comes as major lawsuits against Google from the Authors Guild and Association of American Publishers are still in the evidence-gathering phase.
Meanwhile, across the drink, French publishing group La Martinière in June brought suit against Google for “counterfeiting and breach of intellectual property rights.” Pretty much the same claim as the American industry plaintiffs. Later that month, however, German publishing conglomerate WBG dropped a petition for a preliminary injunction against Google after a Hamburg court told them that they probably wouldn’t win. So what might the future hold? The European crystal ball is murky at best.
During this period of uncertainty, the OCA seems content to let Google be the legal lightning rod. If Google prevails, however, Microsoft and Yahoo will have a lot of catching up to do in stocking their book databases. But the two efforts may not be in such close competition as it would initially seem.
Google’s library initiative is an extremely bold commercial gambit. If it wins its cases, it stands to make a great deal of money, even after the tens of millions it is spending on scanning and indexing billions of pages, from a tiny commodity: the text snippet. But far from being the seed of a new literary remix culture, as Kevin Kelly would have us believe (and John Updike would have us lament), the snippet is simply an advertising hook for a vast ad network. Google’s not the Library of Babel; it’s the most sublimely sophisticated advertising company the world has ever seen (see this funny reflection on “snippet-dangling”). The OCA, on the other hand, is aimed at creating a legitimate online library, where books are not a means for profit, but an end in themselves.
Brewster Kahle, the founder and leader of the OCA, has a rather immodest aim: “to build the great library.” “That was the goal I set for myself 25 years ago,” he told The San Francisco Chronicle in a profile last year. “It is now technically possible to live up to the dream of the Library of Alexandria.”
So while Google’s venture may be more daring, more outrageous, more exhaustive, more — you name it — the OCA may, in its slow, cautious, more idealistic way, be building the foundations of something far more important and useful. Plus, Kahle’s got the Bookmobile. How can you not love the Bookmobile?

wikimania: the importance of naming things

I’ll write up what happened on the second day of Wikimania soon – I saw a lot of talks about education – but a quick observation for now. Brewster Kahle delivered a speech after lunch entitled “Universal Access to All Knowledge”, detailing his plans to archive just about everything ever, and the various issues he’s confronted along the way, not least Jack Valenti. Kahle learned from Valenti: it’s important to frame the terms of the debate. Valenti explained filesharing by declaring that it was Artists vs. Pirates, an obscuring dichotomy, but one that keeps popping up. Kahle was happy that he’d succeeded in creating a catch phrase in naming “orphan works” – a term no less loaded – before the partisans of copyright could.

Wikimania is dominated by Wikipedia, but it’s not completely about Wikipedia – it’s about wikis more generally, of which Wikipedia is by far the largest. There are people here using wikis to do immensely different things: creating travel guides, building repositories of lesson plans for K–12 teachers, running the State Department’s repositories of information. Many of these are built using MediaWiki, the software that runs Wikipedia, but not all by any means. All sorts of different platforms have been made to create websites that can be edited by users. All of these fall under the rubric “wiki”. We could just as accurately refer to wikis as “collaboratively written websites”, the least common denominator of all of these sites. I’d argue that the word has something to do with the success of the model: nobody would feel any sense of kinship about making “collaboratively written websites” – that’s a nebulous concept – but when you slap the name “wiki” on it, you have something easily understood, a form about which people can become fanatical.

microsoft joins open content alliance

Microsoft’s forthcoming “MSN Book Search” is the latest entity to join the Open Content Alliance, the non-controversial rival to Google Print. ZDNet says: “Microsoft has committed to paying for the digitization of 150,000 books in the first year, which will be about $5 million, assuming costs of about 10 cents a page and 300 pages, on average, per book…”
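ZDNet’s round number checks out, more or less. Using their own stated assumptions (the figures below are theirs, not mine), the arithmetic comes to $4.5 million, which they round up to “about $5 million”:

```python
# Rough cost check for Microsoft's first-year digitization commitment,
# using the assumptions quoted by ZDNet.
books = 150_000
pages_per_book = 300   # average pages per book
cost_per_page = 0.10   # dollars per scanned page

total = books * pages_per_book * cost_per_page
print(f"${total:,.0f}")  # prints: $4,500,000
```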
Apparently having learned from Google’s mistakes, OCA operates under a strict “opt-in” policy for publishers vis-a-vis copyrighted works (whereas with Google, publishers have until November 1 to opt out). Judging by the growing roster of participants, including Yahoo, the National Archives of Britain, the University of California, Columbia University, and Rice University, not to mention the Internet Archive, it would seem that less hubris equals more results, or at least lower legal fees. Supposedly there is some communication between Google and OCA about potential cooperation.
Also story in NY Times.

yahoo! announces book-scanning project to rival google’s

Yahoo, in collaboration with The Internet Archive, Adobe, O’Reilly Media, Hewlett Packard Labs, the University of California, the University of Toronto, The National Archives of England, and others, will be participating in The Open Content Alliance, a book and media archiving project that will greatly enlarge the body of knowledge available online. At first glance, it appears the program will focus primarily on public domain works, and in the case of copyrighted books, will seek to leverage the Creative Commons.
Google Print, on the other hand, is more self-consciously a marketing program for publishers and authors (although large portions of the public domain will be represented as well). Google aims to make money off its indexing of books through keyword advertising and click-throughs to book vendors. Yahoo throwing its weight behind the “open content” movement seems on the surface to be more of a philanthropic move, but clearly expresses a concern over being outmaneuvered in the search wars. But having this stuff available online is clearly a win for the world at large.
The Alliance was conceived in large part by Brewster Kahle of the Internet Archive. He announced the project on Yahoo’s blog:

To kick this off, Internet Archive will host the material and sometimes helps with digitization, Yahoo will index the content and is also funding the digitization of an initial corpus of American literature collection that the University of California system is selecting, Adobe and HP are helping with the processing software, University of Toronto and O’Reilly are adding books, Prelinger Archives and the National Archives of the UK are adding movies, etc. We hope to add more institutions and fine tune the principles of working together.
Initial digitized material will be available by the end of the year.

More in:
NY Times
Chronicle of Higher Ed.