Google announced today that it has made free downloadable PDFs available for many of the public domain books in its database. This is a good thing, but there are several problems with how they’ve done it. The main one is that these PDFs aren’t actually text; they’re simply strings of images from the scanned library books. As a result, you can’t select and copy text, nor can you search the document, unless, of course, you do it online in Google. So while public access to these books is a big win, Google still has us locked into its system if we want to take advantage of these books as digital texts.
A small note about the public domain: editions are key. A large number of the books scanned so far by Google have contents in the public domain but are in editions published after the cut-off (I think we’re talking 1923 for most books). Take this 2003 Signet Classic edition of Darwin’s The Origin of Species. The text is clearly in the public domain, but the book is in “limited preview” mode on Google because the edition contains an introduction written in 1958. Copyright experts out there: is it just this that makes the book off limits? Or is the whole edition somehow copyrighted?
Other responses come from Teleread and Planet PDF, the latter of which has some detailed suggestions on how Google could improve this service.