Category Archives: search

to some writers, google print sounds like a sweet deal

Wired has a piece today about authors who are in favor of Google’s plans to digitize millions of books and make them searchable online. Most seem to agree that obscurity is a writer’s greatest enemy, and that the exposure afforded by Google’s program far outweighs any intellectual property concerns. Sometimes to get more you have to give a little.
The article also mentions the institute.

google is sued… again

This time by publishers: Penguin Group USA, McGraw-Hill, Pearson Education, Simon & Schuster and John Wiley & Sons. The gripe is the same as that of the Authors Guild, which filed suit last month alleging “massive copyright infringement.” Publishers fear a dangerous precedent is set by Google’s scanning of books to construct what amounts to a giant card catalogue on the web. Google claims “fair use” (see rationale), again pointing out that for copyrighted works only tiny “snippets” of text are displayed around keywords (though perhaps this is not yet fully in effect – I was searching around in this book and was able to look at quite a lot).
Google calls the publishers’ suit “near-sighted.” And it probably is. The benefit to readers and researchers will be tremendous, as will (Google is eager to point out) the exposure for authors and publishers. But Google Print is undoubtedly an earth-shaking program. Look at the reaction in Europe, where alarm bells rung by France warned of cultural imperialism, an English-drenched web. Heads of state and culture convened and initial plans for a European digital library have been drawn up.
What the transatlantic flap makes clear is that Google’s book scanning touches a deep nerve, and the argument over intellectual property, significant though it is, distracts from a more profound human anxiety: an anxiety about the form of culture and the shape of thoughts. If we try to grope back through the millennia, we can find an analogy in the invention of writing.
The shift from oral to written language froze speech into stable strings that could be transmitted and stored over distance and time. This change not only affected the modes of communication, it dramatically refigured the cognitive makeup of human beings (as McLuhan, Ong and others have described). We are currently going through another such shift. The digital takes the freezing medium of text and throws it back into fluidity. Like the melting of polar ice caps, it unsettles equilibriums, changes weather patterns. It is a lot to adjust to, and we wonder if our great-great-grandchildren will literally think differently from us.
But in spite of this disorienting new fluidity, we still have print, we still have the book. And actually, Google Print in many ways affirms this since its search returns will point to print retailers and brick-and-mortar libraries. Yet the fact remains that the canon is being scanned, with implications we can’t fully perceive, and future uses we can’t fully predict, and so it is understandable that many are unnerved. The ice is really beginning to melt.
In Phaedrus, Plato expresses a similar anxiety about the invention of writing. He tells the tale of Theuth, an Egyptian deity who goes around spreading the new technology, and one day encounters a skeptic in King Thamus:

…you who are the father of letters, from a paternal love of your own children have been led to attribute to them a power opposite to that which they in fact possess. For this discovery of yours will create forgetfulness in the minds of those who learn to use it; they will not exercise their memories, but, trusting in external, foreign marks, they will not bring things to remembrance from within themselves. You have discovered a remedy not for memory, but for reminding. You offer your students the appearance of wisdom, not true wisdom. They will be hearers of many things and will have learned nothing; they will appear to be omniscient and will generally know nothing; they will be tiresome company, having the show of wisdom without the reality.

As I type, I’m exhibiting wisdom without the reality. I’ve read Plato, but nowhere near exhaustively. Yet I can slash and weave texts on the web in seconds, throw together a blog entry and send it screeching into the commons. And with Google Print I can get the quote I need and let the rest of the book rot behind the security fence. This fluidity is dangerous because it makes connections so easy. Do we know what we are connecting?

google expands book-scanning project to europe

This week Google will be paying a visit to the Frankfurt Book Fair to talk with European publishers and chief librarians (including arch nemesis Jean-Noël Jeanneney) about eight new local incarnations of Google Print. (more)

news and blogs to live under one roof at yahoo!

Yahoo’s revamped news search will present news and blogs side by side on the same page. In addition, the site will feature related images from Flickr, the social photo-sharing site that Yahoo purchased earlier this year, as well as user-contributed links from My Web (a feature that allows you to save and store web pages, and share them with others).
As before, the front news page will promote only stories from mainstream media sources, while the blog-news combo appears on a second-tier page that you arrive at when you conduct a specific search, or click for more details or more stories. No doubt, this was done, at least in part, to mollify news outlets that would likely cry foul at having hard news share space with blogs. Still, the webscape has changed. All but the most cursory glance at the headlines will yield a richly confusing array of mainstream and grassroots sources.
(story, Yahoo Search Blog)
(thoughtful analysis from Tim Porter)

google dystopia

Google as big brother: “Op-Art” by Randy Siegel from today’s NY Times.

human versus algorithm

I just came across Common Times, a new community-generated news aggregation page, part of something called the Common Media Network, that takes the social bookmarking concept of del.icio.us and applies it specifically to news gathering. Anyone can add a story from any source to a series of sections (which seem pre-set and non-editable) arranged on a newspaper-style “front page.” You add links through a bookmarklet on the links bar on your browser. Whenever you come across an article you’d like to submit, you just click the button and a page comes up where you can enter the metadata like tags and comments. Each user has a “channel” – basically a stripped-down blog – where all their links are displayed chronologically with an RSS feed, giving individuals a venue to show their chops as news curators and annotators. You can set it up so links are posted simultaneously to a del.icio.us account (there’s also a Firefox extension that allows you to post stories directly from Bloglines).
Human aggregation is often more interesting than what the Google News algorithm can turn up, but it can easily mould to the biases of the community. Of course, search algorithms are developed by people, and source lists don’t just manufacture themselves (Google is notoriously tight-lipped about its list of news sources). In the case of something like Common Times, a slick new web application hyped on Boing Boing and other digital culture sites, the communities can be rather self-selecting. Still, this is a very interesting experiment in multi-player annotation. When I first arrived at the front page, not yet knowing how it all worked, I was impressed by the fairly broad spread of stories. And the tag cloud to the right is an interesting little snapshot of the zeitgeist.
(via Infocult)

yahoo! hires finance writers

Following Kevin Sites in the Hot Zone, Yahoo! takes another step in its transformation into original content provider (see Wall Street Journal – free), though the company says it has no intention of becoming a full-fledged news service.
Yahoo’s move suggests increased specialization and atomization of news media on the web, as full-fledged news services find it increasingly hard to stay afloat (as the recent wave of staff cuts at major papers suggests). As newspapers agonize over how to make more money from their websites (e.g. Times Select), companies with diverse revenue bases (like the big search portals) will find it a lot easier to deliver the news. But it will be a stripped down service, heavy on features. Can the news media as public trust survive this process of atomization? Or was the idea of a public trust always a fairy tale?

the database of intentions

Interesting edition of Open Source last week on “Google Sociology” with David Weinberger and John Battelle, author of the just-published “The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture”. Listen here.
Weinberger has some interesting things to say about Google (and the other search engines) as “publishers.” I have some thoughts on that too. More to come later.
Battelle has done a great deal of thinking on search from a variety of angles: the technology of search, the economics of search, and the more esoteric dimensions of a “search” culture. He touches briefly on this last point, laying out a construct that is probably treated more extensively in his book: the “database of intentions.” By this he means the archive, or “artifact,” of the world’s search queries. A picture of the collective consciousness formed by the questions everyone is asking. Even now, when logged in to Google, a history of all your search query strings is kept – your own database of intentions. The potential value of this database is still being determined, but obvious uses are targeted advertising, and more relevant search results based on analysis of search histories.
As regards the collective database of intentions, Battelle speculates that future advances in artificial intelligence will likely draw on this enormous crop of information about how humans think and seek.

google blog search – still a long way to go

Google’s new blog search engine reminds me of how far we still have to go with blog search. The engine works much the same way as Google’s general web search – with keywords and page ranking – only here it’s searching RSS feeds. Recent posts with keyword matches fill the column, and a few links to related blogs come up at the top. But there’s the rub. These so-called “related” blogs are only related by direct keyword matches in their title tagline. I just searched “poetry” and came up with only three related blogs. C’mon. A search for “gossip” turns up only one related blog – “Starbucks Gossip”. There has to be some kind of promotion going on here, though their “about” page mentions nothing of the kind.
A good engine would be capable of searching blogs by their subject, their preoccupation, their obsession. Many blogs could be considered “general,” but just as many have a special focus, and readers are often searching with a particular theme in mind. They don’t just want a list of transient posts, but whole sites that might potentially become regular destinations. Many blogs are valuable publications that prove themselves day after day. But blog search hasn’t yet grown beyond the trendy “what’s the latest chatter on the blogosphere” mode.
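To make the distinction concrete, here is a minimal sketch in Python of the two approaches: matching a query against blog titles only (roughly what the "related blogs" results appear to do, judging from the outside) versus ranking blogs by how much of their writing actually touches the subject. The sample blogs and posts, the function names, and the `min_share` threshold are all invented for illustration; none of this reflects how Google or anyone else actually builds a blog index.

```python
# Hypothetical sample data: blog title -> list of post texts.
BLOGS = {
    "Starbucks Gossip": ["New latte prices announced", "Barista union news"],
    "Verse Daily": [
        "A poem about rain",
        "Notes on poetry and meter",
        "Poetry reading tonight",
    ],
    "The Poetry Corner": ["Site update"],
    "Kitchen Notes": ["A recipe for bread", "Poetry of sourdough"],
}


def title_match(query, blogs):
    """Return blogs whose title contains the query (case-insensitive)."""
    q = query.lower()
    return [title for title in blogs if q in title.lower()]


def subject_match(query, blogs, min_share=0.6):
    """Return blogs ranked by the share of their posts mentioning the query.

    A blog qualifies only if at least `min_share` of its posts mention the
    query, a crude stand-in for "this blog is preoccupied with the subject".
    """
    q = query.lower()
    scored = []
    for title, posts in blogs.items():
        if not posts:
            continue
        share = sum(q in post.lower() for post in posts) / len(posts)
        if share >= min_share:
            scored.append((share, title))
    # Highest share first.
    return [title for share, title in sorted(scored, reverse=True)]


if __name__ == "__main__":
    print("title only:", title_match("poetry", BLOGS))
    print("by subject:", subject_match("poetry", BLOGS))
```

With this toy data, a title search for "poetry" finds only the blog with the word in its name, which has never posted on the subject, while the subject search surfaces the blog that writes about poetry regularly.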
I do have to give credit to Technorati. Glitchy as it is, it is trying to think of creative ways – tagging, author-determined keywords – to help readers find interesting blogs and authors find their audience. Then again, my greatest finds have usually been from other blogs. Humans will always be the smartest aggregators.
People out there, what do you use?

yahoo! experiments with multimedia journalism

Yahoo! has enlisted tele-journalist and blogger Kevin Sites to produce a one-year web program chronicling the world’s conflict zones in multimedia format.
Sites has become known for his jaunts as a “solo journalist,” trundling from hot spot to hot spot with a backpack full of gadgetry, beaming reports from his one-man broadcast station. It’s a formula that is tailor-made for the web. Clearly, Yahoo! was paying attention. The NY Times reports on “Kevin Sites In the Hot Zone”:

As he travels to these places, Mr. Sites will write a 600- to 800-word dispatch each day and produce a slide show of 5 to 10 digital photographs. He will also narrate audio travelogues. There will be several forms of video – relatively unedited footage posted several times a week, and once a week, a more traditional video report, edited in the style of a network news broadcast.
Mr. Sites will also be the host of regular online chats with Yahoo users who will be able to post comments on message boards. And he will post quick text messages on the site updating his activities throughout the day.

Counting on war and carnage as a surefire crowd draw, Yahoo! makes a rather tawdry entrance into independent journalism. But this is a very significant move nonetheless, evidence that Yahoo! is evolving into a full-fledged media company, and suggesting that the one-man-band approach to journalism and webcast might become a regular thing. If the Sites show finds an audience, they should try out serious investigative reporting or medium-length documentary.