Author Archives: ben vershbow

the book is reading you

I just noticed that Google Book Search requires users to be logged in to a Google account to view pages of copyrighted works.
They provide the following explanation:

Why do I have to log in to see certain pages?
Because many of the books in Google Book Search are still under copyright, we limit the amount of a book that a user can see. In order to enforce these limits, we make some pages available only after you log in to an existing Google Account (such as a Gmail account) or create a new one. The aim of Google Book Search is to help you discover books, not read them cover to cover, so you may not be able to see every page you’re interested in.

So they’re tracking how much we’ve looked at and capping our number of page views. Presumably a bone tossed to publishers, who I’m sure will continue suing Google all the same (more on this here). There’s also the possibility that publishers have requested information on who’s looking at their books — geographical breakdowns and stats on click-throughs to retailers and libraries. I doubt, though, that Google would share this sort of user data. Substantial privacy issues aside, that’s valuable information they want to keep for themselves.
That’s because “the aim of Google Book Search” is also to discover who you are. It’s capturing your clickstreams, analyzing what you’ve searched and the terms you’ve used to get there. The book is reading you. Substantial privacy issues aside (it seems more and more that’s where we’ll be leaving them), Google will use this data to refine its search algorithms and, who knows, might even develop some sort of personalized recommendation system similar to Amazon’s — you know, where the computer lists other titles that might interest you based on what you’ve read, bought or browsed in the past (a system that works only if you are logged in). It’s possible Google is thinking of Book Search as the cornerstone of a larger venture that could compete with Amazon.
There are many ways Google could eventually capitalize on its books database — that is, beyond the contextual advertising that is currently its main source of revenue. It might turn the scanned texts into readable editions, hammer out licensing agreements with publishers, and become the world’s biggest ebook store. It could start a print-on-demand service — a Xerox machine on steroids (and the return of Google Print?). It could work out deals with publishers to sell access to complete online editions — a searchable text to go along with the physical book — as Amazon announced it will do with its Upgrade service. Or it could start selling sections of books — individual pages, chapters etc. — as Amazon has also planned to do with its Pages program.
Amazon has long served as a valuable research tool for books in print, so much so that some university library systems are now emulating it. Recent additions to the Search Inside the Book program such as concordances, interlinked citations, and statistically improbable phrases (where distinctive terms in the book act as machine-generated tags) are especially fun to play with. Although first and foremost a retailer, Amazon feels more and more like a search system every day (and its A9 engine, though seemingly always on the back burner, is also developing some interesting features). On the flip side, Google, though a search system, could start feeling more like a retailer. In either case, you’ll have to log in first.

more grist for the “pipes” debate

A couple of interesting items:
Larry Lessig wrote an excellent post last week debunking certain myths circulating in the “to regulate or not to regulate” debate in Washington, namely that introducing “net neutrality” provisions in the new Telecom bill would impose unprecedented “common carriage” regulation on network infrastructure. Of course, the infrastructure was regulated before — when the net was accessed primarily through phone lines. Lessig asks: if an unregulated market is so good for the consumer, then why is broadband service in this country so slow and so expensive?
Also worth noting is a rough sketch from internet entrepreneur Mark Cuban of the idea of “tiered” network service. This would entail prioritizing certain uses of bandwidth. For example, your grandma’s web-delivered medical diagnostics would be prioritized over the teenager downloading music videos next door (if, that is, someone shells out for the priority service). This envisions for the consumer end what cable and telephone execs have dreamed of on the client end — i.e. charging certain web services more for faster page loads and speedier content delivery. Seems to me that either scenario would make the U.S. internet more like the U.S. healthcare system: abysmal except for those with cash.

meta-wikipedia

As a frequent consulter, but not an editor, of Wikipedia, I’ve often wondered about what exactly goes on among the core contributors. A few clues can be found in the revision histories, but on the whole these are hard to read, internal work documents meant more for those actually getting their hands dirty in the business of writing and editing. Like choreographic notation, they may record the steps, but to the untrained reader they give little sense of the look or feeling of the dance.
But dig around elsewhere in Wikipedia’s sprawl, turn over a few rocks, and you will find squirming in the soil a rich ecosystem of communities, organizing committees, and rival factions. Most of these — the more formally organized ones at least — can be found on the “Meta-Wiki,” a site containing information and community plumbing for all Wikimedia Foundation projects, including Wikipedia.
I took a closer look at some of these so-called Metapedians and found them to be a varied, often contentious lot, representing a broad spectrum of philosophies asserting this or that truth about how Wikipedia should evolve, how it should be governed, and how its overall significance ought to be judged. The more prominent schools of thought are even championed by associations, complete with their own page, charter and loyal base of supporters. Although tending toward the tongue-in-cheek, these pages cannot help but convey how seriously the business of building the encyclopedia is taken, with three groups in particular providing, if not evidence of an emergent tri-party system, then at least a decent introduction to Wikipedia’s political culture, and some idea of how different Wikipedians might formulate policies for the writing and editing of articles.
On one extreme is The Association of Deletionist Wikipedians, a cantankerous collective that dreams (with considerable ideological overlap with another group, the Exclusionists) of a “big, strong, garbage-free Wikipedia.” These are the expungers, the pruners, the weeding-outers — doggedly on the lookout for filth, vandalism and general extraneousness. Deletionists favor “clear and relatively rigorous standards for accepting articles to the encyclopedia.” When you come across an article that has been flagged for cleanup or suspected inaccuracies, that may be the work of Deletionists. Some have even pushed for the development of Wiki Law that could provide clearly documented precedents to guide future vetting efforts. In addition, Deletionists see it as their job to “outpace rampant Inclusionism,” a rival school of thought across the metaphorical aisle: The Association of Inclusionist Wikipedians.
This group’s motto is “Salva veritate,” or “with truth preserved,” which in practice means: “change Wikipedia only when no knowledge would be lost as a result.” These are Wikipedia’s libertarians, its big-tenters, its stub-huggers. “Outpace and coordinate against rampant Deletionism” is one of their core directives.

A favorite phrase of inclusionists is “Wiki is not paper.” Because Wikipedia does not have the same space limitations as a paper encyclopedia, there is no need to restrict content in the same way that a Britannica must. It has also been suggested that no performance problems result from having many articles. Inclusionists claim that authors should take a more open-minded look at content criteria. Articles on people, places, and concepts of little note may be perfectly acceptable for Wikipedia in this view. Some inclusionists do not see a problem with including pages which give a factual description of every last person on the planet.

(Even poor old Bob Aspromonte.)
Then along come the Mergist Wikipedians. The moderates, the middle-grounders, the bipartisans. The Mergists regard it as their mission to reconcile the two extremes — to “outpace rampant Inclusionism and Deletionism.” As their eminently sensible charter explains:

The AMW believes that while some information is notable and encyclopedic and therefore has a place on Wikipedia, much of it is not notable enough to warrant its own article and is therefore best merged. In this sense we are similar to Inclusionists, as we believe in the preservation of information and knowledge, but share traits with Deletionists as we disagree with the rampant creation of new articles for topics that could easily be covered elsewhere.

For some, however, there can be no middle ground. One is either a Deletionist or an Inclusionist; it’s as simple as that. These hardliners refer to the Mergists dismissively as “delusionists.”
There are still other, less organized, ideological subdivisions. Immediatists focus on “the immediate value of Wikipedia,” and so are terribly concerned with the quality — today — of its information, the neatness of its appearance, and its general level of professionalism and polish. When a story in the news draws public attention to some embarrassing error — the Seigenthaler episode, for instance — the Immediatists wince and immediately set about correcting it. Eventualism, by contrast, is more concerned with Wikipedia in the long run — its grand destiny — trusting that wrinkles will be ironed out, gaps repaired. All in good time.
How much impact these factions have on the overall growth and governance of Wikipedia is hard to say. But as a description of the major currents of thought that go into the building of this juggernaut, they are quite revealing. It’s nice that people have taken the time to articulate these positions, and that they have done so with humor, lending texture and color to what at first glance might appear to be an undifferentiated mob.

who owns the network?

Susan Crawford recently floated the idea of the internet network (see comments 1 and 2) as a public trust that, like America’s national parks or seashore, requires the protection of the state against the undue influence of private interests.

…it’s fine to build special services and make them available online. But broadband access companies that cover the waterfront (literally — are interfering with our navigation online) should be confronted with the power of the state to protect entry into this self-owned commons, the internet. And the state may not abdicate its duty to take on this battle.

Others argue that a strong government hand will create as many problems as it fixes, and that only true competition between private, municipal and grassroots parties — across not just broadband, but multiple platforms like wireless mesh networks and satellite — can guarantee a free net open to corporations and individuals in equal measure.
Discussing this around the table today, Ray raised the important issue of open content: freely available knowledge resources like textbooks, reference works, scholarly journals, media databases and archives. What are the implications of having these resources reside on a network that increasingly is subject to control by phone and cable companies — companies that would like to transform the net from a many-to-many public square into a few-to-many entertainment distribution system? How open is the content when the network is in danger of becoming distinctly less open?

ESBNs and more thoughts on the end of cyberspace

Anyone who’s ever seen a book has seen ISBNs, or International Standard Book Numbers — that string of ten digits, right above the bar code, that uniquely identifies a given title. Now come ESBNs, or Electronic Standard Book Numbers, which you’d expect would be just like ISBNs, only for electronic books. And you’d be right, but only partly. ESBNs, which just came into existence this year, uniquely identify not only an electronic title, but each individual copy, stream, or download of that title — little tracking devices that publishers can embed in their content. And not just books, but music, video or any other discrete media form — ESBNs are media-agnostic.
“It’s all part of the attempt to impose the restrictions of the physical on the digital, enforcing scarcity where there is none,” David Weinberger rightly observes. On the net, it’s not so much a matter of who has the book, but who is reading the book — who is at the book. It’s not a copy, it’s more like a place. But cyberspace blurs that distinction. As Alex Pang explains, cyberspace is still a place to which we must travel. Going there has become much easier and much faster, but we are still visitors, not natives. We begin and end in the physical world, at a concrete terminal.
When I snap shut my laptop, I disconnect. I am back in the world. And it is that instantaneous moment of travel, that light-speed jump, that has unleashed the reams and decibels of anguished debate over intellectual property in the digital era. A sort of conceptual jetlag. Culture shock. The travel metaphors begin to falter, but the point is that we are talking about things confused during travel from one world to another. Discombobulation.
This jetlag creates a schism in how we treat and consume media. When we’re connected to the net, we’re not concerned with copies we may or may not own. What matters is access to the material. The copy is immaterial. It’s here, there, and everywhere, as the poet said. But when you’re offline, physical possession of copies, digital or otherwise, becomes important again. If you don’t have it in your hand, or a local copy on your desktop then you cannot experience it. It’s as simple as that. ESBNs are a byproduct of this jetlag. They seek to carry the guarantees of the physical world like luggage into the virtual world of cyberspace.
But when that distinction is erased, when connection to the network becomes ubiquitous and constant (as is generally predicted), a pervasive layer over all private and public space, keeping pace with all our movements, then the idea of digital “copies” will be effectively dead. As will the idea of cyberspace. The virtual world and the actual world will be one.
For publishers and IP lawyers, this will simplify matters greatly. Take, for example, webmail. For the past few years, I have relied exclusively on webmail with no local client on my machine. This means that when I’m offline, I have no mail (unless I go to the trouble of making copies of individual messages or printouts). As a consequence, I’ve stopped thinking of my correspondence in terms of copies. I think of it in terms of being there, of being “on my email” — or not. Soon that will be the way I think of most, if not all, digital media — in terms of access and services, not copies.
But in terms of perception, the end of cyberspace is not so simple. When the last actual-to-virtual transport service officially shuts down — when the line between worlds is completely erased — we will still be left, as human beings, with a desire to travel to places beyond our immediate perception. As Sol Gaitan describes it in a brilliant comment to yesterday’s “end of cyberspace” post:

In the West, the desire to blur the line, the need to access the “other side,” took artists to try opium, absinth, kef, and peyote. The symbolists crossed the line and brought back dada, surrealism, and other manifestations of worlds that until then had been held at bay but that were all there. The virtual is part of the actual, “we, or objects acting on our behalf are online all the time.” Never though of that in such terms, but it’s true, and very exciting. It potentially enriches my reality. As with a book, contents become alive through the reader/user, otherwise the book is a dead, or dormant, object. So, my e-mail, the blogs I read, the Web, are online all the time, but it’s through me that they become concrete, a perceived reality. Yes, we read differently because texts grow, move, and evolve, while we are away and “the object” is closed. But, we still need to read them. Esse rerum est percipi.

Just the other night I saw a fantastic performance of Allen Ginsberg’s Howl that took the poem — which I’d always found alluring but ultimately remote on the page — and, through the conjury of five actors, made it concrete, a perceived reality. I dug Ginsberg’s words. I downloaded them, as if across time. I was in cyberspace, but with sweat and pheromones. The Beats, too, sought sublimity — transport to a virtual world. So, too, did the cyberpunks in the net’s early days. So, too, did early Christian monastics, an analogy that Pang draws:

…cyberspace expresses a desire to transcend the world; Web 2.0 is about engaging with it. The early inhabitants of cyberspace were like the early Church monastics, who sought to serve God by going into the desert and escaping the temptations and distractions of the world and the flesh. The vision of Web 2.0, in contrast, is more Franciscan: one of engagement with and improvement of the world, not escape from it.

The end of cyberspace may mean the fusion of real and virtual worlds, another layer of a massively mediated existence. And this raises many questions about what is real and how, or if, that matters. But the end of cyberspace, despite all the sweeping gospel of Web 2.0, continuous computing, urban computing etc., also signals the beginning of something terribly mundane. Networks of fiber and digits are still human networks, prone to corruption and virtue alike. A virtual environment is still a natural environment. The extraordinary, in time, becomes ordinary. And undoubtedly we will still search for lines to cross.

end of cyberspace

The End of Cyberspace is a brand-new blog by Alex Soojung-Kim Pang, former academic editor and print-to-digital overseer at Encyclopedia Britannica, and currently a research director at the Institute for the Future (no relation). Pang has been toying with this idea of the end of cyberspace for several years now, but just last week he set up this blog as “a public research notebook” where he can begin working through things more systematically. To what precise end, I’m not certain.
The end of cyberspace refers to the blurring, or outright erasure, of the line between the virtual and the actual world. With the proliferation of mobile devices that are always online, along with increasingly sophisticated social software and “Web 2.0” applications, we are moving steadily away from a conception of the virtual — of cyberspace — as a place one accesses exclusively through a computer console. Pang explains:

Our experience of interacting with digital information is changing. We’re moving to a world in which we (or objects acting on our behalf) are online all the time, everywhere.
Designers and computer scientists are also trying hard to create a new generation of devices and interfaces that don’t monopolize our attention, but ride on the edges of our awareness. We’ll no longer have to choose between cyberspace and the world; we’ll constantly access the first while being fully part of the second.
Because of this, the idea of cyberspace as separate from the real world will collapse.

If the future of the book, defined broadly, is about the book in the context of the network, then certainly we must examine how the network exists in relation to the world, and on what terms we engage with it. I’m not sure cyberspace has ever really been a home for the book, but it has, in a very short time, totally altered the way we read. Now, gradually, we return to the world. But changed. This could get interesting.

.tv

People have been talking about internet television for a while now. But Google and Yahoo’s unveiling of their new video search and subscription services last week at the Consumer Electronics Show in Las Vegas seemed to make it real.
Sifting through the predictions and prophecies that subsequently poured forth, I stumbled on something sort of interesting — a small concrete discovery that helped put some of this in perspective. Over the weekend, Slate Magazine quietly announced its partnership with “meaningoflife.tv,” a web-based interview series hosted by Robert Wright, author of Nonzero and The Moral Animal, dealing with big questions at the perilous intersection of science and religion.
Launched last fall (presumably in response to the intelligent design fracas), meaningoflife.tv is a web page featuring a playlist of video interviews with an intriguing roster of “cosmic thinkers” — philosophers, scientists and religious types — on such topics as “Direction in evolution,” “Limits in science,” and “The Godhead.”
This is just one of several experiments in which Slate is fiddling with its text-to-media ratio. Today’s Pictures, a collaboration with Magnum Photos, presents a daily gallery of images and audio-photo essays, recalling both the heyday of long-form photojournalism and a possible future of hybrid documentary forms. One problem is that it’s not terribly easy to find these projects on Slate’s site. The Magnum page has an ad tucked discreetly on the sidebar, but meaningoflife.tv seems to have disappeared from the front page after a brief splash this weekend. For a born-digital publication that has always thought of itself in terms of the web, Slate still suffers from a pretty appalling design, with its small headline area capping a more or less undifferentiated stream of headlines and teasers.
Still, I’m intrigued by these collaborations, especially in light of the forecast TV-net convergence. While internet TV seems to promise fragmentation, these projects provide a comforting dose of coherence — a strong editorial hand and a conscious effort to grapple with big ideas and issues, like the reassuringly nutritious programming of PBS or the BBC. It’s interesting to see text-based publications moving now into the realm of television. As TiVo, on-demand, and now the internet atomize TV beyond recognition, perhaps magazines and newspapers will fill part of the void left by channels.
Limited as it may now seem, traditional broadcast TV can provide us with valuable cultural touchstones, common frames of reference that help us speak a common language about our culture. That’s one thing I worry we’ll lose as the net blows broadcast media apart. Then again, even in the age of five gazillion cable channels, we still have our water-cooler shows, our mega-hits, our television “events.” And we’ll probably have them on the internet too, even when “by appointment” television is long gone. We’ll just have more choice regarding where, when and how we get at them. Perhaps the difference is that in an age of fragmentation, we view these touchstone programs with a mildly ironic awareness of their mainstream status, through the multiple lenses of our more idiosyncratic and infinitely gratified niche affiliations. They are islands of commonality in seas of specialization. And maybe that makes them all the more refreshing. Shows like “24,” “American Idol,” or a Ken Burns documentary, or major sporting events like the World Cup or the Olympics that draw us like prairie dogs out of our niches. Coming up for air from deep submersion in our self-tailored, optional worlds.

exploring the book-blog nexus

It appears that Amazon is going to start hosting blogs for authors. Sort of. Amazon Connect, a new free service designed to boost sales and readership, will host what are essentially stripped-down blogs where registered authors can post announcements, news and general musings. Eventually, customers can keep track of individual writers by subscribing to bulletins that collect in an aggregated “plog” stream on their Amazon home page. But comments and RSS feeds — two of the most popular features of blogs — will not be supported. Engagement with readers will be strictly one-way, and connection to the larger blogosphere basically nil. A missed opportunity if you ask me.
Then again, Amazon probably figured it would be a misapplication of resources to establish a whole new province of blogland. This is more like the special events department of a book store — arranging readings, book signings and the like. There has on occasion, however, been some entertaining author-public interaction in Amazon’s reader reviews, most famously Anne Rice’s lashing out at readers for their chilly reception of her novel Blood Canticle (link – scroll down to first review). But evidently Connect blogs are not aimed at sparking this sort of exchange. Genuine literary commotion will have to occur in the nooks and crannies of Amazon’s architecture.
It’s interesting, though, to see this happening just as our own book-blog experiment, Without Gods, is getting underway. Over the past few weeks, Mitchell Stephens has been writing a blog (hosted by the institute) as a way of publicly stoking the fire of his latest book project, a narrative history of atheism to be published next year by Carroll and Graf. While Amazon’s blogs are mainly for PR purposes, our project seeks to foster a more substantive relationship between Mitch and his readers (though, naturally, Mitch and his publisher hope it will have a favorable effect on sales as well). We announced Without Gods a little over two weeks ago and already it has collected well over 100 comments, a high percentage of which are thoughtful and useful.
We are curious to learn how blogging will impact the process of writing the book. By working partially in the open, Mitch in effect raises the stakes of his research — assumptions will be challenged and theses tested. Our hunch isn’t so much that this procedure would be ideal for all books or authors, but that for certain ones it might yield some tangible benefit, whether due to the nature or breadth of their subject, the stage they’re at in their thinking, or simply a desire to try something new.
An example. This past week, Mitch posted a very thinking-out-loud sort of entry on “a positive idea of atheism” in which he wrestles with Nietzsche and the concepts of void and nothingness. This led to a brief exchange in the comment stream where a reader recommended that Mitch investigate the writings of Gora, a self-avowed atheist and figure in the Indian independence movement in the 30s. Apparently, Gora wrote what sounds like a very intriguing memoir of his meeting with Gandhi (whom he greatly admired) and his various struggles with the religious component of the great leader’s philosophy. Mitch had not previously been acquainted with Gora or his writings, but thanks to the blog and the community that has begun to form around it, he now knows to take a look.
What’s more, Mitch is currently traveling in India, so this could not have come at a more appropriate time. It’s possible that the commenter had noted this from a previous post, which may have helped trigger the Gora association in his mind. Regardless, these are the sorts of serendipitous discoveries one craves while writing a book. I’m thrilled to see the blog making connections where none previously existed.

the future of academic publishing, peer review, and tenure requirements

There’s a brilliant guest post today on the Valve by Kathleen Fitzpatrick, English and media studies professor/blogger, presenting “a sketch of the electronic publishing scheme of the future.” Fitzpatrick, who recently launched ElectraPress, “a collaborative, open-access scholarly project intended to facilitate the reimagining of academic discourse in digital environments,” argues convincingly that the embrace of digital forms and web-based methods of discourse is necessary to save scholarly publishing and bring the academy into the contemporary world.
In part, this would involve re-assessing our fetishization of the scholarly monograph as “the gold standard for scholarly production” and the principal ticket of entry for tenure. There is also the matter of re-thinking how scholarly texts are assessed and discussed, both prior to and following publication. Blogs, wikis and other emerging social software point to a potential future where scholarship evolves in a matrix of vigorous collaboration — where peer review is not just a gate-keeping mechanism, but a transparent, unfolding process toward excellence.
There is also the question of academic culture, print snobbism and other entrenched attitudes. The post ends with an impassioned plea to the older generations of scholars, who, since tenured, can advocate change without the risk of being dashed on the rocks, as many younger professors fear.

…until the biases held by many senior faculty about the relative value of electronic and print publication are changed–but moreover, until our institutions come to understand peer-review as part of an ongoing conversation among scholars rather than a convenient means of determining “value” without all that inconvenient reading and discussion–the processes of evaluation for tenure and promotion are doomed to become a monster that eats its young, trapped in an early twentieth century model of scholarly production that simply no longer works.

I’ll stop my summary there since this is something that absolutely merits a careful read. Take a look and join in the discussion.

questions about blog search and time

Does anyone know of a good way to search for old blog entries on the web? I’ve just been looking at some of the available blog search resources and few of them appear to provide any serious advanced search options. The couple of major ones I’ve found that do (after an admittedly cursory look) are Google and IceRocket. Both, however, appear to be broken, at least when it comes to dates. I’ve tried them on three different browsers, on Mac and PC, and in each case the date menus seem to be frozen. It’s very weird. They give you the option of entering a specific time range but won’t accept the actual dates. Maybe I’m just having a bad tech day, but it’s as if there’s some conceptual glitch across the web vis-à-vis blogs and time.
Most blog search engines are geared toward searching the current blogosphere, but there should be a way to research older content. My first thought was that blog search engines crawl RSS feeds, most of which do not transmit the entirety of a blog’s content, just the more recent. That would pose a problem for archival search.
Does anyone know what would be the best way to go about finding, say, old blog entries containing the keywords “new orleans superdome” from late August to late September 2005? Is it best to just stick with general web search and painstakingly comb through for blogs? If we agree that blogs have become an important kind of cultural document, then surely there should be a way to find them more than a month after they’ve been written.
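
For what it's worth, the search itself is not the hard part. Here is a minimal sketch in Python of the kind of query I have in mind, assuming someone has already archived the entries along with their publication dates. The `find_entries` function, the field names, and the sample entries are all made up for illustration; no real search engine's API is being described.

```python
from datetime import date


def find_entries(entries, keywords, start, end):
    """Return entries published within [start, end] that mention every keyword."""
    wanted = [k.lower() for k in keywords]
    matches = []
    for entry in entries:
        # Search title and body together, ignoring case.
        text = (entry["title"] + " " + entry["body"]).lower()
        in_range = start <= entry["published"] <= end
        if in_range and all(k in text for k in wanted):
            matches.append(entry)
    return matches


# Hypothetical archive: three invented entries standing in for crawled posts.
archive = [
    {"title": "Inside the Superdome",
     "body": "Dispatches from New Orleans as the storm passed.",
     "published": date(2005, 9, 1)},
    {"title": "Superdome memories",
     "body": "A New Orleans football season preview.",
     "published": date(2005, 7, 4)},
    {"title": "Blog search woes",
     "body": "The date menus seem to be frozen.",
     "published": date(2005, 9, 10)},
]

hits = find_entries(archive, ["new orleans", "superdome"],
                    date(2005, 8, 25), date(2005, 9, 25))
for entry in hits:
    print(entry["published"], entry["title"])
```

The filter is trivial once the dated entries exist. If my guess about feeds is right, the real gap is the archive: a crawler that only reads RSS never sees the older posts in the first place.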