Category Archives: wikia

people-powered search (part 1)

Last week, the London Times reported that the Wikipedia founder, Jimbo Wales, was announcing a new search engine called “Wikiasari.” This search engine would incorporate a new type of social ranking system and would rival Google and Yahoo in potential ad revenue. When the news first got out, the blogosphere went into a frenzy, with many blogs echoing inaccurate information, mostly out of excitement, and causing a lot of confusion. Some sites even printed dubious screenshots of what they thought was the search engine.
Alas, there were no real screenshots and there was no search engine… yet. Yesterday, unable to make any sense of what was going on by reading the blogs, I looked through the developer mailing list and found this post by Jimmy Wales:

The press coverage this weekend has been a comedy of errors. Wikiasari was not and is not the intended name of this project… the London Times picked that off an old wiki page from back in the day when I was working on the old code base and we had a naming contest for it. […] And then TechCrunch ran a screenshot of something completely unrelated, thus unfortunately perhaps leading people to believe that something is already built and about to be unveiled. No, the point of the project is to build something, not to unveil something which has already been built.

And on the Wikia search webpage he explains why:

Search is part of the fundamental infrastructure of the Internet. And, it is currently broken. Why is it broken? It is broken for the same reason that proprietary software is always broken: lack of freedom, lack of community, lack of accountability, lack of transparency. Here, we will change all that.

So there is no Google-killer just yet, but something is brewing.
From the details that we have so far, we know that this new search engine will be funded by Wikia Inc, Wales’ for-profit and ad-driven MediaWiki hosting company. We also know that the search technology will be based on Nutch and Lucene – the same technology that powers Wikipedia’s search. And we also know that the search engine will allow users to directly influence search results.
I found it interesting that on the Wikia “about page”, Wales suggests that he has yet to make up his mind on how things are going to work, so suggestions appear to be welcome.
Also, during the frenzy, I managed to find many interesting technologies that I think might be useful in making a new kind of search engine. Now that a dialog appears to be open and there is good reason to believe a potentially competitive search engine could be built, current experimental technologies might play an important role in the development of Wikia’s search. Some questions that I think might be useful to ponder are:
Can current social bookmarking tools provide a basis for determining “high quality” sites? Will using Wikipedia and its external site citing engine make sense for determining “high quality” links? Will using a Digg-like rating system result in spam-free or simply lowbrow results? Will a search engine dependent on tagging, but with no spider, be useful? But the question I am most interested in is whether large-scale manual indexing could lay the foundation for what could turn into the Semantic Web (Web 3.0)? Or maybe just Web 2.5?
The most obvious and most difficult challenge for Wikia, besides coming up with a good name and solid technology, will be dealing with the sheer size of the internet.
I’ve found that open-source communities are never as large or as strong as they appear. Wikipedia is one of the largest and one of the most successful online collaborative projects, yet just over 500 people make over 50% of all edits and about 1400 make about 75% of all edits. If Wikia’s new search engine does not attract a large group of users to help index the web early on, this project will not survive. A strong online community, possibly of a magnitude we’ve never seen before, might be necessary to ensure that people-powered search is of any use.

on business models in web publishing

Waiting cart stacked with newspaper advertising inserts, via Flickr

Here at the Institute, we’re generally more interested in thinking up new forms of publishing than in figuring out how to monetize them. But one naturally perks up at news of big money being made from stuff given away for free. Doc Searls points to a few items of such news.
First, that latest iteration of the American dream: blogging for big bucks, or, the self-made media mogul. Yes, a few have managed to do it, though I don’t think they should be taken as anything more than the exceptions that prove the rule that most blogs are smaller scale efforts in an ecology of niches, where success is non-monetary and more of the “nanofame” variety that iMomus, David Weinberger and others have talked about (where everyone is famous to fifteen people). But there is that dazzling handful of popular bloggers that rival the mass media outlets, and they’re raking in tens, if not hundreds, of thousands of dollars in ad revenues.
Some sites mentioned in the article:
TechCrunch: “$60,000 in ad revenue every month” (not surprising — its right column devoted to sponsors is one of the widest I’ve seen)
Boing Boing: “on track to gross an estimated $1 million in ad revenue this year”; over a million a year; “on pace to become a multimillion-dollar property.”
Then, somewhat surprisingly, there is The New York Times. Handily avoiding the debacle predicted a year ago by snarky bloggers like myself when the paper decided to relocate its op-ed columnists and other distinctive content behind a pay wall, the Times has pulled in $9 million from nearly 200,000 web-exclusive TimesSelect subscribers, while revenues from another Times-owned site continue to skyrocket. There’s a feeling at the company that they’ve struck a winning formula (for now) and will see how long they can ride it:

When I ask if TimesSelect has been successful enough to suggest that more material be placed behind the wall, Nisenholtz [senior vice president for digital operations] replies, “The strategy isn’t to move more content from the free site to the pay site; we need inventory to sell to advertisers. The strategy is to create a more robust TimesSelect” by using revenue from the service to pay for more unique content. “We think we have the right formula going,” he says. “We don’t want to screw it up.”

***Subsequent thought: I’m not so sure. Initial indicators may be good, but I still think that the pay wall is a ticket to irrelevance for the Times’ columnists. Their readership is large and (for now) devoted enough to maintain the modestly profitable fortress model, but I think we’ll see it wither over time.***
Also, in the Times, there’s this piece about ad-supported wiki hosting sites like Wikia, Wetpaint, PBwiki or the swiftly expanding WikiHow, a Wikipedia-inspired how-to manual written and edited by volunteers. Whether or not the for-profit model is ultimately compatible with the wiki work ethic remains to be seen. If it’s just discreet ads in the margins that support the enterprise, then contributors can still feel to a significant extent that these are communal projects. But encroach further and people might begin to think twice about devoting their time and energy.
***Subsequent thought 2: Jesse made an observation that makes me wonder again whether the Times Company’s present success (in its present framework) may turn out to be short-lived. These wiki hosting networks are essentially outsourcing, or “crowdsourcing” as the latest jargon goes, the work of the hired guides. Time will tell which is ultimately the more sustainable model, and which one will produce the better resource. Given what I’ve seen, I’d place my bets on the wikis. The problem? You, or your community, never completely own your site, so you’re locked into amateur status. With Wikipedia, that’s the point. But can a legacy media company co-opt so many freelancers without pay? These are drastically different models. We’re probably not dealing with an either/or here.***
We’ve frequently been asked about the commercial potential of our projects — how, for instance, something like GAM3R 7H30RY might be made to make money. The answer is we don’t quite know, though it should be said that all of our publishing experiments have led to unexpected, if modest, benefits — bordering on the financial — for their authors. These range from Alex selling some of his paintings to interested commenters at IT IN place, to Mitch gradually building up a devoted readership for his next book at Without Gods while still toiling away at the first chapter, to McKenzie securing a publishing deal with Harvard within days of the Chronicle of Higher Ed. piece profiling GAM3R 7H30RY (GAM3R 7H30RY version 1.2 will be out this spring, in print).
Build up the networks, keep them open and toll-free, and unforeseen opportunities may arise. It’s long worked this way in print culture too. Most authors aren’t making a living off the sale of their books. Writing the book is more often an entree into a larger conversation, from the ivory tower to the talk show circuit, and all points in between. With the web, however, we’re beginning to see the funnel reversed: having the conversation leads to writing the book. At least in some cases. At this point it’s hard to trace what feeds into what. So many overlapping conversations. So many overlapping economies.