
thinking about indexing

Once upon a time, a long time ago, I was an editor for Let’s Go, a series of travel guides. While there, I learned a great many things about making books, not all of them useful. One of them: how to make an index. Let’s Go had at that time – maybe it still does, I’m not sure how things are run now – an odd relationship with the publisher of the series, St. Martin’s Press: Let’s Go laid the books out in-house and sent finished files (PostScript in those days) to St. Martin’s, who took care of getting the books printed and in stores. The editors were thus responsible for everything that appeared in the books, from the title page down to the index. Because Let’s Go was staffed by college students, the staff mostly turned over every year; because they were college students, most of them didn’t know how to edit. Consequently, every year, the tasks involved in editing books had to be retaught. And thus it was that one summer I was taught to index a book, and the next summer I found myself teaching others to index books.
As an editor of a book at Let’s Go, you were responsible for creating an index for your book. There’s something to be said for having the person who created the book also control how it’s accessed: presumably, the person who put the book together knows what’s important in it and what readers should find in it. The vast majority of the publishing world works differently: generally once a book has been edited, it’s sent off to professional indexers, who independently create an index for the book. There’s an argument for this: knowing how to create an index is specialized knowledge. It’s information architecture, to use the common phrase. It doesn’t necessarily follow that someone who’s good at editing a book will know how to organize an index that will be useful to readers.
But Let’s Go maintained a child-like faith in the malleability of its editors, and editors were made to index their own books, quality be damned. The books were being edited (and typeset) in a program called Adobe FrameMaker, which is generally used to produce technical manuals; in FrameMaker, if you highlight text and press a certain key command, an index window pops up. The index window adds an entry to the book’s index that points to the page number of the highlighted text, with whatever descriptive text is desired. At the end of every week, editors did something called “generating their book”, which updated all the page numbers, giving a page count for the book in progress, and produced an index, which could be scrutinized. In theory, editors were supposed to add terms to their index as they worked; in practice, most ended up racing to finish their index the week before the book was due to be typeset.
It must be admitted that most of the indices constructed in this way were not very good. A lot of index jokes were attempted, not all successfully. (In an Ireland guide, for example, “trad 72” was immediately followed by “traditional music, see trad”.) Funny phrases were indexed almost as much as useful topics (in the same book, “giant babies 433” is followed by “giant lobster clutching a Guinness 248”). Friends’ names turned up with an unfortunate frequency. One finds that there’s something casual about an index. If we think of a book as a house, the table of contents is the front door, the way a visitor is supposed to enter. The index is the back door, the one used by friends.
Thinking about indices in print books isn’t something that happens as much any more. In an era when less and less profit can be made off printed books, niceties like indices often get lost for cost reasons: they both cost money to make and they take pages to print. More and more indices wind up as online-only supplements. Much of the function of the index seems to have been obviated by full-text searching: rather than taking the index’s word for where a particular name appears in a text, it’s much simpler to press command-F to find it.
But while the terms may have changed, the problem of making easy paths into a text hasn’t gone away. The problem of organizing information quickly comes to light when keeping a blog that isn’t strictly time-based like this one: while we set out a few years back with nicely defined categories for posts, we quickly realized that the categories weren’t enough. Like many people, we moved to tags to attempt to classify what we were talking about; our tags, unpruned, are as messy a thicket as the most unwieldy index.

* * * * *

I came across Helen Mirra’s book Cloud, the, 3 last week at 192 Books in Chelsea. I’d seen & liked some of Mirra’s work in a show at Peter Blum in the spring where she had a piece based on Robert Walser, one of my pet favorite writers. It was a thick book for someone I’d thought of as a visual artist: I picked it up & flipped through it, which turns out to be the best way to approach this book: the viewer is left with the impression of an index that’s been exploded or turned into a flip book, an index spread out to cover a whole book. The pages are almost entirely blank, each with an entry or two.
A note at the back explains what the book is: “The preceding text is an index of John Dewey’s Reconstruction in Philosophy (New York: Beacon, 1920), written by Helen Mirra in 2005/6.” An afterword by Lyn Hejinian goes into more detail, including why Mirra is working from this particular volume of Dewey, in which he attempts to bring philosophy to bear on the problems of the real world. But the central idea is simple enough: Mirra constructed her own index to a book. Dewey’s book (the edition Mirra used can be examined here) already has an index, eight stately pages that move through the terms used in Reconstruction in Philosophy from “Absolute reality, 23, 27” to “World, noumenal and phenomenal, 23”.
There’s some overlap between Dewey’s index – is it really his index, constructed by John Dewey himself? – and Mirra’s index. Dewey’s index, for example, contains “Errors, 34”. Mirra’s version contains “Errors, of our ancestors, 35–36”. Mirra’s working from the same book, but her index finds poetry in Dewey’s prose: “Environment, 10, 14, 19; even a clam modifies the, 84; given, 156.” “Color in contrast with pure light, a, 88.” “Habitually reasonable thoughts, 6.” “Half-concealed and half-apologetic life, 210.” “Sailor compared with the weaver, the, 11.” “True method as comparable to the operations of the bee, 32.” Consulting Dewey’s book at those pages reveals that Mirra’s made up nothing. Her index, however, reveals her own personal reading of the book.
Cloud, the, 3 is an artist’s book, a book that is meant to function as an art object rather than as a conduit of information. In a way, this seems perfectly appropriate: in a world where Dewey’s book is fully searchable online, indexing can seem superfluous, no longer a practical concern. (This hasn’t always been the case: Art & Language, a conceptual collective started in the late 1960s, pursued indexing as a Marxist tactic to bring knowledge to the masses.) One can make the argument that in structure Mirra’s book is not that dissimilar from the unwieldy tag cloud that graces the right side of this blog, the “frightful taxonomic bog” that we periodically fret over & fail to do anything about. But I think the object-status of Mirra’s book enables us to think about its contents in a way that, for example, a tag cloud doesn’t: because it is an object that doesn’t need to exist, we question its existence and wonder why it is accorded financial value. A tag cloud, all too often, is just one more widget. I like Mirra’s book because it didn’t have to exist: the artist had to work to create it.

* * * * *

Most of the publishing industry doesn’t follow Let’s Go’s example: in general, it’s much more hierarchical. Writers write, editors edit, indexers index, and typesetters typeset. Perhaps it’s economically necessary to have everyone specialize in this way; however, there’s an inefficiency built into this system which necessitates that people less familiar with the text are constructing the ways into it. On the Internet, by contrast, we increasingly realize that we are all editors now. We could all be indexers too.

review: the access principle

In his book “The Access Principle: The Case for Open Access to Research and Scholarship,” John Willinsky, from the University of British Columbia, tackles the idea that scholarship needs to be more open and accessible than it currently is. He offers a comprehensive and persuasive argument that covers the ethical, political and economic reasons for making scholarship accessible to both scholars and the public. He lives by his words, as a full text version is now available for download on the MIT Press website. The book is an important resource for anyone who is concerned with scholarly communication. We were also fortunate to have him attend our meeting on the formation of a scholarly press.
Many people have observed that rising journal subscription costs and shrinking library acquisition budgets are quickly reaching the limits of feasibility. Willinsky now provides, in one place, a clear depiction of the status quo and of how it came about. He then takes the argument for open access deeper by widening the discussion to address the developing world and the general public.
Willinsky documents a promising trend: several large institutions, including the NIH, and prestigious journals, such as the New England Journal of Medicine, are making their research available. They use different models for releasing the research. For example, NEJM makes articles accessible six months after their paid publication. To encourage this trend of open access to scholarly work, Willinsky devotes much of “The Access Principle” to documenting the business models of scholarly publishing, and shows in detail the economic feasibility of open access publishing. He clearly maintains that making scholarship accessible does not necessarily mean making it free. Walking through the current economics of academic publishing, Willinsky gives a good overview of the range of publishing models, with their varying degrees of accessibility. He also devotes an entire chapter to an intriguing model of how a journal could be operated by scholars as a cooperative.
Alongside this effort to argue for open access to scholarship, Willinsky also works with a group of developers to create a free and open source publication platform called Open Journal Systems. OJS gives a journal a way to reduce its costs by providing digital tools for editing, management and distribution. It is clear, though, that scholars and publishers still hold on to print as the ideal medium, even as it becomes increasingly economically infeasible to maintain. However, when the breaking point eventually comes, the point at which shrinking library budgets and rising subscription rates become unworkable, viable options will fortunately already exist. A sample list of journals using OJS shows the breadth of subject matter and international use of the tool.
It is the last chapters of the book, “Reading,” “Indexing” and “History,” that leave the biggest impact. In “Reading,” Willinsky explores how the way people read is already being influenced by screen-based text. Initially, digital publishing was relevant to his analysis and proposals because the efficiencies it offers can be used to balance the costs of accessing print publishing. However, in the shift to digital online publishing, he notes that there exists an opportunity to aid the comprehension of readers that is unrelated to the economics and ethics of access.
He uses the example of how students read a primary history text very differently than a historian does. A historian quickly jumps from top to bottom looking for clues concerning geography, the time of the events depicted and the time the document was written, in order to understand the historical context of the document. A student, on the other hand, will typically read a document from start to finish, with less emphasis on building a context for the document.
Scholars’ readings of journal articles have similarities to the way historians read their source documents. Just as there are techniques to help students of history learn how to read, there are also ways to assist the reading of all scholarly work. Most importantly, these techniques can be integrated into the reading environment of the open and online journal. Addressing and utilizing the potential of digital and networked text can, in the end, support Willinsky’s overall arguments. Because Willinsky comes from an education and pedagogy background, it is not surprising that he uses a “scaffolding” approach to support learning and reading. In this context, scaffolding refers to the pedagogical idea that knowledge transfer is increased when readers (or learners) are given tools and resources to support their learning experience with the main text.
Print journal publishing does, of course, already include features to aid the reader. He cites abstracts, footnotes and citations as ubiquitous examples. In the online environment, these tools can be expanded even further. While Willinsky acknowledges that open access will change the readership of scholarly publishing and that the medium must adapt for these new readers, he does not mean to say that the level of writing itself necessarily has to change. Scholars should still write to expand their field.
One very basic feature that is included in Open Journal Systems is the ability to comment. This simple feature has the ability to narrow the gap between author and reader. As far as I can tell, though, it is not often used. Also included are the “Reading Tools,” basic but significant additions to the reading experience, which currently provide supporting information by searching open access databases with author-prescribed key words. Willinsky states that these tools are still undergoing development, which is not surprising because our understanding of the digital networked text is still in the formative stage as well. Because OJS is open source, it allows new feature sets to be added to the system as new forms of reading are understood and can be applied on a large scale. Radical experimentation is not always appropriate. Just getting the journals into an online environment is a significant achievement. It is telling that the default setting for “Reading Tools” is off, although it is being used by some journals.
The chapter “Indexing” flips the analysis to look at how being online and accessible will change the way scholarship is stored, indexed and retrieved on the publisher side. Willinsky notes that in places such as Bangalore, universities cannot even afford the collected abstracts of journals, let alone subscriptions to the actual journals. However, the developing world is starting to benefit from growing open indexes such as PubMed, ERIC, CiteSeer.IST and HighWire.
He goes deeper into the issues of indexing by exploring how the indexing of scholarly literature can be “more comprehensive, integrated and automated” while being open and accessible. Collaborative indexing is one such route to explore, which begins to blur the lines between publisher, author and reader. Willinsky documents how fragmented current indexing services are, which leads to overlap and to confusion over where journals are indexed. He aptly points out that indexing needs to evolve in step with open access because the amount of information to search vastly increases. Information that cannot be located, even if it is openly accessible, has limited social value.
The Access Principle closes with a wonderful look at the historical relationship between scholarship and publishing in the aptly named chapter, “History.” In the early years of the printing press, scholars were often found at the presses themselves, working with printers to produce their work. Once the printing press matured, a disconnect between the scholar and the press developed. Intermediaries emerged who ordered their subscription preferences, and texts were sent off to publishers and editors, as scholars moved further away from the physical press. Today, the shift to the digital has allowed the scholar to redevelop a closer relationship with the entire process of publishing. Blogging, print on demand, wikis, online journals and tagging tools are a few examples of how scholars now attend to “not only fonts and layout, but to the economics of distribution and access.”
It’s important that the book closes here, because it illuminates how publishing technology has always been a disruptive force on the way knowledge is stored and shared. Willinsky’s concern is not only to argue for open access but also to show how closely the digital is tied to that access. Further, there is the opportunity to “improve the quality and value of that access.”
Our work at the institute, including Sophie, MediaCommons, Gamer Theory, and nexttext, points toward the same new directions that Willinsky describes, which, not surprisingly, makes his book particularly relevant to me. However, Willinsky describes something relevant to all scholars as well.