Category Archives: xml

atomisation, part two

In the last few weeks a number of people have sent me a link to Michael Wesch’s video meditation on the evolution of media and its likely impact on all aspects of human interaction. One of Wesch’s main points is that the development of XML enables the separation of form from content which in turn is fueling the transition to new modes of communication.

Paradoxically, Wesch’s video works precisely because of the integration of form and content . . . possibly one of the best uses of animated text and moving images in the service of a new kind of expository essay. If you simply read the text in an RSS reader, it wouldn’t have anywhere near the impact it does. Although Wesch’s essay depends on the unity of form and content, he is certainly right about the increasing trend on the web to decontextualize content by making it independent of form. If McLuhan was right that the medium is a crucial part of the message, then when we look at the same content in different forms, are we getting the same message? If not, what does this mean for social discourse going forward?

ITIN place | 2007 redux: design journal, part 3

(read parts 1 & 2)
[3] I’d just begun hard coding navigational elements for the new ITIN archives when I suspected Through the Looking-Glass might be an apt, fun read to offset the growing angst around coding. Maybe something in literature would provide the gestalt I felt was missing from the minutiae of writing lines of functions, booleans, and parameters. It sounds holistic, maybe, but this suspicion plus a Wikipedia entry I’d read on Lewis Carroll convinced me it’d be the perfect read just now. So, when I was walking through Penn Station earlier last month, I found a bookseller in the LIRR station and, all excited, picked up a copy of Alice’s Adventures, with the intention of breezing through it in order to move on to Looking-Glass. It was nice to open ITIN place the next day to find Stormy Blues For Alice In The Looking Glass. Somehow, the two had already met.
Sally: I’ve been trying to figure out some of the back-end stuff for the past few days, namely, how to get your entire archive to link up to something like this. Do you have any programming / web design wizard friends who might be able to offer me some technical advice?
Alex: God know…. I guess we’ll have to build them manually…some 700 links? yipes.
Alex: I mean, god no….LOL
Sally: Hey, I’m working with a programmer now on a script that will allow the archive to thumbnail images from your entries and automatically load them (& URLs to the corresponding entries) into the Flash file. I don’t know PHP, which is likely the language needed to thumbnail your images automatically, so I’m getting help on that. Once that’s in place, we should be able to (a) play further with layout aspects! and (b) the archive should automatically update every time you publish an entry. Getting closer…
Alex: and it will still do that animated scale up and down trick?
Sally: my PHP programmer who would work on the thumbnail-ing flaked out on me, seems programmers can be as flaky as drummers… So, I took it upon myself to teach myself Flash-based blog applications. At its simplest, it requires a little PHP, a little XML and Flash, all in conversation with what you post online.


Ben: As for PHP gurus… We do in fact have someone working with us right now who’s an experienced PHP coder. We’re keeping him pretty busy right now with MediaCommons stuff, but I think he could help you out with this stuff in a few weeks.
Sally: I also imagine there should be more than one way to search / browse the archives. One might be a linear “wall” from month to month that we could click/scroll through, another might be a drop-down menu of months say, to the right of the “wall” of images. Any thoughts on that?
Meanwhile, I’d plotted out on my whiteboard a map of the Flash file. It looked to me as though there were two methods of approach, interface-wise. Either the zoom function would scale up the size of an entire month’s calendar, and a re-center or panning function would allow the user to focus on a particular entry – or – the zoom function would simply scale up one entry at a time on rollOver (the original idea).
I am (still) drawn to the first idea, even though I’ve put it aside, since that would best recreate the sense of approaching a gallery wall, or landing on the (x,y) of Alex’s blog. But caveats abound: if an onPress fires the zoom and re-center, then how do you click the entry’s permalink and/or zoom out? Is this overcomplicating things? Here is an example of an unwieldy new zoom (an attempt to manage dragging and zooming).
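The zoom-and-re-center idea comes down to a little arithmetic. Here is a rough sketch in Python rather than the ActionScript the archive would actually use; the function name, the stage dimensions, and the coordinates are all assumptions made for illustration.

```python
def zoom_and_center(entry_x, entry_y, scale, stage_w, stage_h):
    """Return the (x, y) offset to give the month clip so that the point
    (entry_x, entry_y), measured in the clip's own unscaled coordinates,
    lands in the middle of the stage once the clip is scaled by `scale`."""
    offset_x = stage_w / 2 - entry_x * scale
    offset_y = stage_h / 2 - entry_y * scale
    return offset_x, offset_y


# Hypothetical numbers: an entry at (100, 50) on the month calendar,
# zoomed to 2x on an 800 x 600 stage.
offset = zoom_and_center(100, 50, 2, 800, 600)
```

Zooming back out would mean restoring a scale of 1 and the original offset, which is where the question of what a second click should do comes in.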

Then I started to think about loading in individual blog entries from the XML. I talked to my friend Mike about this for a while and in exchange for some brownies (although really only out of his extreme kindness and generosity) he constructed an XML format, sample.xml, and guided me on a way to load in the HTML of each individual entry into a small clip.
The great thing about using the HTML of each entry in the previous example is that it would allow the archives to build completely dynamically. Any changes Alex made in an archived post would be reflected in real time in the Flash file. Unfortunately, this doesn’t cut down on load time and I can’t coax the videos and animated .gifs to appear (of which there are a considerable number). Here is an example of one entry pulled into the Flash file with HTML. CSS can be incorporated, but it’s obviously slow loading.
Mike brought up something I’d wondered too: are we going to have one XML file for the entire archive? It seems to make more sense for each month to have its own.
So, after a few weeks, I caught up with Future of the Book’s expert developer Eddie Tejeda, and we decided to put an XML document within each month. On an exciting note, Eddie devised a great scheme (script) to take screen shots of all of ITIN place’s entries. He’s working on getting the image size down, so as to minimize loading time.
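To make the one-XML-document-per-month idea concrete, here is a minimal sketch in Python. It is not Mike’s sample.xml format, which isn’t reproduced here: the element names (month, entry, title, thumb), the attribute names, and the example.com URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_month_xml(month, entries):
    """Build one XML document describing a single month of entries.
    All element and attribute names here are made up for illustration."""
    root = ET.Element("month", {"id": month})
    for entry in entries:
        node = ET.SubElement(root, "entry", {"url": entry["url"]})
        ET.SubElement(node, "title").text = entry["title"]
        ET.SubElement(node, "thumb").text = entry["thumb"]
    return ET.tostring(root, encoding="unicode")


def read_month_xml(xml_text):
    """Read the entries back out, as the archive would when loading a month."""
    root = ET.fromstring(xml_text)
    return [
        {
            "url": e.get("url"),
            "title": e.findtext("title"),
            "thumb": e.findtext("thumb"),
        }
        for e in root.findall("entry")
    ]


# Placeholder data, not real entries from the blog.
sample_entries = [
    {"url": "http://example.com/entry-1", "title": "First entry",
     "thumb": "http://example.com/thumbs/entry-1.jpg"},
    {"url": "http://example.com/entry-2", "title": "Second entry",
     "thumb": "http://example.com/thumbs/entry-2.jpg"},
]
month_xml = build_month_xml("2007-01", sample_entries)
```

Keeping each month in its own file means the archive only has to load and parse the month being viewed, rather than some 700 entries at once.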
Eddie’s screen shots would load much faster than pure HTML, but it could possibly cut the dynamism. This would build something like this, only faster:

Most of the hard coding of the archive is done. Design matters remain: at the moment, the entries load in rather like a retro computer solitaire game, and the drop-down menus are disconnected and unskinned. It’s a task to go back and forth between design and development; I’m just cutting my teeth on some of this, and the dryness of programming can dilute creative inspiration (if this is anything to go by). The archive is very close to complete; it will be a thrill to use this gentler beast.


A Nov. 18 post on Adam Green’s Darwinian Web makes the claim that the web will “explode” (does he mean implode?) over the next year. According to Green, RSS feeds will render many websites obsolete:
The explosion I am talking about is the shifting of a website’s content from internal to external. Instead of a website being a “place” where data “is” and other sites “point” to, a website will be a source of data that is in many external databases, including Google. Why “go” to a website when all of its content has already been absorbed and remixed into the collective datastream.
Does anyone agree with Green? Will feeds bring about the restructuring of “the way content is distributed, valued and consumed?” More on this here.

ted nelson & the ideologies of documents

I. Nelson’s criticism

Ted Nelson (introduced last week by Ben) is a lonely revolutionary marching a lonely march, and whenever he’s in the news mockery is heard. Some of this is with good reason: nobody’s willing to dismantle the Internet we have for his improved version of the Internet (which doesn’t quite work yet). You don’t have to poke around too long on his website to find things that reek of crackpottery. But the problems that Nelson has identified in the electronic world are real, even if the solutions he’s proposing prove to be untenable. I’d like to expand on one particular aspect of Nelson’s thought prominent in his latest missive: his ideas about the inherent ideologies of document formats. While this sounds very blue sky, I think his ideas do have some repercussions for what we’re doing at the Institute, and it’s worth investigating them, if not necessarily buying off on Xanadu.

Nelson starts from the position that attempting to simulate paper with computers is a mistaken idea. (He’s not talking about e-ink & the idea of electronic paper, though a related criticism could be made of that: e-ink by itself won’t solve the problem of reading on screens.) This is correct: we could do many more things with virtual space than we can with a static page. Look at this Flash demonstration of Jef Raskin’s proposed zooming interface (previously discussed here), for example. But we don’t usually go that far because we tend to think of electronic space in terms of the technology that preceded it – paper space. This has carried over into the way in which we structure documents for online reading.

There are two major types of electronic documents online. In one, the debt to paper space is explicit: PDFs, one of the major formats currently used for electronic books, are a compressed version of PostScript, a specification designed to tell a printer exactly what should be on a printed page. While a PDF has more functionality than a printed page – you can search it, for example, and if you’re tricky you can embed hyperlinks and tables of contents in it – it’s built on the same paradigm. A PDF is also like a printed page in that it’s a finalized product: while content in a PDF can be written over with annotations, it’s difficult to make substantial changes to it. A PDF is designed to be an electronic reproduction of the printed page. More functionality has been welded on to it by Adobe, who created the format, but it is, at its heart, attempting to maintain fidelity to the printed page.

The other dominant paradigm is that of the markup language. A quick, not too technical introduction: a markup language is a way of encoding instructions for how a text is to be structured and formatted in the text. HTML is a markup language; so is XML. This web page is created in a markup language; if you look at it with the “View Source” option on your browser, you’ll see that it’s a text file divided up by a lot of HTML tags, which are specially designed to format web pages: putting <i> and </i> around a word, for example, makes it italic. XML is a broader concept than HTML: it’s a specification that allows people to create their own tags to do other things: some people are using their own version of XML to represent ebooks.
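For the curious, here is a small sketch of what this looks like to software, using Python’s standard XML parser. The fragment mixes HTML’s italic tag with two tags (para and booktitle) invented for this example; they don’t come from any real ebook format.

```python
import xml.etree.ElementTree as ET

# One familiar tag (<i>) and two invented ones (<para>, <booktitle>).
FRAGMENT = (
    "<para>I am reading <i>slowly</i> through "
    "<booktitle>Alice</booktitle>.</para>"
)

root = ET.fromstring(FRAGMENT)

# The parser treats invented tags exactly like familiar ones.
tag_names = [element.tag for element in root.iter()]

# Stripping the markup leaves the plain text behind.
plain_text = "".join(root.itertext())
```

The parser has no idea what a booktitle is; deciding what each tag means, and how it should look, is left to whoever writes the software that reads the file.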

There’s a lot of excitement about XML – it’s a technology that can be (and is) bent to many different uses. A huge percentage of the system files on your computer, for example, probably use some flavor of XML, even if you’ve never thought of composing an XML document. Nelson’s point, however, is that there’s a central premise to all XML: that all information can be divided up into a logical hierarchy – an outline, if you will. A lot of documents do work this way: a book is divided into chapters; a chapter is divided into paragraphs; paragraphs are divided into words. A newspaper is divided into stories; each story has a headline and body copy; the body copy is divided into paragraphs; a paragraph is divided into sentences; a sentence is divided into words; and words are divided into letters, the atom of the markup universe.
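The outline premise is easy to see in code. In this sketch (Python again, with a made-up book whose tag names are only illustrative), every element sits inside exactly one parent, so the whole document can be printed as an indented outline.

```python
import xml.etree.ElementTree as ET

# A made-up book: chapters contain paragraphs, and nothing straddles two parents.
BOOK = """<book>
  <chapter>
    <paragraph>First paragraph.</paragraph>
    <paragraph>Second paragraph.</paragraph>
  </chapter>
  <chapter>
    <paragraph>Third paragraph.</paragraph>
  </chapter>
</book>"""


def outline(element, depth=0):
    """Return one indented line per element, walking the tree top-down."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


outline_lines = outline(ET.fromstring(BOOK))
```

Anything that can be written this way fits XML comfortably. The question Nelson raises is what happens to material that can’t.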

II. A Victorian example

But while this is the dominant way we arrange information, it isn’t necessarily a natural way to arrange things, Nelson points out, or the only way. It’s one way of many possible ones. Consider this spread of pages:

[Image: a two-page spread from The Tale of King Florus and the Fair Jehane.]

This is a title page from a book printed by William Morris, another self-identified humanist. We mostly think of William Morris (when we’re not confusing him with the talent agency) as a source of wallpaper, but his work as a book designer can’t be overvalued. The book was printed in 1893; it’s entitled The Tale of King Florus and the Fair Jehane. Like all of Morris’s books, it’s sumptuous to the point of being unreadable: Morris was dead set on bringing beauty back into design’s balance of aesthetics & utility, and maybe over-corrected to offset the Victorian fixation on the latter.

I offer this spread of pages as an example because the elements that make up the page don’t break down easily into hierarchical units. Let’s imagine that we wanted to come up with an outline for what’s on these pages – let’s consider how we would structure them if we wanted to represent them in XML. I’m not interested in how we could represent this on the Web or somewhere else – it’s easy enough to do that as an image. I’m more interested in how we would make something like this if we were starting from scratch & wanted to emulate Morris’s type and woodcuts – a more theoretical proposition.

First, we can look at the elements that comprise the page. We can tell each page is individually important. Each page has a text box, with decorative grapevines around the text box; inside the text box, the title gets its own page; on the second page, there’s the title repeated, followed by two body paragraphs, separated by a fleuron. The first paragraph gets an illustrated dropcap. Each word, if you want to go down that far, is composed of letters.

But if you look closer, you’ll find that the elements on the page don’t decompose into categories quite so neatly. If you look at the left-hand page, you can see that the title’s not all there – this is the second title page in the book. The title isn’t part of the page – as would almost certainly be assumed under XML – rather, they’re overlapping units. And the page backgrounds aren’t mirror images of each other: each has been created uniquely. Look at the title at the top of the right-hand page: it’s followed by seven fleurons because it takes seven of them to nicely fill the space. Everything here’s been minutely adjusted by hand. Notice the characters in the title on the right and how they interact with the flourishes around them: the two A’s are different, as are the two F’s, the two N’s, the two R’s, the two E’s. You couldn’t replicate this lettering with a font. You can’t really build a schema to represent what’s on these two pages. A further argument: to make this spread of pages rigorous, as you’d have to to represent it in XML, would be to ruin them aesthetically. The vines are the way they are because the letters are the way they are: they’ve been created together.
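The trouble with overlapping units can be put in a few lines of code. In this sketch each unit is a range of positions in the text; a set of units fits a hierarchy only if every pair is either nested or completely separate. The offsets are hypothetical and not measured from Morris’s pages.

```python
def overlaps(a, b):
    """True if ranges a and b partly overlap: they share some positions,
    but neither one contains the other. Ranges are (start, end) pairs
    with the end position excluded."""
    (a_start, a_end), (b_start, b_end) = a, b
    shares = a_start < b_end and b_start < a_end
    a_in_b = b_start <= a_start and a_end <= b_end
    b_in_a = a_start <= b_start and b_end <= a_end
    return shares and not (a_in_b or b_in_a)


def fits_a_tree(spans):
    """True if every pair of spans is either nested or disjoint,
    which is what a strict hierarchy requires."""
    items = list(spans.values())
    return not any(
        overlaps(items[i], items[j])
        for i in range(len(items))
        for j in range(i + 1, len(items))
    )


# Hypothetical offsets: a title that sits wholly inside the second page...
nested = {"page_1": (0, 100), "page_2": (100, 200), "title": (110, 150)}
# ...and a title that begins on the first page and runs on into the second.
crossing = {"page_1": (0, 100), "page_2": (100, 200), "title": (80, 150)}

nested_ok = fits_a_tree(nested)      # True: pages and title form an outline
crossing_ok = fits_a_tree(crossing)  # False: the title belongs to neither page
```

A title that crosses the page boundary has no single parent, so a hierarchy has to either cut it in two or pretend that one of the pages doesn’t matter.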

The inability of XML to adequately handle what’s shown on these pages isn’t a function of the screen environment. It’s a function of the way we build electronic documents right now. Morris could build pages this way because he didn’t have to answer to the particular constraints we do now.

III. The ideologies of documents

Let’s go back to Ted:

Nearly every form of electronic document- Word, Acrobat, HTML, XML- represents some business or ideological agenda. Many believe Word and Acrobat are out to entrap users; HTML and XML enact a very limited kind of hypertext with great internal complexity. All imitate paper and (internally) hierarchy.

For years, hierarchy simulation and paper simulation have been imposed throughout the computer world and the world of electronic documents. Falsely portrayed as necessitated by “technology,” these are really just the world-view of those who build software. I believe that for representing human documents and thought, which are parallel and interpenetrating– some like to say “intertwingled”– hierarchy and paper simulation are all wrong.

It’s possible to imagine software that would let us follow our fancy and create on the screen pages that look like William Morris’s – a tool that would let a designer make an electronic woodcut with ease. Certainly there are approximations. But the sort of tool I imagine doesn’t exist right now. This is the sort of tool we should have – there’s no reason not to have it already. Ted again:

I propose a different document agenda: I believe we need new electronic documents which are transparent, public, principled, and freed from the traditions of hierarchy and paper. In that case they can be far more powerful, with deep and rich new interconnections and properties- able to quote dynamically from other documents and buckle sideways to other documents, such as comments or successive versions; able to present third-party links; and much more.

Most urgently: if we have different document structures we can build a new copyright realm, where everything can be freely and legally quoted and remixed in any amount without negotiation.

Ben does a fine job of going into the ramifications of Nelson’s ideas about transclusion, which he proposes as a solution. I think it’s an interesting idea which will probably never be implemented on a grand scale because there’s not enough of an impetus to do so. But again: just because Nelson’s work is unpragmatic doesn’t mean that his critique is baseless.

I feel there’s something similar in the grandiosity of Nelson’s ideas and Morris’s beautiful but unreadable pages. William Morris wasn’t just a designer: he saw his program of arts and crafts (of which his books were a part) as a way to emphasize the beauty of individual creation as a course correction to the increasingly mechanized & dehumanized Victorian world. Walter Benjamin declares (in “The Author as Producer”) that there is “a difference between merely supplying a production apparatus and trying to change the production apparatus”. You don’t have to make books exactly like William Morris’s or implement Ted Nelson’s particular production apparatus to have your thinking changed by them. Morris, like Nelson, was trying to change the production apparatus because he saw that another world was possible.

And a postscript: as mentioned around here occasionally, the Institute is in the process of creating new tools for electronic book-making. I’m writing up an introduction to Sophie (which will be posted soon) that does its best to justify the need for something new in an overcrowded world. Nelson’s statement neatly dovetails with my own thinking on why we need something new: so that we have the opportunity to make things in other ways. Sophie won’t be quite as radical as Nelson’s vision, but we will have something out next year. It would be nice if Nelson could do the same.

google blog search – still a long way to go

Google’s new blog search engine reminds me of how far we still have to go with blog search. The engine works much the same way as Google’s general web search – with keywords and page ranking – only here it’s searching RSS feeds. Recent posts with keyword matches fill the column, and a few links to related blogs come up at the top. But there’s the rub. These so-called “related” blogs are only related by direct keyword matches in their title tagline. I just searched “poetry” and came up with only three related blogs. C’mon. A search for “gossip” turns up only one related blog – “Starbucks Gossip”. There has to be some kind of promotion going on here, though their “about” page mentions nothing of the kind.
A good engine would be capable of searching blogs by their subject, their preoccupation, their obsession. Many blogs could be considered “general,” but just as many have a special focus, and readers are often searching with a particular theme in mind. They don’t just want a list of transient posts, but whole sites that might potentially become regular destinations. Many blogs are valuable publications that prove themselves day after day. But blog search hasn’t yet grown beyond the trendy “what’s the latest chatter on the blogosphere” mode.
I do have to give credit to Technorati. Glitchy as it is, it’s trying to think of creative ways – tagging, author-determined keywords – to help readers find interesting blogs and authors find their audience. Then again, my greatest finds have usually been from other blogs. Humans will always be the smartest aggregators.
People out there, what do you use?