Ben Godfrey

Archive for November, 2005

Web 2.0: emergent information organisation

I think I may have had a Web 2.0 epiphany. If you haven’t come across the label yet, Web 2.0 is one of those terms without a meaning that has been floating around the heads of a lot of clever and creative people without anyone really explaining it. The best explanation I’d seen until about half an hour ago was the Web 2.0 checklist, which does cover almost everything you need.

However, I think Web 2.0 is the signifier for the increasing tendency towards bottom-up or emergent methods of organising and retrieving information. Emergent behaviour is the principle of intelligent behaviour arising from many simple parts. How this relates to the web is easy to see. On Flickr (or Moblog), I tag my photos, you tag your photos. Nothing special there, but add a page which shows all photos for a certain tag and suddenly you have a new resource: a new collection emerges “by itself”, correctly selecting the relevant shards of our independent collections. We don’t plan any one collection, but they happen as a function of the volume of uploading and tagging.
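To make the tag-page idea concrete, here is a minimal sketch in Python. The photo records, owners and tag names are made up for illustration, and this is not how Flickr or Moblog actually store anything; it only shows how a per-tag collection falls out of independent tagging.

```python
from collections import defaultdict

# Hypothetical sample data: each user tags their own photos independently.
photos = [
    {"owner": "ben", "title": "Brighton pier", "tags": {"brighton", "sea"}},
    {"owner": "you", "title": "Pebble beach", "tags": {"sea", "holiday"}},
    {"owner": "ben", "title": "Office desk", "tags": {"work"}},
]


def build_tag_index(photos):
    """Map each tag to the titles of every photo carrying it, across all owners."""
    index = defaultdict(list)
    for photo in photos:
        for tag in photo["tags"]:
            index[tag].append(photo["title"])
    return index


tag_index = build_tag_index(photos)

# Nobody planned a "sea" collection, but one now exists.
print(sorted(tag_index["sea"]))  # ['Brighton pier', 'Pebble beach']
```

Nobody curates the index. It is rebuilt from whatever tags people happen to apply, which is the whole point.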

So tags, though disarmingly simple, can cut wide seams into a body of knowledge, powering later information retrieval. The work of bringing the knowledge together is relatively straightforward. Sufficiently straightforward that Technorati can do it for 21.9m blogs across the web.

Services like Technorati that cut horizontal lines through many vertical information sources highlight another aspect of Web 2.0: XML access to databases. RSS, Atom, OPML, RDF, REST are all acronyms that signify relatively bare bones availability of data. These data are open for easy recombination: web mash-ups. I take your data and combine it with someone else’s and create something new. The best mash-ups are impressive demos of the Web 2.0 concept. Take chicagocrime.org, created by Django’s lead developer, Adrian Holovaty. Chicagocrime pulls a feed of crime information direct from the Chicago Police Department and presents that information on a map powered by Google’s maps API. The result is a really great way to work out which neighbourhoods of Chicago you shouldn’t move to.

I would like to be able to take my business ideas and make them part of Web 2.0. Cohack provides reporting, so it can benefit hugely from Web 2.0, because more and more information is there to be sucked into our engine. Want a daily report mapping the average temperature in 25 cities compared against the smog index? That would take me no time to build. Before mash-ups started happening, the first RSS applications were news readers, kind of auto-browsers. Then podcasts became popular because RSS+MP3/MPEG decouples download from use and allows content to be delivered and consumed in a really convenient way. But things really get interesting when you start to recombine and manipulate the data as part of the daily feedsuck. Our AdKnowledge application (and many others) could give you a dashboard built from unifying a thousand separate sources automatically. AdKnowledge gives it to you in Excel, not very Web 2.0, but perfect for business people to work with right now. Excel is to humans as XML is to computers, a good starting point for processing operations, so even the delivery isn’t the end of the process of manipulation.
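As a rough illustration of the temperature-versus-smog report, here is a small Python sketch. The two dictionaries stand in for feeds that would really be fetched and parsed from somewhere; the city names, readings and column names are all invented, and this is not Cohack's or AdKnowledge's code. It joins the two sources on city and writes CSV, which Excel opens directly.

```python
import csv
import io

# Hypothetical readings, standing in for two independently published feeds.
temperature_feed = {"London": 11.5, "Chicago": 4.0, "Tokyo": 15.2}
smog_feed = {"London": 42, "Tokyo": 57, "Delhi": 130}


def combine_feeds(temperatures, smog):
    """Join the two sources on city name, keeping only cities present in both."""
    shared = sorted(set(temperatures) & set(smog))
    return [(city, temperatures[city], smog[city]) for city in shared]


def to_csv(rows):
    """Render the joined rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["city", "avg_temp_c", "smog_index"])
    writer.writerows(rows)
    return buffer.getvalue()


report_rows = combine_feeds(temperature_feed, smog_feed)
print(to_csv(report_rows))
```

Only cities that appear in both sources make it into the report, so a feed that drops a city just shortens the output rather than breaking it.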

When any data can be acquired for processing just by knowing the right url, original sources of data become more valuable. Moblog is the conduit for mobile content. Yours, mine, anyone’s can be addressed easily, if you have the right permissions. There are many interesting applications for this data. Some of the images we get from Japanese phones are already tagged with lat/long data. If that were more prevalent, the next big mash up could be moblogged images of a major terrorist attack placed on a Google map of the city in real-time. Like Greg Robinson’s excellent Mapr application, but with a cloud of camera phone owners creating a sort of emergent super-photographer and with no admin intervention required to get the data straight on to the web. If one shot doesn’t show you what you want to see, the application has one from round the next corner, one from the third floor of the next building, and so on.

I’m writing the business plan for Moblog at the moment and I have to say it’s one of the most frustrating experiences I’ve had. On the one hand I’m stuck in this room desperate to kickstart something which I think could be really big. On the other, the first step is a process which I can’t really do. I can get excited and paint a picture of an amazing future of dynamically recombined data from many sources in a blog post, but when it comes to showing an investor how that’s going to return 10x his or her investment in year 5 and conveying the excitement of the opportunity at the same time, I just don’t have it. I can draw the outline and I’m savvy enough now to know that ideas and products are very different things, but the level of detail required is something I’m finding very taxing. I need to make some better contacts I think.

Required Google Analytics post

Wow, Google Analytics has really landed. It seems every blog I’ve visited this week has a GA post at the top, and I’ve visited a lot of diverse blogs in the last week. Is this web analytics coming of age? It blows every other free solution out of the water completely, that’s for sure, so a lot more people are going to start tracking a lot more stuff. Damn, maybe even I’ll use it :-).

The business of blogging

Doing some research whilst writing the business plan for my latest venture, I dug up a few bits about the business models in blogging:

And an interesting aside: Murdoch’s internet strategy is to capture kids. We hope to explore whether moblogging will take off in the yoof market. Must think of a good model for my plan.

Thoughts on Django

Had a bit of a play with Django today, just messing around really. It’s really good in some ways and a bit of a let down in others. For example the model and generic views are great, but I still have to write templates. Sometimes I even have to write form tags! Forms can be easily derived from the model. Django doesn’t generate forms for you because it lets you control the presentation. Big whoop. I much prefer InfoCMS’s method: give you a default presentation which is fairly simple and semantic HTML (i.e. divs with sensible class names) and then let you create XSLT templates to modify form presentation site-wide. This can be sticky if each form is different, but how often does that happen? Maybe you have a big form and a condensed sidebar form and a special case or two, but that should be OK if you write modular XSLT.

I might have to find a way to make the form generate-validate loop a bit more straightforward. The nice thing about Django is that the API is just below the surface, so it would be fairly easy to build this kind of functionality (I think). I’ll probably never get rid of the templates though. This is mad, Django has this great model for plugging apps into sites. Make a new app, say a blog or a forum and you can have urls branching off the app base URI. Whether my blog is at /blog/ or /journal/ doesn’t matter, the url patterns after that are the same either way. So someone else writes a forum module and a third person writes a simple wiki, I want these and my blog on one site that has a consistent look and feel. But the other guys have used different form HTML from me. I have to go through each of their templates and overwrite them! Great when the next update comes out and there’s a tonne of new fields etc.

Nah that sucks. Django does have template inheritance, so all that’s needed possibly is some agreed best practice for template layout. Always have a base template, a form template, etc, every app author uses these, knowing that the site admin can create their own localised versions with whatever HTML they like.
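The override convention could look something like the sketch below. It is plain Python, not Django's template loader, and the app names, template names and placeholder contents are all hypothetical. The only idea it shows is the lookup order: the site's localised version of an agreed template name wins, and the app's own copy is the fallback.

```python
# Hypothetical template stores: what each app ships by default...
APP_DEFAULTS = {
    "blog": {"base.html": "<blog base>", "form.html": "<blog form>"},
    "forum": {"base.html": "<forum base>", "form.html": "<forum form>"},
    "wiki": {"base.html": "<wiki base>", "page.html": "<wiki page>"},
}

# ...and the localised versions the site admin has written.
SITE_OVERRIDES = {"base.html": "<site base>", "form.html": "<site form>"}


def find_template(app, name, site_overrides=SITE_OVERRIDES, app_defaults=APP_DEFAULTS):
    """Return the site's version of a template name if one exists, else the app's own."""
    if name in site_overrides:
        return site_overrides[name]
    return app_defaults[app][name]


# Every app's forms pick up the site-wide look...
print(find_template("forum", "form.html"))
# ...while templates the admin hasn't touched still come from the app.
print(find_template("wiki", "page.html"))
```

With that in place, an app update that adds fields only has to change the app's own templates, and the site's look and feel stays in one place.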

Web Frameworks Night and the attack of the alpha geeks

Went to the Web Frameworks Night tonight to learn a bit more about some of the most interesting framework projects. As creator and sole hacker on InfoCMS and as somebody who realises how silly it is to reinvent the wheel, I thought I owed it to myself (and my clients) to show up and learn some stuff.

The three frameworks presented were Catalyst, Django and the now quite famous Ruby On Rails. Each was interesting. I arrived late for the Catalyst talk and was least interested in it due to its Perliness. Django was what I really wanted to learn about, but it was also interesting to listen to Matt Biddulph discussing Rails development. We also got to see inside the in-development BBC programme catalogue app, which knows all about pretty much every show the BBC has done since 1936. That alone was worth the bike-ride.

As I said, it was Django that I wanted to know more about. Django is Python, full stack, has a nice OR-mapper and lots of bits and pieces that I have or have tried to build into InfoCMS. It doesn’t have Rails or Zope 3’s test-driven development, which is a big blow. It doesn’t have XSLT as an output layer or much in the way of TTW tools for building sites as far as I could see. Test-driven development is something which I’ve totally failed to bring into InfoCMS, but the other two are core goals. However, though Django may lack some core things I want from InfoCMS, it does seem to have a very similar set of goals. Minimal code, maximal reuse. High level of component sharing and nice things like automatically generated admin interfaces, Ajax, etc.

They’ve all got bloody template languages though! Gah.

Anyway, I’ve been thinking I’d like to rebuild InfoCMS from the ground up in Python for a while. It’s a pipe-dream because I’ll never get the time, what with running two start-ups and that. However, I could steal a third or half of the code I need from Django :-). Both the website and Simon stressed its modular nature. Don’t like the view layer, throw it out. Simon also said that they’re not afraid of breaking backwards compatibility and that as they’ve only been open source since July, now is a good time to get involved and potentially shape the project dramatically. I might just do that.

The after party pub mission was also really interesting. I got talking to this guy, an alumnus of Media Lab, Berkeley and Ludicorp! Every cool idea I threw at him he bounced back without even trying. He referred to Cal and Tim by first name. He’s consulting at Sony on generative stuff and toys for Playstation etc. I talked about Generator X and the generative art scene, he knows Casey Reas. He mentioned mixing and mash-up tools on Playstation, I countered with Ableton Live, Sony are working with them. I mentioned an interesting new book, Rules of Play, it was written by his old business partner. It was a funny conversation. Mainly because he was a really nice and down to earth guy, at the same time as being a name geek, just back from OSCON. It was pretty exciting to be honest.

The $100 laptop

Nicholas Negroponte and Kofi Annan have just formally launched the $100 laptop project at WSIS. This is a really amazing project initiated by the One Laptop Per Child organisation chaired by Negroponte with research being performed at the MIT Media Lab which he also heads.

Listening to the webcast it’s evident that they’ve set themselves such a huge list of things to have, it’s just incredible that it’s so cheap. Governments will have to buy at least 1m units, but for that they get a machine which has a dual-mode sunlight-readable display, wifi on such low power that it can still be used to provide a mesh node when the machine is turned off, a power budget that can be provided by a crank, and it comes in multiple colours (probably).

CTO Mary Lou Jepsen said “We basically reinvented the laptop” and it really sounds like it.

Behaving as an ebook is really important because they’re selling it to governments as a “trojan horse.” Buy this, which will last five years, instead of five years’ worth of books at $20 a year. Selling through book channels means that it must at least replicate all that a book can do. The dual-mode display runs at 150dpi in black and white mode and Negroponte says one minute of cranking should give over thirty minutes of reading.

The device will use open source software, probably Linux, probably provided by Red Hat. OLPC are promising that it will support every single language, even small ones. This is important too as it allows language networks to grow. The internet has a huge English language network, which means the language is very strong. Unless they eventually move on to digital media, languages risk death.

Thailand and Brazil are the most eager countries to adopt at the moment. The plan is to launch in six large countries first: two in Asia, two in Africa, one in the Middle East, one in South America. Smaller countries are hard because of sales force issues! Negroponte thinks that perhaps the UNDP could help with smaller countries later. Asked why they don’t try to do this through industry, given the great difficulty of dealing with governments, Negroponte argues that education is a public good. He needs to take the hard road because they can’t be seen to sanction governments not being the providers of education.

Could industry compete, creating an even cheaper laptop in the long run? Alan Kay and Nick Negroponte would love to see it!

Alan Kay says it’s a “Platform for content,” and warns that we’re focussing on the machine when the difficult problem is really the support structures around the machine, training teachers, delivering content and so on. It’s good to hear him say this. Listening to the entire webcast in fact is reassuring: this is being run by a group of people who have done not-for-profit projects before, know the pitfalls and are bursting with ideas to get things moving forwards.

Negroponte would like to see a time when American and European kids sponsor kids in poor countries by buying them a unit and being paired up. I’d love to do this, send or perhaps even take a crate full of them to a group of kids. It’s a great project and will surely have many positive and subtle benefits far beyond what people have envisaged so far. Interesting evidence of this comes from Negroponte’s assertion that this is a platform to “learn learning”, not just one topic, but the skills to educate themselves in later life.

DE9 | Transitions

Just got my copy of Richie Hawtin’s DE9 | Transitions mix DVD. It’s cool. It’s a 96 minute mix, done in Dolby 5.1, but there’s also an MP3 version for plain old stereo people, like me, and for putting on iPods and such, and also a CD with a cut down 74 min version. “However you want to hear my mix, take your pick.”

There’s also a bonus live mix and a short interview with the Man of Plastik himself. He talks about his interest in the future of DJing. Describing how once software takes care of beat matching the DJ’s role is to craft great transitions between tracks. This is really interesting to me. During my set at Norm and Ruth’s party on Saturday, I didn’t really build with the tracks I was playing. I did fairly basic chop-the-crossfader style mixing, unlike say what Liam Howlett achieves in his Dirtchamber mix or Soulwax or the turntablists and hip-hop producers like DJ Shadow do all the time.

Hawtin also talks a bit about how he put the mix together, creating it in Live and then mastering for 5.1 in ProTools. You get to see his studio and his Live session file, it’s huge. After reading yesterday about Liam Howlett’s use of a Roland W30, it’s interesting to compare the different approaches of dance musicians and think about how I can repeat their practices.

Keys

BTW, I got the keyboard for my birthday in the end. I also got Resident Evil 4. So far the former has received about half an hour of my attention and the latter about 11 hours. I’m warming up to it :-).

Diggin’ in the crates

Spent a bit of time re-organising my Documents folder. This folder has been growing new content since 1995. So sorting it out was much like going into the shed you haven’t opened the door of for a year or two. Ultimate spring cleaning.

I found a whole bunch of HTML stuff I did whilst at college and uni, not just web dev, but writing, graphics, everything. Some of the UEA crew may remember Marionville :-). In many ways this site today is just the latest part of an evolving low level creativity that I’ve been indulging since I wasn’t a kid anymore. It’s weird looking back at the old stuff and seeing clear threads that lead into who I am today. Embarrassing as it is, I would like to put some of it live. I’ve been thinking this site needs a redesign as it goes, but I’ve just been soo busy with Cohack, Moblog, freelance, Resident Evil, Live-rockin’ and everything else I’ve kind of run out of time to just piss about. Which is a real shame.

Early new year’s resolution: get back on the random acts of creativity tip.

Plus, I’ve been blogging for 4.5 years! If only I blogged a bit more, that might be interesting :-).