5:27, Thursday January 8th, 2009 • feeling resigned • no comments
I'm a keen twitterer. When I read my tweets I find that certain voices shout louder than others, where volume = tweet frequency. Those voices aren't necessarily the ones I care about. I want to know what's going on with my more restrained friends too.
I designed Followize to solve this problem. Like Twitter100, it shows the latest tweet from each friend. The UI is more efficient than Twitter100's and I have some enhancements planned that I hope will make Followize a very quick and convenient way to keep up with the people you're following.
Followize uses the Twitter API's friends method. Until yesterday, the documentation for that method said it would return "up to 100 of the authenticating user's friends who have most recently updated", i.e. sorted by the created_at time of each friend's latest status update. Subsequent pages of less-recently-updating friends can be requested as well. Followize is just a nice UI for this data, built on Google App Engine.
However, after building the app and using it for a little while, I noticed that the data was not sorted in this way at all. I raised this as an API issue. One of Twitter's engineers responded that this was a documentation error, rather than a software error, and updated the docs. The correct order is (effectively) the date the user began following a given person. Unfortunately this all but kills my application.
If Twitter is sending the data in the wrong order for my app, I have to load all the data and sort it myself. The first person I followed might be the one who has most recently updated, and thus the last record in the results of the friends method call. Pulling a page of 100 friends from Twitter to App Engine takes around 0.8 seconds; decoding the JSON then takes another 0.15 seconds. Good old Scobleizer follows 21K people, Obama follows 171K! Loading all the required data for Scoble would take 3.3 minutes, plus some time for sorting, committing to cache etc. Twitter rate limits API requests to 100 per 60 minute period. Loading those 21K friends requires 210 API requests, and that's only for one page view. Scoble is likely to reload the page a few minutes later and the whole thing begins again.
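As a rough check on those numbers, here is a minimal Python sketch of the arithmetic, plus the client-side sort the app would need. The timings and limits are the figures quoted above; the function names and the `latest_update` field are made up for illustration and are not part of Twitter's API.

```python
import math

PAGE_SIZE = 100        # friends returned per API page
FETCH_SECONDS = 0.8    # approximate time to pull one page to App Engine
DECODE_SECONDS = 0.15  # approximate time to decode one page of JSON
RATE_LIMIT = 100       # API requests allowed per 60 minute period


def load_cost(friend_count):
    """Return (pages, minutes of fetch + decode, hourly rate-limit windows)."""
    pages = math.ceil(friend_count / PAGE_SIZE)
    minutes = pages * (FETCH_SECONDS + DECODE_SECONDS) / 60
    windows = math.ceil(pages / RATE_LIMIT)
    return pages, minutes, windows


def sort_by_latest_update(friends):
    """Order friend records so the most recently updated friend comes first.

    Each record is assumed to be a dict with a comparable 'latest_update'
    value (a hypothetical field name, not Twitter's schema).
    """
    return sorted(friends, key=lambda f: f["latest_update"], reverse=True)
```

For 21,000 friends, `load_cost` gives 210 requests and about 3.3 minutes of fetching and decoding, and those requests have to be spread over three hourly rate-limit windows.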
I'm looking at using Gnip as a workaround, but this is sub-optimal. A rough strategy would be as follows:
1. A user logs in to Followize for the first time.
2. A background process loads the complete list of their friends from Twitter's API.
3. Followize adds those friends to a Gnip filter of Twitter users followed by Followize users.
4. Gnip POSTs updates for each user to a Followize API endpoint.
5. Followize stores users being followed and their latest update in its DB.
6. When the user requests the page, tweets are loaded from the DB.
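The data flow in that strategy can be sketched in a few lines of Python. This is only an in-memory stand-in under stated assumptions: the class and method names are hypothetical, the real app would use the App Engine datastore, and no actual Gnip or Twitter calls are shown.

```python
class LatestTweetStore:
    """In-memory stand-in for the Followize DB (hypothetical design)."""

    def __init__(self):
        self.friends_of = {}  # Followize user -> set of friend names
        self.latest = {}      # friend name -> (timestamp, text)

    def register_user(self, user, friends):
        """Record a new user's friends; return the names not yet tracked.

        The returned set is what would need adding to the Gnip filter.
        """
        known = set().union(*self.friends_of.values())
        self.friends_of[user] = set(friends)
        return set(friends) - known

    def receive_update(self, friend, timestamp, text):
        """Handle one pushed update, keeping only the newest per friend."""
        current = self.latest.get(friend)
        if current is None or timestamp > current[0]:
            self.latest[friend] = (timestamp, text)

    def page_for(self, user):
        """Return (friend, timestamp, text) rows, most recent first."""
        rows = [
            (friend,) + self.latest[friend]
            for friend in self.friends_of.get(user, ())
            if friend in self.latest
        ]
        return sorted(rows, key=lambda row: row[1], reverse=True)
```

Because only the newest update per friend is stored, a page request becomes a lookup and a sort over data already in the DB, with no Twitter API call on the request path.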
The drawbacks to this approach are:
Step 2 could still fall foul of Twitter's API rate limit, necessitating a 1 hour wait.
The application load doesn't scale with traffic. Scoble could sign up, I'll start getting a tonne of tweets coming in from Gnip, but Scoble may never visit Followize again, rendering that traffic useless. I can pull data up to 60 minutes old from Gnip, so I could minimize the processing overhead by pulling tweets every 60 seconds for example.
All of these API calls would be too long-running for Google App Engine.
The application complexity is dramatically increased and it is now reliant on an additional remote service.
I'd like Twitter to order the data for me, but Twitter's API as it stands can't be modified to do all the heavy lifting for every application. Gnip has an interesting model in that they allow you to offload some work, filtering of data, to them. A model in which I could write my own view of Twitter's data and upload that to be run locally to their DB would be a great solution. Given the wide range of apps using Twitter's API, I'm hopeful.
0:35, Monday December 1st, 2008 • feeling relaxed • no comments
I just finished reading Joe Armstrong's "Programming Erlang: Software for a Concurrent World." The first part was a bit slow for someone familiar with other functional languages. I thought a brief summary of the language might be useful.
What you need to know:
Erlang is a strict functional language whose pattern-matching style will feel familiar to users of ML and related functional languages (like Caml, Miranda and Haskell), but it is not a descendant: its syntax comes from Prolog, and lists are written with commas, for example.
Erlang has immutable state only (there are some mutable state options, you're not supposed to go there).
Erlang uses pattern matching to define functions, unpack data structures, etc. All the clauses of a function have the same arity. A function with the same name and a different arity is a different function. Pattern matching is heavily used and is pretty cool.
Erlang has case statements, exceptions, anonymous functions ("funs"), list comprehensions and pattern guards.
Atoms are just handy values. Like blah. Yes, blah is a valid atom. Erlang doesn't really have strings. On a full moon, when the weather is just right, lists of numbers become strings for a short time, but immediately change back. I've heard that if you fire 2 Erlang number lists around the LHC, they concatenate.
Erlang doesn't have abstract types, just literals, lists, tuples and records (tuples with named members). There is an interesting convention of representing data structures as a tuple tagged with an atom prefix, e.g. {point, 5, 15}. This works nicely with pattern matching.
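For readers who don't know Erlang, here is a rough Python analogue of the tagged-tuple convention. It is not Erlang code: Python has to inspect the tag by hand, where Erlang would match it directly in a function clause. The `("point", x, y)` shape mirrors the example above; the `"circle"` shape and the function name are invented for illustration.

```python
def bounding_box(shape):
    """Return (min_x, min_y, max_x, max_y) for a tagged-tuple shape."""
    tag = shape[0]
    if tag == "point":
        # Analogue of matching {point, X, Y}
        _, x, y = shape
        return (x, y, x, y)
    if tag == "circle":
        # Analogue of matching {circle, X, Y, R} (hypothetical shape)
        _, x, y, r = shape
        return (x - r, y - r, x + r, y + r)
    raise ValueError("unknown shape: %r" % (shape,))
```

The tag in the first slot does the job a type name would do elsewhere, so new shapes can be added without declaring any types.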
Erlang is really sweet for writing concurrent programs. It uses message passing to communicate between processes. With no shared memory and no locks, programs can break out into multiprocessing way more easily than in other languages, particularly imperative ones. Work can be moved to other processes easily, and other processes can be on other nodes easily. What Perl is to text processing, Erlang is to concurrency.
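The message-passing style can be imitated in Python with a thread and two queues. This is only an analogue: Python threads do share memory, so the isolation here is a discipline rather than a guarantee, whereas Erlang processes are isolated by the VM. The function names and message shapes are invented for illustration.

```python
import queue
import threading


def doubler(inbox, outbox):
    """Worker that communicates only through its inbox and outbox."""
    while True:
        message = inbox.get()
        if message == "stop":
            break
        tag, value = message
        if tag == "double":
            outbox.put(("result", value * 2))


def run_demo(values):
    """Send each value to the worker and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=doubler, args=(inbox, outbox))
    worker.start()
    for value in values:
        inbox.put(("double", value))
    inbox.put("stop")
    worker.join()
    return [outbox.get()[1] for _ in values]
```

The sender and the worker never touch each other's variables; everything travels as a tagged message, which is the habit Erlang builds into the language.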
Open Telecom Platform (OTP) is an environment for running server applications, a bit like a J2EE container. It provides a bunch of useful generic code for things like supervising processes so you don't have to. Dynamic code loading is part of the Erlang VM. There's a lot of useful systems code and tools in the Erlang distribution.
Amazon's SimpleDB is written in Erlang (real world example!).
Armstrong's book is a great intro. I picked it up because I had a few conversations about Erlang over the last few months. It's an interesting language. I can't quite remember what, but something made me feel that Erlang was going to be an interesting skill to have in the coming years, that we would be hearing more of it. It may just be a passing fad, but it has delivered some heavyweight systems.
Armstrong designed and implemented the first version of Erlang in 1986.
15:03, Thursday November 27th, 2008 • feeling relaxed • no comments
A sign that the mashup ecosystem is maturing fast. I was happy to see that creating an embeddable map of my BrightKite check-ins was as simple as feeding my BrightKite RSS url (which contains GeoRSS info) into the search box on maps.google.com and then requesting the embed code.
It all works, but the embedded map centres at the mid-point of the set of points, which shows no points at all. I tried to get it to start on London or SF by default, but no dice. Scroll over to see more useful stuff.
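To see why the mid-point is so unhelpful, here is a small Python sketch that averages two check-in locations. It assumes the map centres on a simple mean of the coordinates, which is a guess at Google Maps' behaviour rather than anything documented here; the city coordinates are approximate.

```python
LONDON = (51.51, -0.13)           # approximate (latitude, longitude)
SAN_FRANCISCO = (37.77, -122.42)  # approximate (latitude, longitude)


def naive_centre(points):
    """Average the latitudes and longitudes of a list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)
```

Calling `naive_centre([LONDON, SAN_FRANCISCO])` lands at roughly 44.6°N, 61.3°W, somewhere around Nova Scotia and thousands of kilometres from either city, so a map centred there at a city-level zoom shows none of the check-ins.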
BrightKite RSS urls are of the form http://brightkite.com/people/<username>/objects.rss. Have fun!
21:43, Sunday November 23rd, 2008 • feeling relaxed • no comments
I'm back home in London now. My plan was to blog a bit more regularly throughout my trip to the Bay Area. Instead I will review the whole trip in one post.
In 2½ weeks in the San Francisco Bay Area, I attended 8 events, had meetings with 7 individuals and companies and collected a stack of business cards. I also have quite a few UK contacts to follow up. I found events using Upcoming and the ValleyWag calendar. At these events I met some incredibly smart and dedicated people and exchanged business cards. I followed up afterwards, inviting people to connect on LinkedIn and to meet up again for coffee or similar. My goal was to network with developers, founders and investors to find opportunities to collaborate on projects.
When I initially planned my trip, I looked at some of the things going on in the area. I decided that the O'Reilly Web 2.0 Summit was the best place to meet interesting people from the technology sector, specifically web and specifically from web 2.0 companies. I chose the event because it corresponded so closely with the stuff I'm interested in. I also chose to stay on until the MashupCamp event, for the same reason. Once I had those 2 bookends, I filled out the rest of my time with every other interesting event going. I didn't spend a lot of money on event tickets. I only went to the lobby of the hotel in which the Web 2.0 Summit was held, not the sessions themselves (though they sounded fascinating). All the other events I went to were free or in the range of $30 or less for a ticket. I wasn't able to go to the Under The Radar: Mobility conference, which I would have liked to do. Although I could probably have found my way in, I had met quite a few mobile people already during the trip.
Events are great places to network because of the variety of people present and their readiness to say hi and chat. My goal was a bit vague, so I tried to give a specific story with each new person. It's really important to explain to people what you're looking for. Many are happy to help, but they need to understand your problem first. At the Web 2.0 Summit I was even interviewed because of my slightly different story.
The events I attended ranged from the general like the Web 2.0 Summit, to the more specific like Mobile Tech 4 Social Change, Metaweb's Freebase Hack Day and the OpenSocial birthday party at MySpace's offices in SoMa. For each event, I would take a look at the list of attendees for interesting people and then try to contact them ahead of time, via email or Twitter, to arrange a time to say hi quickly. Twitter is a particularly useful tool for this right now as it is accessible, has high adoption amongst tech people and is close to an instant message. An @reply will often be picked up in real-time. Twittering that you are attending an event invites other people to say hi also. You can also use Twitter Search to follow the tag for an event or to look for other related terms. This will reveal yet more people to speak to.
While events are great places to meet people, they are less good for creating lasting relationships and discussing solid plans. I followed up with a large number of the people I met to create stronger connections, either via LinkedIn or by meeting in real life. Much of the second week of my trip was spent between these meetings. By the time I got on the flight home, I'd definitely created a set of contacts I would love to get an opportunity to talk to again or a reason to work with.
Results
The results of my trip are a much larger network and a better feeling for the kinds of people and companies working in San Francisco. I wasn't really able to get out to Silicon Valley. In many ways this was because there was so much happening in the city and it was easier to attend multiple events without spending hours on the Caltrain.
Some of the contacts led to immediate conversations about work, both contract and full-time. Most did not. However, I've just started work on a contract with Huddle, a company of lovely people that I first met in June and have come to know relatively well due to their excellent DrinkTank event. It's my experience that nothing can be done in a hurry. Whether your goal is to find a job, colleagues for a new project, employees for an existing one or investors to take an idea forward, start early and expect to spend some time on it.
8:02, Tuesday November 4th, 2008 • feeling relaxed • no comments
I went along to the Mobile 2.0 drinks and met a good mix of people. Some locals, which means I now have a list of coffee places in Mission to hit. Plus, it was just over 24 hours after I landed that someone said the magic words: "So are you going to the Digg party tomorrow?" It's a party for the election, which kills two birds with one stone. Ideal!
I also met some more industry people. It was especially good to meet the guys from mobile search engine Taptu. Ironic that we meet here given that they are Cambridge-based. I also met one of Yahoo's mobile team and we exchanged mobile anecdotes.
Tomorrow I'm going to Mobile Tech for Social Change. A BarCamp for mobile with a social twist. It is hosted at the Google.org office in SF (not Mountain View sadly, will have to find another way to get down there).
The focus of events has been mobile so far. I'm experienced in the sector, but not wedded to it. I'm interested to see what's going on, but there is the broader Web 2.0 Summit coming up, with all its associated aspects. I'm looking forward to meeting people there.