Building – Part 2: Serving Telemetry

I’ve just finished working on McLaren’s new F1 site for the 2010 season, at Pirata London, for Work Club.

I’ll be writing up what we’ve done here in several parts. Sign up for my RSS feed to keep updated.

Part two covers the telemetry panel, known as “The Race 1.0b”. Technically, I think this is the most interesting section of the site.

Choosing a solution

Before I started on this job, I spent a week scouting around for technologies and platforms to support what we needed.

We needed to distribute our data feed, on a second-by-second basis, to thousands of viewers. Although we weren’t sure exactly how many visitors we’d need to support, there was a chance we’d get a mention on television, so we had to expect visitors in the tens of thousands. Obviously we’d need to keep the latency down too – there’s no point broadcasting stats that appear with a significant delay over the television signal. This would very much be a companion to the television programme.

The most sensible way to receive a live feed is using Comet. This is similar to the familiar Ajax, with one big difference. When you request a data packet with Ajax, it will return as soon as it can. It might say “here’s data”, or it might say “I have no data yet”. The second of these cases is useless – you’d have to go and ask again. The process of continually requesting data on a timed basis is known as polling. It puts a big load on the server, which needs to waste execution time dealing with “I have no data” requests.

Comet is different, because it never receives an “I have no data” response. It doesn’t get a response at all in that case. It just sits and waits, and only gets a response when data comes through. This action is called a “push”, because the server triggers an action at the client, which is an unusual operation. This is Comet’s key difference.

Comet has two flavours. If the response terminates the connection when data comes through, the client needs to request the next packet straight away. This is known as “long-polling” Comet. It has the familiar polling action we had with Ajax, but far lower latency, because each packet comes through as soon as it’s available – there’s no polling-interval delay. The other flavour of Comet is “streaming”, which drip-feeds the data down a single held-open connection. The new HTML5 WebSockets would probably fall under this heading, though I believe they run directly over TCP instead of dealing with HTTP. I would love to have streamed the data, but my reading online showed I would hit numerous problems with proxies and hubs that buffer responses until they complete before passing on results. Essentially, streaming is not thought to be production-ready yet.
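To make the long-polling mechanics concrete, here’s a toy hub in Python – nothing to do with the McLaren code or nginx, purely an illustration. The subscriber’s GET hangs until a publisher POSTs, at which point the response comes back immediately; the paths and the message are made up for the demo.

```python
# Toy long-polling hub: GET /feed/subscribe hangs until someone POSTs
# to /feed/publish. Illustrative only -- paths and data are invented.
import queue
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

channel = queue.Queue()  # one shared channel, like our single data feed

class Hub(BaseHTTPRequestHandler):
    def do_GET(self):
        # Subscriber: block until a message is pushed -- the "long poll".
        msg = channel.get()
        self.send_response(200)
        self.send_header("Content-Length", str(len(msg)))
        self.end_headers()
        self.wfile.write(msg)

    def do_POST(self):
        # Publisher: read the body and hand it to a waiting subscriber.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        channel.put(body)
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Hub)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# The subscriber connects first and simply waits...
result = {}
def subscribe():
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/feed/subscribe") as r:
        result["data"] = r.read()

t = threading.Thread(target=subscribe)
t.start()

# ...until a packet is published, at which point it returns at once.
urllib.request.urlopen(
    f"http://127.0.0.1:{port}/feed/publish", data=b"lap 12: HAM P1")
t.join()
print(result["data"])  # b'lap 12: HAM P1'
server.shutdown()
```

Note there’s no timer anywhere: the latency between the POST arriving and the GET returning is effectively zero, which is the whole point.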

I chose long-polling Comet for McLaren, and that simply left the question of implementation.

Long-polling Comet: software

If you’re dealing with a server under high load, you’re bombarded with thousands of requests a second. It was clear that I’d need the lightest possible solution to the problem: a web server which can handle the C10k problem, and hopefully do significantly better than that. Apache uses one thread per request, so in anyone’s book it won’t be able to handle this kind of load.

The nginx (engine-x) webserver is an ideal choice for performance. Webfaction have some oft-quoted stats on its performance, and it can even show an improvement when proxying to Apache. Last year I’d read on Simon Willison’s blog that nginx has a push module.

The nginx http push module (nhpm) is a publisher/subscriber module for nginx. It serves as a hub which can receive POST requests and then distribute them to a large number of subscribers waiting on GET requests. Naturally, the POSTs can be filtered by IP. The canonical demo for any Comet system is a chatroom, which is duly presented on the module’s site. What I needed was something far more basic, because everyone receives the same data, but it would do.

Another option was to use node.js, which has apparently now been used to support a million Comet users for Plurk. Despite it performing reasonably well in tests, I decided against node because I didn’t have the time to build and test a webserver myself before launch. I tend to think there’s more at work in a robust webserver than first appears. Nginx is mature, and the push module is several iterations in.

Let’s just go ahead and install nginx on our Linux variant. This will install on a Mac natively, but if you want to do the later performance-testing chapters, I recommend getting another Linux box going. You can set one up cheaply with Amazon, or with Rackspacecloud if you find all that certificate stuff confusing.

Installing nginx with the push module:
If you’re not connecting as root, switch to a root shell
[sourcecode language=”bash”]
sudo bash
[/sourcecode]

Download and untar nginx and the nginx push module
[sourcecode language=”bash”]
curl -O http://nginx.org/download/nginx-0.7.65.tar.gz
curl -O http://pushmodule.slact.net/downloads/nginx_http_push_module-0.692.tar.gz
tar -xzvf nginx-0.7.65.tar.gz
tar -xzvf nginx_http_push_module-0.692.tar.gz
[/sourcecode]

nginx requires PCRE (the regular expressions library) and OpenSSL
[sourcecode language=”bash”]
curl -L -O http://downloads.sourceforge.net/pcre/pcre-8.01.tar.gz
tar -xzf pcre-8.01.tar.gz
apt-get install libssl-dev #ubuntu
yum install openssl-devel #redhat/fedora
[/sourcecode]

Now we compile and install nginx
[sourcecode language=”bash”]
cd nginx-0.7.65
./configure --add-module=../nginx_http_push_module-0.692 --with-http_flv_module --user=apache --group=apache --with-http_gzip_static_module --with-pcre=../pcre-8.01
make && make install
[/sourcecode]

Now that nginx is installed, we can pop off and edit the nginx.conf file.
[sourcecode language=”bash”]
vi /usr/local/nginx/conf/nginx.conf
[/sourcecode]

The nginx config file is a thing of beauty. If you’re used to Apache configs, this will be like upgrading from PC to Mac. Difficult to switch but well worth the effort.

For now, just grab my simple version of the nginx conf.
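If you want to see the part that matters inline, the two push locations look roughly like the sketch below. The directive names come from the nhpm documentation; the channel id (“f1”) and the allow/deny rules are illustrative choices of mine, so treat it as a starting point rather than a drop-in config.

```nginx
# Sketch only: directive names per the nginx_http_push_module docs;
# channel id and IP restrictions are illustrative, not from the live site.
server {
    listen 80;

    # The feed source POSTs packets here
    location /feed/publish {
        set $push_channel_id "f1";   # everyone shares one channel
        push_publisher;
        allow 127.0.0.1;             # filter publishers by IP
        deny all;
    }

    # Browsers long-poll here
    location /feed/subscribe {
        set $push_channel_id "f1";
        push_subscriber;
        push_subscriber_concurrency broadcast;  # every subscriber gets each message
    }
}
```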

If you haven’t got a suitable user already, let’s create one, somewhat confusingly named “apache” (matching the ./configure flags above), though you can choose your own name if you want.
[sourcecode language=”bash”]
groupadd apache
useradd -c "Apache Server" -d /dev/null -g apache -s /bin/false apache
[/sourcecode]

And start the server:
[sourcecode language=”bash”]
/usr/local/nginx/sbin/nginx
[/sourcecode]
(you can stop it later with):
[sourcecode language=”bash”]
/usr/local/nginx/sbin/nginx -s stop
[/sourcecode]
Remember it’s listening on port 80, so make sure it’s not conflicting with Skype or Apache.

Nginx is now listening on port 80.
If you send a POST to /feed/publish, it will be broadcast out to everyone hanging on /feed/subscribe.
Let’s test it in a browser.
Open http://localhost/feed/subscribe
The request will hang and wait.

Now create a small HTML page that POSTs to /feed/publish. Something as simple as this form will do:
[sourcecode language=”html”]
<!DOCTYPE html>
<html>
<body>
<!-- Minimal publisher page: whatever you type is POSTed to the channel -->
<form action="http://localhost/feed/publish" method="post">
<input type="text" name="message" />
<input type="submit" value="Publish" />
</form>
</body>
</html>
[/sourcecode]