I’ll be writing up what we’ve done here in several parts. Sign up for my RSS feed to keep updated.
Part four covers load testing the telemetry data broadcast by the nginx HTTP push module.
Using Apache Bench (ab)
Apache Bench (ab) is our preferred load-testing tool.
Load testing is a stressful exercise for any server. I strongly recommend creating a completely new Amazon EC2 server to run load tests from.
Do NOT run ab from your local machine. I know you're trying to save effort, but it's not about power: both PCs and Macs are desktop machines, and will just throw errors at pathetically small load, even when testing against themselves!
ab may not be installed by default. It can be installed with either of the following commands (Linux; the first is for the RHEL family, the second for Debian/Ubuntu):
yum install httpd-devel
apt-get install apache2-utils
Run ab without any arguments to see its help text.
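To confirm the install worked, a quick check (a sketch; the package names above cover the RHEL and Debian families respectively):

```shell
# Print ab's version banner if it's on the PATH, otherwise a hint.
command -v ab >/dev/null 2>&1 && ab -V || echo "ab not found - install httpd-devel (RHEL) or apache2-utils (Debian)"
```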
You'll need to increase your open files limit, as we did on the feed server.
ulimit -n 999999
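Note that ulimit only affects the current shell session. To make the limit survive a logout, you'd typically edit /etc/security/limits.conf (a sketch, assuming a standard Linux PAM setup):

```shell
# Raise the soft limit for this session; print a hint if the hard
# limit (or lack of root) prevents it.
ulimit -n 999999 2>/dev/null || echo "could not raise limit - try as root"
ulimit -n   # verify what we actually got

# To persist across logins, append to /etc/security/limits.conf:
#   *   soft   nofile   999999
#   *   hard   nofile   999999
```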
A good test (in my experience) is:
ab -n 100000 -c 5000 -k -r http://server/feed/subscribe
Some important notes before you do this:
- If you're not posting to the publish URL on that box, all the ab threads will just hang and wait. Make sure you're publishing.
- ab is requesting a lot of documents, and it'll get them. This will use a lot of bandwidth, so keep an eye on your charges.
- The "-r" switch is not supported by the Red Hat version of ab. Just remove it if ab complains.
- Don't run this against other people's servers! It's mean and stupid, and they could conceivably take you to court over a denial-of-service attack.
- Don't run this against a live site! Not only is that stupid, you could also be prosecuted under terrorism legislation for a denial-of-service attack. Your IP will get blacklisted, and you will be shunned by the wider Internet community, forced to eke out some sort of existence in the lower IRC chatrooms for the rest of time. Don't do it, kids.
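To keep the subscribers fed while ab runs, something like this in another shell does the job; /feed/publish is a placeholder for whatever publish URL your push-module config defines:

```shell
PUBLISH_URL="http://server/feed/publish"   # placeholder - your push module's publish location
MESSAGES=5                                 # bump this up for a real run

# POST a small message once a second so waiting subscribers get data;
# ignore individual failures so one dropped POST doesn't stop the loop.
for i in $(seq 1 "$MESSAGES"); do
  curl -s -X POST -d "telemetry sample $i" "$PUBLISH_URL" > /dev/null || true
  sleep 1
done
```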
ab uses the following parameters here:
- -n – number of requests
- -c – number of concurrent requests
- -k – enable 'keep-alive' (which avoids the overhead of each request establishing its own HTTP connection)
- url – the page we're requesting. You can always run curl <url> to see what it returns.
- -r – ignore errors; can help when using very high numbers
I’ve found the keep-alive is necessary, but I’m not sure if it’s fair to have it in there.
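Before kicking off a run, it's worth a back-of-envelope check on the bandwidth warning above. Using the numbers from the test command (the ~1 KB per response is my assumption, rounding up the 776-byte document length in the sample report at the end plus headers):

```shell
REQUESTS=100000                 # matches -n above
BYTES_PER_RESPONSE=1024         # ~776 B body + headers, rounded up (assumption)

# Integer MB: 100,000 requests x 1 KiB each is roughly 97 MB of
# transfer, before counting any keep-alive savings.
TOTAL_MB=$(( REQUESTS * BYTES_PER_RESPONSE / 1024 / 1024 ))
echo "approx ${TOTAL_MB} MB transferred"
```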
ab will present a summary report.
Failures and Errors
The 'failures' due to length shown below are common, and not really failures: they just mean the response lengths vary, and since our POSTs are different lengths, that's fine. (Newer builds of ab, from Apache 2.4 onwards, have a -l switch to accept variable-length responses.)
Should you receive an apr_socket_connect(): Cannot assign requested address (99) error or similar, wait a bit and start again. There's probably a router in the way trying to prevent DoS attacks, and I half suspect it takes a lot longer than you'd think before the ports get properly closed.
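One way to watch those lingering ports drain (ss ships with modern distros; older boxes have netstat -an instead):

```shell
# Count sockets stuck in TIME_WAIT - these hold the local ports the
# error above complains about until the kernel releases them.
ss -tan state time-wait | wc -l
```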
What happens when it gets hit too hard?
Two important factors seem to come into play.
One, the POST action slows down, and can start to take more than a second.
Two, requests seem to be served at slower intervals, so users may miss packets (missed telemetry isn't so important, but missing comments are bad). If a user is not actively waiting when a new POST comes through, they will not get the update.
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking feed.mclaren.com (be patient).....done

Server Software:        nginx/0.8.33
Server Hostname:        feed.servername.com
Server Port:            80

Document Path:          /feed/subscribe
Document Length:        776 bytes

Concurrency Level:      5
Time taken for tests:   29.421 seconds
Complete requests:      100
Failed requests:        70
   (Connect: 0, Receive: 0, Length: 70, Exceptions: 0)
Write errors:           0
Keep-Alive requests:    95
Total transferred:      100395 bytes
HTML transferred:       73605 bytes
Requests per second:    3.40 [#/sec] (mean)
Time per request:       1471.046 [ms] (mean)
Time per request:       294.209 [ms] (mean, across all concurrent requests)
Transfer rate:          3.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    2   5.5      0      21
Processing:    11 1469 2126.8   1041   10642
Waiting:        0 1469 2127.2   1041   10642
Total:         11 1471 2130.3   1041   10659

Percentage of the requests served within a certain time (ms)
  50%   1041
  66%   1043
  75%   1046
  80%   1047
  90%   1049
  95%  10657
  98%  10659
  99%  10659
 100%  10659 (longest request)