TvE 2100

At 2100 feet above Santa Barbara

Optimizing a Rails Application, Part 1

I’m working on a really cool Rails application called AWS-Console which manages servers on Amazon’s Elastic Compute Cloud. With the click of a button, we can instantiate new servers, which fire up within a couple of minutes. We can even take an svn repository of a Rails app and launch it fully automatically; it takes about 10 minutes to come up. Our test uses Mephisto: want 10 instances of Mephisto running? Click here…

So I was measuring the performance of the AWS-Console site and it turned out to be abysmal: around 1 request per second max, running Apache + mongrel_cluster on a 2GHz/2GB box. That was rather disappointing. So I rolled up my sleeves to figure out what was going on, and this is the tale of my first Rails performance adventure!

Sandhill Cranes
Sandhill Cranes, Bosque del Apache, NM, Feb 2004, ©2004 Thorsten von Eicken

The goal

Up to now we’ve been exclusively focused on functionality and have not cared a bit about performance. But at some point one does have to make sure that performance is in the ballpark. I’ve read Stefan Kaes’ excellent blog with his many suggestions on how to improve performance, so I have about 20 things in mind that I could “fix” and hope that they improve performance. But I don’t like working blindly, so I decided to follow the conventional procedure:

1. benchmark the application to get baseline measurement(s)
2. profile the application to figure out where the time is going
3. optimize the bottlenecks using Stefan’s tips or things I figure out on my own
4. re-benchmark the application to see whether I actually improved things
5. go back to step 1 until satisfied
6. learn from what I did so I don’t have to do it all over for the next app/feature

I’m planning to write up the whole story here, this is just the first part. I hope this will be useful to others and I hope it will jog my memory the next time I’m going through this :-).

Part 1: benchmarking

For the benchmarking I am using httperf, an excellent program for applying realistic load to a web site. The reason I like httperf is that it decouples request generation from server responses: it keeps opening new concurrent connections to the server no matter how fast the server responds. This is how it happens in real life: users come to the site no matter how slow the server is, and only once they browse around do they get slowed down by the server’s response time. The bottom line is that this allows httperf to overload the server and really drive it against the wall. Httperf can also take little scripts of URLs which describe a flow through the site; it will open a connection and then “walk through” the URLs one at a time, and if the site has images, it can request the images for a page in a burst. So all in all it’s a good and realistic benchmarking tool.

The downside of httperf is that the URL scripts are very primitive, and in particular if the site requires authentication, then it’s tedious to have httperf log in as a different user every time it opens a connection.

Creating a workload

To get started I downloaded httperf from HP’s site and installed it. I don’t remember the details, but it seemed pretty straightforward.

The key to using httperf as described above, where it walks through a sequence of URLs, is the --wsesslog workload generator. If you want to play with it, you will need to check out the man page for all the details, but here is what I did. The first thing was to create a file with the workload. My first test of the AWS-Console looked as follows:

/sessions method=POST contents=''

This file fetches “/”, which redirects to “/sessions/new”, and that is followed by fetching all the javascripts and stylesheets referenced by the response. Then comes the favicon, followed by a POST of the login information. Then it fetches “/” again, because the login redirects there, and finally a bunch of fetches of the instances and images pages.
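A fuller --wsesslog file covering the flow just described might look roughly like this. This is only a sketch, assuming the standard session-file syntax where indented lines are fetched as a burst together with the request above them; the asset paths, login fields, and credentials are invented placeholders, not the actual app’s:

```
/
/sessions/new
    /stylesheets/scaffold.css
    /javascripts/prototype.js
/favicon.ico
/sessions method=POST contents='login=test&password=testpw'
/
/instances
/images
```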

The way I came up with this list was to start a fresh browser, navigate to the site, then look at the server log in /var/log/httpd/access_log (or similar) and grab all the URIs listed there. I had to make up the contents of the POST myself, but it’s simply a URL-encoded string of the various form fields.
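Pulling the URIs out of the access log is easy to script. Here is a small sketch, assuming the standard Apache common/combined log format where the request line is the first double-quoted field; the sample log entries are made up for illustration:

```shell
# Made-up access-log lines in Apache combined format:
cat > /tmp/sample_access_log <<'EOF'
127.0.0.1 - - [01/Jan/2007:10:00:00 -0800] "GET / HTTP/1.1" 302 85 "-" "Mozilla"
127.0.0.1 - - [01/Jan/2007:10:00:01 -0800] "GET /sessions/new HTTP/1.1" 200 1543 "-" "Mozilla"
127.0.0.1 - - [01/Jan/2007:10:00:05 -0800] "POST /sessions HTTP/1.1" 302 85 "-" "Mozilla"
EOF

# Split on double quotes; field 2 is "METHOD URI PROTOCOL",
# so the second word of that field is the URI we want.
uris=$(awk -F'"' '{ split($2, r, " "); print r[2] }' /tmp/sample_access_log)
echo "$uris"
```

Against the real log, point the awk command at /var/log/httpd/access_log and paste the resulting URIs into the workload file.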

Note that httperf does handle cookies in a simple but effective manner. On the first request, the site will return a session cookie to httperf, which the latter dutifully presents with all subsequent requests. So a site where the user has to log in before proceeding actually does work correctly. Nice!

Then I tested this using httperf with the following command line:

httperf --hog --server <hostname> --wsesslog=1,0,aws-test1 \
--session-cookie --ssl --print-reply

The --hog option has to do with sockets and is important but not interesting; --session-cookie enables cookie handling as described above; --server is the address of the server; --print-reply prints out all the responses from the server so I can check that the proper pages get returned and not some errors; --ssl uses https (the AWS-Console is an SSL site); and --wsesslog references the workload file. The --wsesslog options tell httperf to run through the workload file once, to delay 0 seconds between URLs (to simulate user think time, which I’m not interested in), and that aws-test1 is the filename of my workload. Printing out the replies means lots of stuff to scroll through, but it allowed me to check that the login and the other page fetches worked properly.

Applying some load

Then a quick test running 30 sessions and starting a new session every two seconds:

httperf --hog --server <hostname> --wsesslog=30,0,aws-test1 --session-cookie --ssl --rate 0.5

To produce a graph, I would vary --rate from about 0.1 to 5 (ramping from starting a new session every 10 seconds up to starting 5 sessions per second). Good numbers for your app will depend on the performance you see.
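Sweeping the rates is easy to automate. A minimal sketch, printing the command for each rate rather than running it (the hostname is a made-up placeholder; drop the echo and redirect each run’s output to a file to collect the stats for the graph):

```shell
# Placeholder hostname; substitute your own server.
SERVER=aws-console.example.com

# Build one httperf invocation per --rate value.
cmds=$(for rate in 0.1 0.2 0.5 1 2 5; do
  echo "httperf --hog --server $SERVER --wsesslog=30,0,aws-test1 --session-cookie --ssl --rate $rate"
done)
echo "$cmds"
```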

Well, the result I got was a whopping 1.2 replies per second on average! Given that almost half the requests are for static files (the stylesheets and javascripts), which Apache serves blindingly fast, I can only say “abysmal!”

With this poor performance there really isn’t much point in generating a graph that shows reply rate as load is increased or load vs. response time.

Something I did was to add more sessions to my description file so that I can exercise different use cases (flows through the site). I thought of automatically generating variants of the workload file where I change the login id and perhaps some other POST parameters, but I haven’t implemented that yet.
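If I do implement it, one simple way would be to stamp out per-user copies of the workload file with sed. A sketch of the idea, where the template line, the USER/PASS placeholder convention, and the user names are all invented for illustration:

```shell
# Template workload file with literal USER/PASS placeholders (made up):
cat > /tmp/aws-test1.tmpl <<'EOF'
/sessions method=POST contents='login=USER&password=PASS'
EOF

# Generate one workload file per user, substituting the placeholders.
for u in alice bob carol; do
  sed -e "s/USER/$u/" -e "s/PASS/${u}pw/" /tmp/aws-test1.tmpl > "/tmp/aws-test1.$u"
done
```

Each generated file can then be fed to a separate httperf run via --wsesslog.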

Note that httperf has a “-v” option which is helpful for seeing progress. One can specify a very large number of sessions in --wsesslog, watch httperf print its requests/sec measurement every 5 seconds, and after one to two dozen measurements hit ctrl-C to get the overall stats.

Figuring out where the time is going

So, why doesn’t it do more requests per second than it does? That’s an excellent question for Part 2, to appear soon!