Benchmarks, Damn Benchmarks and Statistics

Seen on the #mapserver IRC channel:

“hey guys I have mapserver running fine on a vehicle tracking application, what I want to know is the requirements for mapserver. Let say 100 connections on the same time. I have 2 GB RAM , Dual Core 3GHz procesor, what do you say, Will it be enough?”

Well, enough for what? At least one variable is missing, and that’s the expected response time for each request. If I am allowed to take a week per response, I can really lower the hardware requirements!

Benchmarking an application is a tricky business, and there are lots of ways to quantify robustness. My favorite is a holistic method that takes into account the fact that most of the time the load generators are human beings. This won’t work for pure “web services”, where requests can be generated automatically by any number of different clients.

Step one is to generate a baseline of what human load looks like. Working through your test plan is one way to achieve this, though you might want to game out in your head what a “typical” session looks like rather than a “complete” session that hits every piece of functionality once. Call this your “human workload”.

  1. Empty your web server logs.
  2. Sit down and run the “human workload” yourself, at a reasonable speed. You know the application, so you probably click faster than an average user; no matter, it doesn’t hurt to bias a little in the fast direction. When you are done with your session, note the elapsed time: this is your “human workload time”.
  3. Now, take your web server logs and replay them against the server using a tool like curl. This generates all the requests from your human session, but forces the server to execute them as fast as possible. When it finishes, note the elapsed time: this is your “cpu workload time”.
  4. Finally, divide your “cpu workload time” by your “human workload time”. The result is what fraction of a machine like the one you just tested is needed to support each human. If the answer is 0.2, then you can support 5 humans on your test machine.
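
The log parsing in step 3 and the arithmetic in step 4 can be sketched in a few lines of Python. This is a hedged sketch, not a complete load tool: the regex assumes your server writes the Apache common log format, and the session timings below are hypothetical numbers chosen to match the 0.2 example.

```python
import re

# Extract request paths from access-log text. Assumption: the server
# writes the Apache "common" log format; adjust the regex otherwise.
LOG_PATTERN = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')

def paths_from_log(log_text):
    """Return the request paths recorded in raw access-log text,
    ready to be replayed with a tool like curl."""
    return [m.group(1) for m in LOG_PATTERN.finditer(log_text)]

def machines_per_human(human_seconds, cpu_seconds):
    """Step 4: the fraction of one machine each concurrent human consumes."""
    return cpu_seconds / human_seconds

# Hypothetical example log line (a MapServer CGI request):
sample_log = (
    '127.0.0.1 - - [10/Feb/2008:10:00:00 -0800] '
    '"GET /cgi-bin/mapserv?mode=map HTTP/1.1" 200 15042\n'
)

# Hypothetical numbers: a 10-minute human session whose logged
# requests replay in 2 minutes of flat-out server time.
ratio = machines_per_human(600, 120)   # 0.2 of a machine per human
users_per_machine = 1 / ratio          # 5 humans per machine
```

Feed the output of `paths_from_log` to curl in a loop and time the whole run to get your “cpu workload time”.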

Obviously, this is a very simple test metric, but it has the advantage of extreme ease of application and a tight binding between what is being measured and what the real world will finally hit the application with.

Blood Sucking Phone Company

So, I’m sitting in SeaTac, minding my own business, drinking a cup of tea (SeaTac travel tip: free WiFi at Tully’s coffee with purchase) and my phone rings. It is my (god forsaken, soul sucking, devil worshipping) cell-phone company. They want to give me a free phone! It’s quad band! It’s got a great camera! I’m a preferred customer, so it’s only $10 (for a $300 phone!) and comes with a second line.

Well, having been screwed before, my ears perk up. What is this second line, of which you speak? “Oh, you get unlimited minutes for three months, sir!” And after three months? “We can get you a couples package.” Well, fabulous. Another up-sell. Thanks, but no thanks, and take your fabulous phone with you.

It’s a good thing they weren’t offering me an iPhone, or I would have cracked.

O'Reilly on Open Source ... in 1999

It is more than a little disturbing that both the myths and the explanations in this nearly-10-year-old article have remained so constant. Here are the myths:

  • It’s all about Linux versus Windows, with Red Hat as yet another challenger to Microsoft.
  • Open Source Software Isn’t Reliable or Supported.
  • Big companies don’t use open source software.
  • Open Source is hostile to intellectual property.
  • Open Source is all about licenses.
  • If I give away my software to the open source community, thousands of developers will suddenly start working for me for nothing.
  • Open source only matters to programmers, since most users never look under the hood anyway.
  • There’s No Money to be Made on Free Software.
  • The Open Source movement isn’t sustainable, since people will stop developing free software once they see others making lots of money from their efforts.
  • Open Source is playing catch up to Microsoft and the commercial world.

Fake Steve on the Borg and Freetards

Good analysis in a sarcastic candy coating:

Open source has ridden the classic Gartner hype cycle. Three years ago was the “peak of inflated expectations,” and VCs would fund anything with “open source” in its name or business plan. Now the cycle has moved on to the “trough of disillusionment.” Reality has set in. Nobody is making money. They’re in the Slough of Despond.

Bear in mind that Fake Steve is the Voice of the Valley, and measures using the standard Valley metrics: VC funding, license sales, lines of copy in Fortune. Open source has been through two hype cycles now (from cycle one, remember the VA Linux IPO?) and keeps on trucking. There are more things in heaven and earth, Fake Steve, than are dreamt of in the Valley.

Timmy's Telethon #6

And finalement:

  1. Third party business applications: It could be argued that an enterprise GIS exist to support business requirements which often call for a third party client solution. Are vendors building COTs apps against these third party solutions?

I think the wording is a little off here, and it should be “are vendors building third party COTS solutions against these open source apps?”

And the answer is “sure!” Not as many as against the ESRI stack, but that is to be expected: ISVs support things where there is demand for support, and the market leader obviously drives a lot more demand. However, the amount of ISV support for open source is greater than zero, and growing. On the PostGIS side, for example:

  • Safe Software’s FME supports it
  • Mapguide supports it
  • Ionic Red Spider supports it
  • ESRI’s Interoperability Extension supports it
  • Manifold supports it (!!!)
  • CadCorp SIS supports it
  • ArcSDE 9.3 supports it
  • MapDotNet Server supports it

Most of the web-services side of things is supported via open standards, so Mapserver slots into all kinds of infrastructures and third-party tools via the WMS standard, for example. I imagine that if the WMS standard did not exist, more software would talk directly to Mapserver via its own CGI dialect, but WMS made that redundant.
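
To make the point concrete, here is a sketch of the kind of WMS 1.1.1 GetMap request that lets any standards-aware client pull maps out of Mapserver without knowing anything about its native CGI dialect. The host, mapfile path, and layer name are hypothetical placeholders; the WMS parameter names are from the standard.

```python
from urllib.parse import urlencode

# Hypothetical MapServer endpoint; MapServer's CGI takes a "map="
# parameter pointing at the mapfile, alongside the standard WMS params.
base_url = "http://example.com/cgi-bin/mapserv"

params = {
    "map": "/maps/vehicles.map",   # hypothetical mapfile path
    "SERVICE": "WMS",
    "VERSION": "1.1.1",
    "REQUEST": "GetMap",
    "LAYERS": "vehicles",          # hypothetical layer name
    "SRS": "EPSG:4326",
    "BBOX": "-124,48,-122,50",     # minx,miny,maxx,maxy
    "WIDTH": "512",
    "HEIGHT": "512",
    "FORMAT": "image/png",
}

getmap_url = base_url + "?" + urlencode(params)
```

Any WMS client, open source or proprietary, can construct the same request, which is exactly why the third-party integrations listed above are possible.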

And on the client side, I found this nugget from Bill Dollins’ ESRI Fed UC review intriguing:

So the take away was “we’ve got the server to crunch your data and give you good analysis results, display it any way you want.” There was no OpenLayers demo but it was mentioned several times and something that should be able to leverage new APIs.

In the open source world, OpenLayers is already the “default map component” for web apps; it is interesting to see it banging on the doors of the proprietary world as well.