No One Ever Got Fired for Buying Linux

For a while there, Microsoft made a lot of hay about the London Stock Exchange using Windows in their trading system.

As it turns out, too much hay. The LSE is now going to abandon their Windows trading system. As the author points out, IT failures aren’t all that rare, what is rare is learning about them. Usually the principals bury the body and move on to “Phase II”. In this case the principal was fired, and her replacement is hanging out the dirty laundry.

Another thing that is rare is for a dominant vendor to shoulder any blame for these kinds of failures. The usual principle is that, if everyone is doing it, it can’t possibly be stupid.

Did you buy an expensive web mapping server and then have to put it on a nightly re-boot cycle to avoid service degradation? Don’t worry, everyone is doing it, it doesn’t reflect badly on you.

Is all your e-mail locked in binary file archives, where a small corruption can render the entire archive irretrievable? Don’t worry, everyone is doing it, it doesn’t reflect badly on you.

It’s not an IT thing, really, it’s called “culture”: our shared beliefs and idiosyncrasies.

Did you start your day by repeatedly accelerating and decelerating a 4000lb metal box holding only yourself and a cup of coffee over a hot tar field, park your box in another hot tar field, then hike across the tar to a large glass box, enter, and place yourself inside a further fabric-covered box? Don’t worry, everyone is doing it, it doesn’t reflect badly on you.

Lies, Damn Lies...

“Green shoots…” Ah, for the good old days of only two weeks ago, when green shoots were in our future…

Job Losses

I never really understood why decreases in the rate of change of unemployment were considered such great news. “Good news, the second derivative has gone positive! We’re plunging into the abyss slightly less quickly!” Only in a world of rampant, congenital optimism – or statistics-induced myopia – could four months in which 18,300 Americans lost their jobs every day be described as a period of “improving conditions”.
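The arithmetic behind that snark is easy to check. A quick sketch; only the 18,300-jobs-per-day rate comes from the text above, and the monthly figures are hypothetical numbers chosen just to show a “positive second derivative” that is still very bad news:

```python
# Only the 18,300/day figure is from the post; everything else is illustrative.
daily_losses = 18_300
days = 122  # roughly four months

total = daily_losses * days
print(f"{total:,} jobs lost")  # about 2.2 million over the period

# A shrinking *rate* of loss (the "improving" second derivative)...
monthly_losses = [700_000, 650_000, 600_000, 550_000]  # hypothetical
changes = [b - a for a, b in zip(monthly_losses, monthly_losses[1:])]
print(changes)                # the rate of loss falls each month...
print(sum(monthly_losses))    # ...while total losses keep mounting
```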

Working in the Cathedral

In February, at the Toronto Code Sprint, the PostGIS team looked each other in the eye (for the first time) and committed to get the 1.4 release out by late April.

Well, it’s late June now. It seems very likely that I will get to cut 1.4.0RC1 tomorrow morning.

My personal preference has always been to release early and often. In the hacker ethic, this sounds like a good thing, it’s the “bazaar” model that Eric Raymond promoted over the “cathedral” model of development. In the bazaar, you dump out regular releases, and let the community dictate whether they are of quality (“don’t use 2.31.2a, it’s garbage!”). I still remember being told by a more knowledgeable Linux user that I could upgrade to 1.1.53 (?), but not any further than that, because the succeeding releases were unstable. In the cathedral, you release no wine before its time, aiming for a polished diamond of a release.

So, 1.4.0 has taken much longer than expected: the confluence of a development team that is now unwilling to accept the existence of any “crasher” bugs at all (no matter how unlikely they are to be exercised) and a growing comprehensiveness in the test suite, which now covers all the functions in almost every combination of inputs. Because of the enhanced testing, we discovered crashers we didn’t know we had – and then we had to fix them.
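The combinatorial idea behind that kind of coverage can be sketched in a few lines. The function and geometry names below are illustrative stand-ins, not the actual PostGIS harness, which drives real SQL:

```python
import itertools

# Hypothetical stand-ins: a handful of input shapes crossed with a handful
# of functions under test. The real suite does this at much larger scale.
geometry_inputs = ["POINT(0 0)",
                   "LINESTRING(0 0, 1 1)",
                   "POLYGON((0 0, 1 0, 1 1, 0 0))",
                   "GEOMETRYCOLLECTION EMPTY"]
functions = ["ST_Area", "ST_Length", "ST_IsValid"]

# Every function against every input, including the awkward empty cases
# that tend to hide crashers.
cases = [(f, g) for f, g in itertools.product(functions, geometry_inputs)]
print(len(cases))  # 3 functions x 4 inputs = 12 combinations
```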

Despite chafing to release! release! release! I have come to appreciate our new conservatism. Among my favorite pieces of feedback on PostGIS are the users who say “it just works, install it and forget about it, rock solid”. That feels good, and our new austerity is only going to help keep things that way.

The maturation of PostGIS into a product you can just “install and forget” has been multi-stage.

Prior to the 1.0 release, Sandro Santilli added the first regression tests. These tests have been growing ever since and have been invaluable in ensuring that old bugs don’t re-enter the code base, and that new features don’t break old features.
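Regression tests of this sort are golden-output tests: run a case, diff the result against a stored expected answer, and fail loudly on any drift. A minimal sketch of the pattern; the SQL string and canned result here are invented for illustration, since the real suite executes actual SQL scripts against a live database:

```python
# Golden-output regression testing in miniature. run_case() is a stand-in
# for executing SQL against a server; here it just returns canned output.
def run_case(sql: str) -> str:
    canned = {"SELECT ST_Area('POLYGON((0 0,1 0,1 1,0 0))');": "0.5"}
    return canned[sql]

# The "expected" file: query -> known-good answer, frozen when the bug
# was first fixed, so the old bug can never silently return.
expected = {"SELECT ST_Area('POLYGON((0 0,1 0,1 1,0 0))');": "0.5"}

failures = [q for q, want in expected.items() if run_case(q) != want]
print("PASS" if not failures else f"FAIL: {failures}")
```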

For the 1.4 release, the documentation was upgraded substantially, adding a great deal of extra structure to the reference section. Regina Obe discovered that a side effect of the extra structure was that she could automatically generate a test for nearly every documented function by running XSLT over the DocBook XML. This new “garden test” found a number of previously undetected bugs, which have since been fixed.
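The real garden test applies XSLT to the DocBook sources; the same doc-driven extraction can be sketched with Python’s ElementTree over a simplified, hypothetical slice of the markup:

```python
import xml.etree.ElementTree as ET

# A toy slice of DocBook-style reference markup. The real docs are far
# richer (argument types, return types), which is what makes generated
# calls meaningful.
docbook = """
<reference>
  <refentry><refname>ST_Area</refname></refentry>
  <refentry><refname>ST_Length</refname></refentry>
</reference>
"""

root = ET.fromstring(docbook)

# One generated call per documented function; the argument is a fixed
# illustrative geometry, standing in for doc-derived sample inputs.
calls = [f"SELECT {name.text}(ST_GeomFromText('POINT(0 0)'));"
         for name in root.iter("refname")]
for call in calls:
    print(call)
```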

For the 1.4 release, I added the start of a CUnit test suite that exercises the PostGIS functions without requiring a database back-end. Even in its early state, it has already saved me from a couple of booboos. For future releases, this extra regression suite is going to help keep things stable.
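The actual suite is written in C against CUnit; the pattern it follows (test the pure geometry computation directly, with no server round-trip) looks like this in Python. The `bbox` helper is an invented stand-in for a library-level function:

```python
# Database-free unit testing: call the computation directly and assert on
# the result. The bbox() helper here is illustrative, not PostGIS code.
def bbox(points):
    """Bounding box of a point list: (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

# No connection, no parser, no planner: the test runs in microseconds,
# which is what makes it cheap enough to run on every build.
assert bbox([(0, 0), (2, 3), (1, -1)]) == (0, -1, 2, 3)
print("bbox test passed")
```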

For the 1.4 release, Mark Cave-Ayland re-worked the logging and debugging infrastructure, to make the coding cleaner and easier to maintain during debugging cycles. He also split out the underlying geometry implementations, which are now used in the loader/dumper utilities, for a more consistent approach to geometry handling.

These are all under-the-covers improvements that end-users never see. But they all contribute to that “it just works, it just runs” end-user experience that I have come to treasure even more than the sensation of slamming out a point release at 2am. I hope everyone tries out RC1 so that we can slay any remaining bugs before the 1.4.0 release!

MySQL vs PostGIS

Did I say I would publish my performance results? I did. Here they are.

MySQL Snark #2

I am doing a little benchmarking as a learning experience with JMeter and I will publish the throughput numbers in a few days, after I run the full suite I have developed on the various combinations of concurrency and insert/select ratios.

Because MySQL has so few functions that actually do anything (see the note here), there’s not a great deal to test beyond raw performance. The early throughput results seem to indicate it’s comparable for simple CRUD on one table, but for anything non-trivial it falls down.
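One concrete consequence: MySQL’s spatial predicates at the time compared bounding boxes (MBRs) only, so they can report intersections between geometries that never actually touch. A pure-Python illustration with two parallel segments whose boxes overlap:

```python
# Two diagonal segments whose bounding boxes overlap but whose geometries
# never touch: an MBR-only predicate calls this an intersection.
def bbox(seg):
    (x1, y1), (x2, y2) = seg
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))

def mbr_intersects(a, b):
    """True if the bounding boxes of segments a and b overlap."""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

seg_a = ((0, 0), (10, 10))   # the main diagonal, y = x
seg_b = ((0, 9), (1, 10))    # parallel to it, y = x + 9, top-left corner
print(mbr_intersects(seg_a, seg_b))  # True: boxes overlap, segments don't
```

Box tests are a fine first-pass filter (that is what spatial indexes give you), but without an exact follow-up test they over-report; an exact predicate like PostGIS’s ST_Intersects refines the candidate set.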

Here’s a basic spatial join: pull 23 roads from a 3.4M-row line table and spatially join them to a 66K-row tract polygon table, calculating the sum of the areas of the tract polygons found. There are spatial indexes on both tables.

mysql> select sum(area(t.geom)) 
from tiger_roads_texas r, tiger_tracts t 
where mbrintersects(r.geom, GeomFromText('LINESTRING(453084 -1650742,452384 -1650442)')) 
and mbrintersects(r.geom, t.geom);

+-------------------+
| sum(area(t.geom)) |
+-------------------+
|  1260394420.00453 |
+-------------------+
1 row in set (9.43 sec)

And in PostGIS:

tiger=# select sum(area(t.geom)) 
from tiger_roads_texas r, tiger_tracts t 
where r.geom && GeomFromText('LINESTRING(453084 -1650742,452384 -1650442)',2163) 
and r.geom && t.geom;

(1 row)

Time: 5.574 ms

Those are both “hot cache” results, after running them a couple times each.