Elephant vs Dolphin

Elephant crushes dolphin, but dolphin drowns elephant?

More folks doing real work needing a real spatial database, this time RedFin a real estate information company.

Specifically, we were having some major performance problems with queries that were constrained by both spatial and numeric columns, and all of our attempts to squeeze more performance out of MySQL (including hiring expensive outside consultants) had come to naught.

Guys, if you’re going to hire expensive outside consultants to help with your spatial database problems, they should be (a) me and (b) conversant in PostGIS! :)

Update: It is worth explaining that their main problem, the slow results when the spatial clause was weak, is a direct result of MySQL having an inferior query planner. PostGIS is fast because PostGIS provides good statistics on spatial index selectivity to the planner, and the PgSQL planner is flexible enough to accept selectivity information from extended types. I know this is a big performance problem because before PostGIS had a decent selectivity system, performance bottlenecks like RedFin’s happened all the time.

Feel the Burn

My purchase of a house in central Victoria eight years ago just keeps getting better and better (I can walk everywhere I want to go, daily amenities within 500m, all of downtown within 2km). Paul Krugman posts on a map from the Sydney Morning Herald, showing the expected proportion of family income that will go to petrol as prices rise.

I will refrain from gloating to my colleagues in exurban Atlanta and Phoenix. Except this probably counts. Oh well, in for a penny, in for a pound: suckers!

I wonder if David Brooks will recant his (surprisingly recent) love letter to the exurb. James Howard Kunstler hasn’t been that good at predicting the future, but his take on Atlanta from five years ago is looking more and more spot on.

ArcServer vs Geoserver

Props to Chris Holmes, posting about the work going into making the next version of Geoserver fully crawleable (not a real word! (yet!)) by web spiders, right in synch with Jack Dangermond’s announcement of the same feature coming in ArcServer.

The government here in British Columbia is moving haltingly towards a more service-oriented approach to spatial data publishing, and this has included a strong awareness that being consumable by Google tools, and spidered by search engines is key to improving general accessibility. If the citizen’s first stop in searching for information is Google or MSN, it makes sense to spend some effort ensuring your relevant information is properly exposed via those services.

Free business opportunity: establish button-down search engine optimization firm targeted exclusively at government clients.

Now, why is Google sharing the stage with ESRI? ESRI needs Google’s stamp of approval a lot more than Google needs ESRI’s – what’s in it for Google? Is the Google worried that Virtual Earth is doing better cultivating government application development opportunities than the Google suite? Google plays well with the hacker set, but is a harder sell with the suits than tried-and-true Microsoft, particularly in a big, conservative organizations.

Riding the Shark

My annoyance at learning that OS/X 10.5 doesn’t really support gprof has turned into ecstasy at finding the wholly superior Shark profiler that comes with XCode.

Unlike gprof, Shark doesn’t require that you compile with special profiling flags, it can run on unaltered binaries. In fact, Shark doesn’t even require that you run it against any binary in particular! You can run it against everything on your system, then view the profile of any process post-facto.

I just did a profile of Mapserver running as a FastCGI process, just by running some load against Mapserver and letting Shark collect statistics on all processes at once. Then I pick the mapserv.fcgi process from the sample data, and voila!

screenshot_03

I can see that the most costly small function is longest_match, from the bottom-up view at the top, and that it is called in the image saving routines, in the top-down view at the bottom. Good news, Mapserver is so efficient that the biggest cost is compressing the output image.

Even cooler, I can flip to chart view and see what the CPUs were doing throughout the sample period. The blue spikes are mapserv.fcgi calls.

screenshot_04

Zoom into one of those, and we can see the CPU ticks through one map draw, including the kernel (the red bit) taking a slice out. End to end, Mapserver is taking about 15ms to draw this particular map.

foo_01

In addition to the “Time Profile” mode I’m showing here, there’s also a “Java Time Profile”. I wonder if Java developers can make use of this excellent tool too?

FastCGI on OS/X Leopard

A post for posterity.

OK, so you want to run FastCGI on OS/X 10.5, how does that work? If you’ve just followed the directions and used your usual UNIX skillz, you’ll have dead-ended on this odd error:

httpd: Syntax error on line 115 of /private/etc/apache2/httpd.conf: Cannot load /usr/libexec/apache2/mod_fastcgi.so into server: dlopen(/usr/libexec/apache2/mod_fastcgi.so, 10): no suitable image found. Did find:\n\t/usr/libexec/apache2/mod_fastcgi.so: mach-o, but wrong architecture

This is the architecture of your DSO not matching the x86_64 architecture of the build shipped with OS/X. So we must build with the correct flags in place.

Here’s the steps from scratch.

curl http://www.fastcgi.com/dist/mod_fastcgi-2.4.6.tar.gz > mod_fastcgi-2.4.6.tar.gz
tar xvfz mod_fastcgi-2.4.6.tar.gz
cd mod_fastcgi-2.4.6
cp Makefile.AP2 Makefile

Now edit Makefile and change top_dir to /usr/share/httpd and add a line CFLAGS=-arch x86_64.

make
sudo cp .libs/mod_fastcgi.so /usr/libexec/apache2

Now edit /private/etc/apache2/httpd.conf and add a line LoadModule fastcgi_module libexec/apache2/mod_fastcgi.so. Run sudo apachectl restart and you are now loaded. You’ll need to enable FastCGI for your applications as described in the documentation.