Mapserver/PostGIS Performance Tips

I’m working on re-writing the PostGIS driver in Mapserver to clean it up a little and hopefully make it faster. Having traced the flow of control, I can see a couple of ways users of the existing driver can improve performance with small configuration changes. The simplest syntax for defining a PostGIS layer in Mapserver is just:

DATA "the_geom from the_table"

Very simple, but: how does Mapserver know what primary key to use in queries? And what SRID to use when creating the bounding box selection for drawing maps? The answer is, it asks the database for that information. With two extra queries. Every time it processes the layer.

However, if you are explicit about your unique key and SRID in configuration, Mapserver can, and does, skip querying the back-end for that information.

DATA "the_geom from the_table using unique gid using srid=4326"

Also, if you have more than one PostGIS layer in your map file, you should turn on the Mapserver connection pool, even if you’re not running in FastCGI mode. That’s because the pool will allow all the layers to reuse the same connection. If you have seven PostGIS layers, at 15ms per connection, that’s 90ms saved (you still pay 15ms for the first connection).

Add this line at the end of each PostGIS layer to tell Mapserver to leave the connection open for future layers:

PROCESSING "CLOSE_CONNECTION=DEFER"
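
Putting both tips together, a complete layer definition might look something like the sketch below. Only the DATA and PROCESSING lines come from the tips above; the layer name, geometry type, connection string, and styling are hypothetical placeholders to swap for your own.

```
LAYER
  # Hypothetical name and geometry type; use whatever matches your table
  NAME "my_postgis_layer"
  TYPE LINE
  STATUS ON
  CONNECTIONTYPE POSTGIS
  # Placeholder connection string; substitute your own host, database, and user
  CONNECTION "host=localhost dbname=mydb user=myuser"
  # Explicit unique key and SRID, so Mapserver can skip the two lookup queries
  DATA "the_geom from the_table using unique gid using srid=4326"
  # Leave the connection open so later PostGIS layers can reuse it
  PROCESSING "CLOSE_CONNECTION=DEFER"
  CLASS
    STYLE
      COLOR 0 0 0
    END
  END
END
```

For the connection to be shared, the other PostGIS layers in the map file would presumably need an identical CONNECTION string as well as the same PROCESSING line.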

Go fast, fast, fast!

New Regime

Today my wife went back to work, which means the “new regime” is in effect!

I admit, I’ve been taking it pretty easy the last 18 months, I haven’t cooked a lot of breakfasts or meals. Really, I’ve been a bum.

But mornings now, my wife has to get ready for work, which means I have to feed and water the kids (toss some slops in the bin, hose them down afterwards, you know the drill) and get them ready to be carted off to daycare. Wake up and get that game face on! It’s really very nice, and reminds me of my year at home taking care of my daughter when she was one – dealing with the little details of kid life is very gratifying (in moderate doses).

And the pay-off? Once they head out the house descends into silence, blessed silence. Super-productive morning today.

Picking up the gauntlet

Mike Pumphrey over at the Geoserver blog has written a short post about this year’s Geoserver-vs-Mapserver comparison. I hope we can maintain this study as an annual event, and even get someone with an ArcServer license to join in the fun. Each iteration finds new areas that need work and raises the bar a little higher every year.

Basically, there are some differences that are small, and ignorable, and there are some differences that are really anomalous. At the end of the day, both systems are doing the same thing, so order-of-magnitude performance differences are cries for help.

I’ve been focussing on the Mapserver side. Last year, the study by Brock and Justin found an odd quirk where Mapserver got progressively worse at shape file rendering as the shape files got bigger. I found the issue and fixed it this spring, and (w00t!) Mapserver won the shape file race this year.

But… this year found that the PostGIS performance in Mapserver was (while fast) about half as fast as Geoserver. Hmmmm. So I know what I’ll be working on this month. I have some guesses, but they will need to be tested.

Andrea added some aesthetic tests this year, and brought them to the attention of the Mapserver team, and as a result the next release of Mapserver will include more attractive labeling results and line width control.

Any development team that’s willing to swallow their pride (because for every test you win, there’s one you’ll lose) can get a lot of benefit in joining in this benchmarking exercise.

Keep your friends close...

And your enemies closer. It seems ESRI has yet to learn that particular piece of the wisdom of Sun-tzu, and that’s too bad. By excluding “competitors” that are very small compared to the overall marketplace, ESRI is being penny-wise and pound-foolish. Sure, open source will steal a few accounts here and there, but the real prize is to co-opt them into your ecosystem, where you can keep an eye on them, a lesson Microsoft has clearly learned.

uDig 1.1.0

I’m a cowboy. I like to just slap a brand on the cattle and push them out the gate. Sometimes this gets me in trouble.

Jesse Eichar, the uDig project lead, is not a cowboy. The 1.1.0 release comes after a series of 14 (fourteen) “RC” versions and three “SC” versions. Congratulations to Jesse, and to Jody and Andrea and other uDig team members, on “going gold” with the 1.1 release. Remember, if things aren’t perfect, there’s always 1.1.1!

One thing watching the uDig development process has taught me over the years is how much harder user-facing applications are than server-side ones. The number of places you can “get it wrong” is orders of magnitude greater. The number of ways you can fine tune and fine tune and fine tune a particular piece of interaction is almost infinite (the editing tools are something like major revision four since the project started, and I’m sure there will still be things to be changed and fiddled with, given the hyper-modality and hyper-interactivity of editing). It has given me a lot more respect for the people writing web browsers and word processors and all the other virtual tools that we use every day. And now I automatically quadruple estimates that involve user interfaces, instead of merely doubling them as I used to.

Update: A timely review of uDig, posted at the Linux Journal.