Sprint Day #1

Ready, set, go!

The first day of the Toronto Code Sprint 2009 was today, as 20 hackers from the geospatial “C Tribe” gathered in a smallish room at the Radisson hotel on the Toronto waterfront.

The weather was warm and clear yesterday, but cold and wet today, not that we noticed until the evening.

I spent my day huddled with the Largest Group of PostGIS Developers Ever Assembled: myself, Regina Obe, Mark Cave-Ayland, Leo Hsu, Olivier Courtin and Pierre Racine. Day one was mostly talking about our goals for PostGIS 1.4 (get it out the door, close the bugs and release a candidate) and PostGIS 2.0 (room for more types, strict support of ISO SQL/MM outputs, geodetic support, 3D objects, geometry typemods, and wktraster). We got a lot decided in a short period, and are ready to buckle down for some coding tomorrow in aid of getting 1.4 out the door.

Because I was huddled with PostGIS people, I didn’t get to participate in the Mapserver planning session. But over dinner I was told that XML map-file directions were discussed and decided on (a standard schema will be developed and XSLT transform from map->xml), performance issues were targeted (proj!), and a plan for one-pass querying (to speed up WFS mode) was settled on.

This evening, we watched the Hershey Bears best the Toronto Marlies of the AHL, featuring some fine offense (on the part of the Bears) and the requisite number of fights. Then we battled the weather (that cold, cold rain) back to dinner at East Sid Joe’s downtown. Thanks to our sponsors, qPublic.net and Rich Greenwood, for supporting today’s activities.

Talkie Talkie

Everyone who does talks regularly has their own approach to building up the necessary content and flow. Dave Bouwman builds his talks up from post-it notes. A nice approach.

For brand new topics, I take a long walk, a couple hours or more around Beacon Hill Park, and roll ideas about – I carry a notepad so I can jot them down and then forget them again, keeping my brain loose. That first couple hours of walking hopefully yields the kernel of the talk and a handful of interesting ideas – perhaps a dozen lines of text. Then I sit on the couch with a text editor and write the presentation like an essay. Outline at a high level. Detail things, move those bits around, then actually write the talk, like an article, fully written out. When I get to around 5000 words I know I have about an hour of material. I put ideas for graphics and slides in-lined in the text with «<»> characters. Once I’m happy with the story, I sit down in a separate session and build the actual slide deck. Google Images is a pretty useful source of pictures for almost any topic.

It’s an absurdly time-consuming process, but for keynotes and other instances where I’m monopolizing the attention of hundreds of people at a time, it’s a fair bargain. They are giving me their time en masse, they deserve a polished product – spend 5 minutes saying “um” or “ahhh” in front of 400 people, you’ve just wasted 33 hours of aggregate time.
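For the curious, the back-of-envelope arithmetic works out as follows. This is only a sketch of the sums above; the function names are made up for illustration.

```python
def aggregate_hours(minutes_wasted: float, audience_size: int) -> float:
    """Total audience time consumed, in hours."""
    return minutes_wasted * audience_size / 60


def words_per_minute(total_words: int, talk_minutes: float) -> float:
    """Speaking pace implied by a fully written-out talk."""
    return total_words / talk_minutes


# 5 minutes of "um" in front of 400 people
print(round(aggregate_hours(5, 400), 1))    # 33.3 hours

# 5000 words filling about an hour of talking
print(round(words_per_minute(5000, 60)))    # 83 words per minute
```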

FastCGI Hint

I’m preparing to benchmark / profile Mapserver and PostGIS for the upcoming code sprint in Toronto, and setting up Mapserver as a FastCGI process is a requirement to get good profiling results. The JMeter benchmarks run multiple threads of load, so having multiple Mapservers running makes things faster.

However, trying to get “FastCgiConfig” to dynamically spawn the required instances was a real pain. Setting the “updateInterval” nice and low made extra Mapserver processes come online a little faster, but in a kind of chunky way. It seemed to kill the existing process before flipping on the new ones. The config line looked like this:

FastCgiConfig -appConnTimeout 60 -idle-timeout 60 -init-start-delay 0 -minProcesses 2 -maxClassProcesses 6 -startDelay 5 -restart-delay 1 -killInterval 30 -singleThreshold 5 -updateInterval 1

In the end, I opted to just statically start the number of processes that made sense for my dual core system (4, in my estimation) using the “FastCgiServer” directive. The config line is blissfully simple:

FastCgiServer /Users/pramsey/Sites/cgi-bin/mapserv.fcgi -processes 4

Throughput for simple tests (style-free roads from PostGIS, 4 threads of execution) is running as high as 48 maps per second.
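The numbers above came from JMeter, but for a quick sanity check the same kind of measurement can be roughed out in a few lines of Python. This is a minimal sketch, not the actual test plan: the URL, the map file path and the function names are all assumptions for illustration, and the thread count simply mirrors the 4 threads of execution mentioned above.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Hypothetical request; substitute a real map file and query string.
MAP_URL = "http://localhost/cgi-bin/mapserv.fcgi?map=/path/to/roads.map&mode=map"


def fetch_map(url: str) -> int:
    """Request one map and return the size of the response in bytes."""
    with urlopen(url) as response:
        return len(response.read())


def measure_throughput(fetch, url: str, threads: int = 4,
                       requests_per_thread: int = 25) -> float:
    """Run fetch(url) from a pool of threads and return requests per second."""
    total = threads * requests_per_thread
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Consume the iterator so every request finishes before timing stops.
        list(pool.map(fetch, [url] * total))
    elapsed = time.perf_counter() - start
    return total / elapsed


# Example (needs a running Mapserver behind MAP_URL):
# print(f"{measure_throughput(fetch_map, MAP_URL):.1f} maps per second")
```

Passing the fetch function in as an argument makes it easy to swap in a stub when checking the harness itself, without a server running.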

Small Mercies

One of the “nice” things about having already lost most of one’s investment portfolio is that further drops in the market aren’t nearly so emotionally distressing. 5% of the original investment, that was worth fretting about. 5% of what’s left? Meh.

I have become, in the words of the rock dude, comfortably numb.

Update: Good thing I’m in it for the long term. Oh, wait a minute.

Cadcorp?

What is this “Cadcorp” of which you speak?

We in the long-homogenized markets of North America find it easy to forget that there is any vendor other than the One True Vendor, but in regional markets all over the world there are strong alternatives available. Cadcorp is a strong regional UK/EU company, GeoConcept is a French company, and there are lots of others in markets I know nothing about.

(As recently as 10 years ago, BC still had its own regional GIS software, “PAMAP GIS”, used by the Ministry of Forests and others for spatial analysis. After a valiant (and fatal) attempt to crack into the US Forest Service, PAMAP was purchased by PCI and eventually mothballed.)

Unlike the “leading brand”, Cadcorp has made it a goal to interoperate with as many other systems, in as many ways, as possible. They were the second proprietary product to include native PostGIS support (the gods of interoperability, Safe Software, beat them to it by a nose), they have been active in the OGC since forever, and they added support for things like GeoJSON, GeoRSS, and the Tile Map Service Specification almost before they existed.

I’ve got Cadcorp on the brain because they recently won a pretty big contract to provide software to a bunch of Irish local governments, and the database that is going to underlie this installation is PostGIS.

All that interoperability adds value – one of the things that differentiates Cadcorp’s product is how easily you can use it to directly edit and manipulate PostGIS data. Fulton County, Georgia, was one of the earliest users of PostGIS, and not coincidentally is also a Cadcorp customer.

Cadcorp has also been hearing from customers that they really, really, really, want to put their rasters into the database (poor buggers), so Cadcorp has taken the next step down the slippery open source slope. They have funded (with cold hard cash to this PostGIS developer) some early work on defining raster storage in PostGIS, and have also allocated staff time to support the development. The WKTRaster project has now got types defined and is starting to implement data importers/exporters. It will be a hot topic of discussion among the PostGIS team at the Toronto Code Sprint next week. If you are interested in rasters in the database, the project is still only partially funded and looking for more backers.