Deep Thought

If “open” is good at making others lose, and less than free is the model for things outside your core revenue stream, why isn’t Microsoft giving away free advertising?

Abstracterrific!

Dan McKinley takes a look under the covers of a couple of Python PostgreSQL abstraction layers:

Database client drivers intended for the same database can do drastically different things. By Python standards, the Postgres driver situation is completely schizo. There are a lot of them available - there are five dedicated Postgres drivers listed on the wiki, as opposed to just one for MySQL. People might choose different drivers for licensing reasons, for religious reasons, randomly (because they never did any analysis like I am about to do), or for completely inscrutable reasons because they are just plain out of their minds. You really would not believe how much blood I have seen spilled over Postgres client drivers.

Read it!

Goo Smart by Half

I typed “postgis” into Google this evening, and in addition to the exquisitely organized first entry there were two ads, one relevant (EnterpriseDB, the Postgres company) and one seemingly utterly random, for an English language school.

Has the Google algorithm made a mistake? How is this relevant? Well, the language school is in Victoria, BC, birth-place of PostGIS, but it gets better than that.

Here’s a picture of their sign, which I walked past every morning for almost five years, since the school shares a building (1207 Douglas Street) with Refractions Research, birth-company of PostGIS.

But wait, there’s more! The name of the school is “GEOS Language Academy”, and the GEOS spatial library was also created in Victoria, specifically to bring spatial predicate algorithms to PostGIS!

Poor Google algorithm, you never stood a chance.

PgPatch!

It’s not much, but it’s not nothing: my first-ever patch to PostgreSQL proper is part of yesterday’s 8.4.2 release:

Fix incorrect logic for GiST index page splits, when the split depends on a non-first column of the index (Paul Ramsey)

It was just a minor syntax error I found because I was reading through that section of the code very, very, very closely (it’s true, I copy the work of smarter people than I) while implementing the indexes for GEOGRAPHY in PostGIS 1.5.

Extra, extra, PostGIS Really Fast!

Because, frankly, I love nothing more than approbation, I am going to quote this comment on “Much Faster Unions in PostGIS” in full:

This is a truly spectacular piece of work. We have often been asked by clients to buffer and merge point datasets with several million points. We attempted this using ArcWhatever (could barely open the points, let alone buffer them) and FME, which ran for a week and then gave an out of memory error. So, I do the whole configure, make, make install thing, 4 times, for postgres, geos, proj4 and postgis. After a lot of swearing and running ldconfig a few million times I eventually get postgis to accept that geos really is installed – MySQL might have more limited spatial functionality, but it sure is a lot easier to build from source. Anyway, I digress. I run a few random queries using the excellent generate series capability in postgres, and manage to create, buffer and merge 100,000 points in a few seconds. Finally, I try this on a real world dataset, namely all of the postal addresses in Wales, 1.4 million or so. With a 200m buffer, this ran on a reasonably pokey 64-bit linux box in 19 minutes. Truly astonishing. Well done. Much as I love MySQL, this was a bit of St. Paul on the road to Damascus moment.

Full credit to Martin Davis, who implemented this technique in JTS. We just borrowed it for database land.
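
For anyone who wants to try something like the commenter’s generate_series experiment, here is a rough sketch of what such a query might look like. To be clear, this is my guess at the shape of it, not the commenter’s actual SQL: the helper name build_union_query and the extent and radius values are made up for illustration, and the Python below only assembles the SQL text, so you would still need to run the result against a PostGIS-enabled database yourself.

```python
def build_union_query(n_points: int, extent: float, radius: float) -> str:
    """Return SQL that creates, buffers and merges n_points random points.

    Hypothetical helper: it only builds the query text. The SQL relies on
    PostgreSQL's generate_series() and random(), plus the PostGIS functions
    ST_MakePoint, ST_Buffer and the ST_Union aggregate.
    """
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    return (
        "SELECT ST_Union(ST_Buffer(ST_MakePoint("
        f"random() * {extent}, random() * {extent}), {radius})) AS merged "
        f"FROM generate_series(1, {n_points});"
    )


if __name__ == "__main__":
    # Assumed values: 100,000 points scattered over a 10 km square,
    # each buffered by 20 units.
    print(build_union_query(100_000, 10_000.0, 20.0))
```

Paste the printed statement into psql to see how long the create, buffer and merge takes on your own box.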