Extra, extra, PostGIS Really Fast!

Because, frankly, I love nothing more than approbation, I am going to quote this comment on “Much Faster Unions in PostGIS” in full:

This is a truly spectacular piece of work. We have often been asked by clients to buffer and merge point datasets with several million points. We attempted this using ArcWhatever (could barely open the points, let alone buffer them) and FME, which ran for a week and then gave an out-of-memory error. So, I do the whole configure, make, make install thing, 4 times, for postgres, geos, proj4 and postgis. After a lot of swearing and running ldconfig a few million times I eventually get postgis to accept that geos really is installed – MySQL might have more limited spatial functionality, but it sure is a lot easier to build from source. Anyway, I digress. I run a few random queries using the excellent generate_series capability in postgres, and manage to create, buffer and merge 100,000 points in a few seconds. Finally, I try this on a real world dataset, namely all of the postal addresses in Wales, 1.4 million or so. With a 200m buffer, this ran on a reasonably pokey 64-bit linux box in 19 minutes. Truly astonishing. Well done. Much as I love MySQL, this was a bit of a St. Paul on the road to Damascus moment.

Full credit to Martin Davis, who implemented this technique in JTS. We just borrowed it for database land.