Sponsor GEOS, Make PostGIS Faster

Martin Davis just posted about his improvements to the JTS buffering routines, speeding up buffering by a mere factor of 20 or so.

Martin has also added some improvements in the area of unions for large sets of geometries, a technique he calls “cascaded union”. It too is good for orders-of-magnitude performance improvements.

Do you have PostGIS queries of this form:

SELECT ST_Buffer(the_geom,1000) 
FROM [...] 
WHERE [...]


SELECT ST_Union(the_geom) 
FROM mytable 
WHERE [...] 
GROUP BY [...]

If you do, then getting Martin’s JTS algorithms ported to GEOS (the C++ geometry library used by PostGIS) will make your database run faster. Lots faster.

How can you help that happen? Become an OSGeo “Project Sponsor” for GEOS. Project sponsors commit a modest sum to the ongoing maintenance of the code, which is generally used to hire a maintainer who ensures that patches are properly integrated, that tests are added for reliability, and that upgrades like the ones Martin has created get folded into the code base in a timely manner.

If you’re interested in sponsoring GEOS development, please get in touch with me. If you are using PostGIS in your business, it is money well spent.

Voting Day in Canada

Happy voting day, Canadians. I’m just about to head out to my local polling place to nullify somebody’s vote by casting an equal, opposite vote (I like to identify them in line, for maximum enragement. “Oh, you’re voting Conservative? I’m voting NDP, so you might as well have not even bothered coming.”)

I wouldn’t be a Canadian if I didn’t take this opportunity to point out how this day demonstrates (yet again) how superior we are to our cousins to the south. We started our election later than them (September 9), but we’re still finishing first. I will not have to spend three hours in line to vote. I will vote using a pencil and paper, not a touch screen doohickey. We have four national parties to choose from (five, if you’re from Quebec and think Quebec is a nation) instead of two. We also have lots of other great options on the ballot, for those who like to get off the beaten path (see right). And we don’t just have hockey moms running for office, but honest-to-god hockey players.

Credit Crisis and Trade

I found this item today particularly chilling:

The credit crisis is spilling over into the grain industry as international buyers find themselves unable to come up with payment, forcing sellers to shoulder often substantial losses.

Before cargoes can be loaded at port, buyers typically must produce proof they are good for the money. But more deals are falling through as sellers decide they don’t trust the financial institution named in the buyer’s letter of credit, analysts said.

I think everyone should take a deep breath and go read Roosevelt’s first inaugural, both for a perspective on how much worse things can get, and the mindset needed to address these things.

So, first of all, let me assert my firm belief that the only thing we have to fear is fear itself–nameless, unreasoning, unjustified terror which paralyzes needed efforts to convert retreat into advance. In every dark hour of our national life a leadership of frankness and vigor has met with that understanding and support of the people themselves which is essential to victory. I am convinced that you will again give that support to leadership in these critical days.

In such a spirit on my part and on yours we face our common difficulties. They concern, thank God, only material things. Values have shrunken to fantastic levels; taxes have risen; our ability to pay has fallen; government of all kinds is faced by serious curtailment of income; the means of exchange are frozen in the currents of trade; the withered leaves of industrial enterprise lie on every side; farmers find no markets for their produce; the savings of many years in thousands of families are gone.

And it goes on… For something delivered 75 years ago, it feels surprisingly topical.

PostGIS Performance: Prepared Geometry

Spatial joins are a common use case in spatial databases, putting together two tables based on the spatial relationships of their geometry fields. For example:

SELECT census.income, houses.value 
FROM census JOIN houses 
ON (ST_Contains(census.geom, houses.geom))

The way this gets evaluated by the database looks something like this:

for each census polygon
  for each house point near the census geom
    run the st_contains test on that pair of geometries

Because the outer loop is driven by the census geometry, the “contains” algorithm gets called over and over with the same polygon. By recognizing this repetition, you can build a shortcut: create a smart, indexed polygon once, and reuse it each time that polygon shows up again in a function call.
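To make the shortcut concrete, here is a minimal Python sketch of the idea. It is a toy, not the JTS or GEOS code: the PreparedPolygon class and the spatial_join function are hypothetical names, polygons are plain lists of (x, y) vertices without holes, and the "preparation" is just a precomputed bounding box and edge list rather than a real spatial index. The point is only the shape of the loop: pay the setup cost once per outer polygon, then reuse it for every inner test.

```python
from typing import List, Tuple

Point = Tuple[float, float]


class PreparedPolygon:
    """Toy stand-in for a prepared geometry (hypothetical, not the GEOS API)."""

    def __init__(self, ring: List[Point]) -> None:
        # Setup work done once: bounding box and edge list.
        xs = [x for x, _ in ring]
        ys = [y for _, y in ring]
        self.bbox = (min(xs), min(ys), max(xs), max(ys))
        self.edges = list(zip(ring, ring[1:] + ring[:1]))

    def contains(self, p: Point) -> bool:
        x, y = p
        minx, miny, maxx, maxy = self.bbox
        if not (minx <= x <= maxx and miny <= y <= maxy):
            return False  # cheap rejection using the precomputed box
        inside = False
        for (x1, y1), (x2, y2) in self.edges:
            # Ray casting: toggle on each edge crossed by a ray going right from p.
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


def spatial_join(polygons: List[List[Point]], points: List[Point]) -> List[Tuple[int, int]]:
    """Return (polygon_index, point_index) pairs where the polygon contains the point."""
    matches = []
    for i, ring in enumerate(polygons):
        prepared = PreparedPolygon(ring)  # built once per outer-loop polygon
        for j, pt in enumerate(points):
            if prepared.contains(pt):  # reused for every inner-loop test
                matches.append((i, j))
    return matches


if __name__ == "__main__":
    census = [[(0, 0), (10, 0), (10, 10), (0, 10)],
              [(20, 0), (30, 0), (30, 10), (20, 10)]]
    houses = [(5, 5), (25, 5), (15, 5)]
    print(spatial_join(census, houses))  # [(0, 0), (1, 1)]
```

In the database there is no explicit loop to hang the prepared object on, so the shortcut has to notice that the same polygon has turned up in consecutive function calls and keep the prepared version around between them.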

The “smart, indexed polygon” is a “PreparedGeometry”, and the concept was implemented in JTS over a year ago. About six months ago, it was ported to GEOS (a C++ mirror of JTS), but the port still had some nagging memory leaks which made it unready for production use.

Last month, Zonar Systems, who funded the initial JTS algorithm work, asked me to bring the functionality the rest of the way out to PostGIS. We found a C++ expert who identified and removed the last GEOS leaks, and I cleaned the leaks out of the PostGIS side of the implementation.

The speed difference is impressive!

I have a test data set of two tables: one table of 80 large polygons, and another table of 8000 small polygons. Each large polygon contains about 100 small ones.

Without the prepared geometry, a spatial join using ST_Intersects takes about 40 seconds. With the prepared geometry, the join takes 8 seconds, five times faster. The larger the size difference between your tables, the larger the speed-up will be.

The functions affected by the PreparedGeometry upgrade are ST_Intersects(), ST_Contains(), ST_Covers() and ST_ContainsProperly().

To try out the new functionality, you’ll need to check out and compile the GEOS SVN trunk (http://svn.osgeo.org/geos/trunk) which will become GEOS 3.1.0 in a little while, and the PostGIS 1.3 SVN branch (http://svn.osgeo.org/postgis/branches/1.3), which will become PostGIS 1.3.4 shortly. First compile and install GEOS, then PostGIS, since PostGIS checks the GEOS version during the compile stage to determine whether to activate the functionality.

Major thank you to Zonar Systems for funding the initial work and then stepping up a second time to fund the clean-up and roll-out to production-ready status. Why did they do it? They run a major fleet tracking and data analysis system on PostGIS, and they need lots of speed to handle the huge data volumes generated by their real-time tracking devices.

Rotten Afternoon

Anybody want a set of wisdom teeth? I’ve got a couple that I won’t be using anymore…