Saturday, July 26, 2008

Buffering Jujutsu

Update: Martin Davis tried out a variant of simplification, and produced results even faster than Michaël's.

Inquiring about some slow cases in the JTS buffer algorithm, Michaël Michaud asked:
I have no idea why a large buffer should take longer than a small one, and I don't know if there is a better way to compute a buffer.
Martin Davis and I dragged out the usual hacks to make large buffers faster, simplifying the input geometry, or buffering outwards in stages. Said Martin:
The reason for this is that as the buffer size gets bigger, the "raw offset curve" that is initially created prior to extracting the buffer outline gets more and more complex. Concave angles in the input geometry result in perpendicular "closing lines" being created, which usually intersect other offset lines. The longer the perpendicular line, the more other lines it intersects, and hence the more noding and topology analysis has to be performed.
A couple days later, Michaël comes back with this:
I made some further tests which show there is place for improvements with the excellent stuff you recently added.

I created a simple linestring: length = 25000m, number of segments = 1000

Trying to create a buffer at a distance of 10000 m ended with a heapspace exception (took more than 256 Mb!) after more than 4 minutes of intensive computation !

The other way is: decompose into segments (less than 1 second); create individual buffers (openjump says 8 seconds but I think it includes redrawing); cascaded union of resulting buffer (6 seconds).
Holy algorithmic cow, Batman! Creating and merging 1000 wee buffers is orders of magnitude faster than computing one large buffer, if you merge them in the right order (by using the cascaded union). Huge kudos to Michaël for attempting such a counter-intuitive processing path and finding the road to nirvana.
 

2 comments:

Matt said...

Love this problem-solving post! I understand the buffering function fairly well, but would love to have some examples of applications woven in. Are there practical examples that could be tested with both approaches. That would make a compelling read that could draw in a lot of practitioners, in addition to the geospatial developer crowd.

Dr JTS said...

Hold on there cowboy!

So far there hasn't been any speed comparison between the partitioned/union approach and the other two possibilities (iterated buffer and simplified input). I don't see any reason a priori why one of those approaches might not offer even better performance than the partition & union approach.

About Me

My Photo
Victoria, British Columbia, Canada

Followers

Blog Archive

Labels

bc (35) it (27) postgis (19) icm (11) enterprise IT (10) video (10) sprint (9) open source (8) osgeo (8) cio (6) management (6) enterprise (5) foippa (5) foss4g (5) gis (5) spatial it (5) foi (4) mapserver (4) outsourcing (4) politics (4) bcesis (3) oracle (3) COTS (2) architecture (2) boundless (2) esri (2) idm (2) natural resources (2) ogc (2) open data (2) opengeo (2) openstudent (2) postgresql (2) rant (2) technology (2) vendor (2) web (2) 1.4.0 (1) HR (1) access to information (1) accounting (1) agile (1) aspen (1) benchmark (1) buffer (1) build vs buy (1) business (1) business process (1) cathedral (1) cloud (1) code (1) common sense (1) consulting (1) contracting (1) core review (1) crm (1) custom (1) data warehouse (1) deloitte (1) design (1) digital (1) email (1) essentials (1) evil (1) exadata (1) fcuk (1) fgdb (1) fme (1) foocamp (1) foss4g2007 (1) ftp (1) gds (1) geocortex (1) geometry (1) geoserver (1) google (1) google earth (1) government (1) grass (1) hp (1) iaas (1) icio (1) industry (1) innovation (1) integrated case management (1) introversion (1) iso (1) isss (1) isvalid (1) javascript (1) jts (1) lawyers (1) mapping (1) mcfd (1) microsoft (1) mysql (1) new it (1) nosql (1) opengis (1) openlayers (1) oss (1) paas (1) pirates (1) policy (1) portal (1) proprietary software (1) qgis (1) rdbms (1) recursion (1) redistribution (1) regression (1) rfc (1) right to information (1) saas (1) salesforce (1) sardonic (1) seibel (1) sermon (1) siebel (1) snark (1) spatial (1) standards (1) svr (1) tempest (1) texas (1) tired (1) transit (1) twitter (1) udig (1) uk (1) uk gds (1) verbal culture (1) victoria (1) waterfall (1) wfs (1) where (1) with recursive (1) wkb (1)