Tuesday, February 04, 2014

Introspection Double-Shot

Davy Stevenson has a great post (everyone should write more, more often) on a small Twitter storm she precipitated and that I participated in. Like all sound-and-fury-signifying-nothing it was mostly about misunderstandings, so I'd like to add my own information about mental state to help clarify where I come from as a maintainer.

First, PostGIS is full of shortcomings. Have a look at the (never shrinking) ticket backlog. Sure, a lot of those are feature ideas and stuff in "future", but there's also lots of bugs. Fortunately, most of those bugs affect almost nobody, and are easily avoided (so people report them, then avoid them).

When we first come out with a "major" release (2.0 to 2.1 for example) I expect lots of bugs to shake out in the early days, as people try the new release in ways that are not anticipated by our regression tests. (It's worth noting that the ever-growing collection of regression tests provides a huge safety net under our work, allowing us to add features and speed without breaking things... for cases we test.)

My expectation is that the relative severity of bugs reported decreases as the time from initial release increases. Basically, people will always be finding bugs, but they will be for narrower and narrower use cases, data situations that come up extremely infrequently.

The bug Davy's team ran across broke my usual rules. It took quite a while for users to find and report, and yet was broad enough to affect moderately common use cases. However, it was also something the vast majority of PostGIS users would not run across:

  • If you were using the Geography type, and
  • If you were storing polygons in your Geography column, and
  • If you queried your column with another Polygon, and
  • If the query polygon was fully contained in one of the column polygons, then
  • The distance reported between the polygons would be non-zero (when it should be zero!)

It happens! It happened to Davy's team, and it happened to other folks (the ones who originally filed #2556) — I was actually working on the bug on the plane a couple weeks before. It was a tricky one to both find and to diagnose, because it was related to caching behaviour: you could not reproduce it using a query that returned a single record, it had to return more than one record.
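To see why a one-row query can hide a bug like this, here is a toy sketch in Python. It is not PostGIS code, and the real internals may well differ: 1-D intervals stand in for polygons, and the `CachedDistance` class, its "fast path", and the specific mistake in it (measuring boundary to boundary without a containment check) are all assumptions made up for illustration.

```python
# Toy model only: 1-D intervals (lo, hi) stand in for polygons.
# Every name here is hypothetical; none of this is PostGIS code.

def distance_uncached(a, b):
    """Correct distance: zero when the intervals touch, overlap, or nest."""
    gap = max(a[0], b[0]) - min(a[1], b[1])
    return max(0.0, float(gap))

def boundary_distance(a, b):
    """Smallest gap between the endpoints ("boundaries") of two intervals."""
    return float(min(abs(p - q) for p in a for q in b))

class CachedDistance:
    """Distance function that switches to a fast path on a repeated query."""

    def __init__(self):
        self._last_query = None

    def __call__(self, column_value, query):
        if query == self._last_query:
            # Same query shape seen again: take the "cached" fast path.
            # Assumed bug: it only compares boundaries and forgets that
            # one shape may sit entirely inside the other.
            return boundary_distance(column_value, query)
        self._last_query = query
        return distance_uncached(column_value, query)

def run_query(rows, query, dist):
    """Apply the distance function to every row, like a table scan."""
    return [dist(row, query) for row in rows]

query = (4, 6)               # fully inside both "column polygons" below
one_row = [(0, 10)]
two_rows = [(0, 10), (2, 8)]

print(run_query(one_row, query, CachedDistance()))   # [0.0]
print(run_query(two_rows, query, CachedDistance()))  # [0.0, 2.0]
```

With a single row the fast path never runs, so the answer is correct. Only the second row exercises the cached branch and returns a non-zero distance for a contained shape, which is why a reproduction in this toy needs a query that returns more than one record.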

If I was prickly about the report from Davy:

And pricklier still about the less nuanced report of her colleague Jerry:

That prickliness arose because, on the basis of a very particular and narrow (but real!) use case, they were tarring the whole release, which had been out and functioning perfectly well for thousands and thousands of users for months.

Also, I was feeling guilty for not addressing it earlier. PostGIS has gotten a lot bigger than me, and I don't even try to address raster or topology bugs, but in the vector space I take pride in knocking down real issues quickly. But this issue had dragged on for a couple months without resolution, despite the diligent sleuthing of Regina Obe, and a perfect reproduction case from "gekorob", the original reporter.

That's where I'm coming from.

I can also empathize with Jerry and others who ran across this issue. It's slippery, it would eat up a non-trivial amount of time isolating. Having had the time eaten, a normal emotional response would be "goddamn it, PostGIS, you've screwed me, I won't let you screw others!" Also, having eaten many of your personal hours, the bug would appear big not narrow, worthy of a broadcast condemnation, not a modest warning.

Anyways, that's the tempest and teapot. I'm going to finish my morning by putting this case into the regression suite, so it'll never recur again. That's the best part of fixing a bug in PostGIS, locking the door behind you so it can never come out again.

4 comments:

Jerry Sievert said...

Thanks Paul. We don't tend to "Blog First", but we do have a blog entry in the works as well, explaining the bug and how it affects us. I wasn't trying to be "prickly", but this is something major given the release of RDS bundling PostGIS 2.1.

We LOVE PostGIS, and are enthused to use it and be part of the community. I am sorry that my tweet came across as tarring it, it was definitely not meant to.

Paul Ramsey said...

Thank you Jerry, for the feedback and sticking with your sleuthing. (And if you can continue to look at _ST_DistanceUncached() I would be grateful!) One thing to note is that your words as Esri-PDX now have a lot (more) weight: Esri staff saying "don't use PostGIS on RDS" can affect a lot of opinions, you guys control the horizontal and the vertical in a lot of minds.

One of the reasons I worked late yesterday on this was just because... it was Esri publicly shaming me. You're holding a sledgehammer, watch where you swing it :)

Jerry Sievert said...

Paul,

Well noted, and thanks for following up! My post to the mailing list from my personal account was partly in an effort to not add more weight.

You also now know that there are people inside of Esri using, and trying to improve PostGIS. :)

Mateusz Loskot said...

Good reading, thanks Paul.

Retweeting can also easily earn one a jerk badge.
I hit the retweet button w/o any reflection a couple of times myself, spreading the whole misunderstanding.

Stories like this are good lessons and help to build a more conscious community.
