A Busy Day

Had a great day today!

Started by talking up Victoria’s bid to host the FOSS4G conference in 2007, on an IRC meeting. Doing the budgeting and planning just to put together the bid has been very educational. It has been a good reminder of the value of working with professionals – the conference center staff and organizer I am working with pulled together the background information and base budget with magical ease. Everyone has their area of expertise, and it sure is nice to have experts on your side when entering a domain in which you know nothing at all!

Then I talked with Mark about pushing PostGIS 1.2.0 out the door, and we did! Curve support started, some performance hooks I’ve been wanting for a while, and function signatures to allow easy integration with other SQL/MM database software (this means you, ArcSDE!).

The performance stuff is fun, because I know there will be more – for every generic spatial predicate (Contains(), for example), there are lots of special cases that don’t have to be handled with full computation. If the bounding boxes of the inputs are disjoint, for example, you know Contains() is false without even starting anything involved. If both arguments are points, the answer is a simple coordinate comparison: false unless the two points are identical. If one argument is a point and the other is a polygon, you can do a point-in-polygon test instead of the full graph calculation needed for two more complex types. And on and on and on.
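To make the short-circuit idea concrete, here is a minimal Python sketch. It is not the PostGIS code; the function names are made up for illustration, the polygon is assumed to be a single simple ring with no holes, and points lying exactly on the boundary are not treated rigorously.

```python
def bbox(coords):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) pairs."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_disjoint(a, b):
    """True when two bounding boxes share no point at all."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


def point_in_ring(pt, ring):
    """Ray-casting test: count edge crossings of a ray running in +x from pt."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the ray's y value can cross it.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_contains_point(ring, pt):
    """Hypothetical fast path for Contains(polygon, point)."""
    # Cheapest check first: disjoint boxes mean the answer is already known.
    if bboxes_disjoint(bbox(ring), bbox([pt])):
        return False
    # Otherwise a point-in-polygon test replaces the full graph calculation.
    return point_in_ring(pt, ring)


triangle = [(0, 0), (4, 0), (0, 4)]
print(polygon_contains_point(triangle, (1, 1)))  # inside the triangle
print(polygon_contains_point(triangle, (9, 9)))  # rejected by the box test
print(polygon_contains_point(triangle, (3, 3)))  # inside the box, outside the triangle
```

The ordering is the point: the box comparison costs a handful of float comparisons, so it pays to run it before anything that walks the vertices.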

Finally, I did a “Friday talk” about the history of the FME, my favourite non-open source GIS tool. Friday talks are something we try to do every week at Refractions – have someone talk about a topic of interest to them that hopefully has some relevance to other folks in the company. A programming technique, a project summary, a quirk of geography. Grist for the mill. Even in a company of fewer than 30, cross-pollination of ideas needs to be promoted; it doesn’t happen on its own the way it does in a company of fewer than 10.

PostGIS for SDE

One of the interesting nuggets to come out of the ESRI User Conference this year was the news that ESRI was going to support ArcSDE on PostgreSQL “sometime soon”. Which, to PostGIS people like ourselves, suggests the question: “implemented how?”

  • One possibility would be basically a cut’n’paste of their existing SQLServer code, with the SQLServer quirks replaced with PostgreSQL quirks, using SDEBINARY as the spatial type.
  • Another possibility would be to use the PostGIS spatial objects as the underlying storage mechanism, in the same way ArcSDE supports using SDO_GEOMETRY in Oracle.
  • A third possibility would be ESRI implementing their own spatial type in PostgreSQL and then using that.

Sounds strange, doesn’t it? Writing a whole new spatial type, when one already exists. Ordinarily I would dismiss the idea – except that ESRI has already done it for Oracle!

The ST_GEOMETRY type in ArcSDE 9.1 and up is a native Oracle type (built using the Oracle type-extension mechanism) provided, and recommended, by ESRI for use by ArcSDE.

Why would ESRI do this?

The cynical explanation (get this out of the way first) is that it helps break the growing Oracle momentum in tools supporting SDO_GEOMETRY, and confuses the marketplace further about what the “right type” to use is in Oracle for spatial work.

The practical explanation is that ESRI’s ST_GEOMETRY for Oracle implements the same semantics and function signatures as the ST_GEOMETRY objects in DB2 and Informix (coincidentally, also implemented in part by ESRI). This allows ArcSDE to expose a uniform “raw spatial SQL” to clients while still maintaining its position as the man-in-the-middle of client/server interaction. Adding ST_GEOMETRY further reinforces the “database neutral” aspect of ArcSDE by allowing spatial SQL without exposing the differences between the SDO_GEOMETRY function signatures and the ST_GEOMETRY ones.

So where does that leave PostGIS? Removing the practical excuses for not using PostGIS as the underlying geometry type as fast as possible. We have looked up the function signatures used by ArcSDE and implemented them for the 1.1.7 release.

If anyone on the ArcSDE team reads this and wants to talk about what else is needed to make PostGIS the default geometry type for ArcSDE-on-PostgreSQL, get in touch. We aim to please.

Can WFS Really Work?

Of all the standards that have come out of the OGC in the last few years, few have had the promise of the Web Feature Server standard.

  • View and edit features over the web
  • Client independent
  • Server independent
  • Format independent
  • Database independent

What is not to like? Nothing!

One of the promises of uDig is to be an “internet GIS”, by which we mean a thick client system capable of consuming and integrating web services in a transparent and low-friction way. The GIS equivalent of a web browser. Web browsers use HTTP and HTML and CSS and Javascript to create a rich and compelling client/server interaction, regardless of the client/server pairing. An internet GIS should use WMS and WFS and SLD to do the same thing, independent of vendor.

So, we have been working long and hard on a true WFS client, one that can connect to any WFS and read/write the features therein without modification. And here’s the thing – it is waaaaaaay harder than it should be.

Here is why:

  • First off, generic GML support is hard. Every WFS exposes its own schema which in turn embeds GML, so a “GML parser” is actually a “generic XML parser that happens to also notice embedded GML”, and the client has to be able to turn whatever odd feature collection the server exposes into its internal model to actually render and process it. However, it is only a hard problem, not an impossible one, and we have solved it.
  • The solution to supporting generic GML is to read the schema advertised by the WFS, and use that to build a parser for the document on the fly. And this is where things get even harder: lots of servers advertise schemas that differ from the instance documents they actually produce.

    • The difference between schema and instance probably traces back to point #1 above. Because GML and XML schema are “hard”, the developers make minor mistakes, and because there have not been generic clients around to heavily test the servers, the mistakes get out into the wild as deployed services.

So, once you have cracked the GML parsing problem (congratulations!) you run headlong into the next problem. Many of the servers have bugs and don’t obey the schema/instance contract – they do not serve up the GML that they say they do.
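As a rough illustration of what a schema/instance mismatch looks like from the client side, here is a small Python sketch. The schema and feature below are invented, and the check is deliberately naive: it only compares element names, ignoring namespaces, types and ordering, all of which a real client would also have to validate.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema, standing in for what a server advertises.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="road" type="roadType"/>
  <xsd:complexType name="roadType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="lanes" type="xsd:integer"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""

# Hypothetical feature, standing in for what the same server actually returns.
FEATURE = "<road><name>Main St</name><LANES>2</LANES></road>"


def advertised_names(schema_xml):
    """Collect every element name declared anywhere in the schema."""
    root = ET.fromstring(schema_xml)
    elements = root.iter("{%s}element" % XSD_NS)
    return {el.get("name") for el in elements if el.get("name")}


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def undeclared_properties(schema_xml, feature_xml):
    """Names of feature properties that the schema never mentions."""
    declared = advertised_names(schema_xml)
    feature = ET.fromstring(feature_xml)
    return sorted({local_name(child.tag) for child in feature} - declared)


print(undeclared_properties(SCHEMA, FEATURE))
```

Here the server promised a "lanes" property and delivered "LANES". A parser built strictly from the schema either rejects the feature or silently drops the value, which is exactly the kind of special case a client ends up coding around.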

And now, if you aren’t just building a university research project, you have a difficult decision. If you want to interoperate with the existing servers, you have to code exceptions around all the previously-deployed bugs.

Unfortunately, our much loved UMN Mapserver is both (a) one of the most widely deployed WFS programs and (b) the one with the most cases of schema/instance mismatch. Mapserver is not the only law-breaker, though: we have found breakages even in proprietary products that passed the CITE tests.

All this before you even start editing features!

The relative complexity of WFS (compared to, say, WMS) means that the scope of ways implementors can “get it wrong” is much much wider, which in turn radically widens the field of “special cases to handle” that any client must write.

In some ways, this situation evokes the good old days of web browsers, when HTML purists argued that when encountering illegal HTML (like an unclosed tag) browsers should stop and spit up an error, while the browser writers themselves just powered through and tried to do a “best rendering” based on whatever crap HTML they happened to be provided with.

Flame Bait

Why end the evening on a high note, when I can end it rancorously and full of bile?

On the postgis-users mailing list, Stephen Woodbridge writes:

Can you describe what dynamic segmentation is? What is the algorithm? I guess I can google for it …

As with many things, the terminological environment has been muddied by the conflation of specific ESRI terms for particular features with generic terms for similar things. Call it the “Chesterfield effect”.

  • ESRI “Dynamic segmentation” is really just “linear referencing of vectors and associated attributes”.
  • ESRI “Geodatabase” is “a database with a bunch of extra tables defined by and understood almost exclusively by ESRI”
  • ESRI “Coverage” is a “vector topology that covers an area” (ever wonder why the OGC Web Coverage Server specification is about delivering raster data, not vector topologies? because most people have a different understanding of the word than us GIS brainwashees).
  • ESRI “Topology” is a “middleware enforcement of spatial relationship rules”
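Since the question was “what is the algorithm?”, here is the core of linear referencing as a minimal Python sketch: walk the line, find the segment that contains the requested measure, and interpolate. The function names and the event-table layout are invented for illustration, and measures are assumed to be plain distances from the start of the line.

```python
import math


def locate_along(line, measure):
    """Return the (x, y) point at distance `measure` from the start of `line`."""
    if measure < 0:
        raise ValueError("measure must not be negative")
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if seg > 0 and travelled + seg >= measure:
            # The measure falls on this segment: interpolate along it.
            t = (measure - travelled) / seg
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        travelled += seg
    raise ValueError("measure is beyond the end of the line")


def attributes_at(events, measure):
    """Look up attributes stored against (from_measure, to_measure) ranges."""
    return [attrs for start, end, attrs in events if start <= measure <= end]


# Hypothetical road and event table: the geometry is never split, the
# attributes simply refer to stretches of it by measure.
road = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
events = [(0.0, 12.0, "paved"), (12.0, 20.0, "gravel")]

print(locate_along(road, 15.0))
print(attributes_at(events, 15.0))
```

The “dynamic” part is just that the pieces are computed on demand from the measures, rather than stored as separately chopped-up geometries.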

ESRI rules the intellectual world of GIS people so thoroughly that they define the very limits of the possible. Just last week someone told me “oh, editing features over the web? the only way to do that is with ArcServer”.

The only way, and said with complete certainty. You don’t want to argue with people like that, it seems almost rude, like arguing with people about religion.

Tiles Tiles Tiles

One of the oddball tasks I came home from the FOSS4G conference with was the job of writing the first draft of a tiling specification. My particular remit was to do a server capable of handling arbitrary projections and scale sets, which made for an interesting design decision: to extend WMS or not?

I mulled it over at the conference, and talked to some of the luminaries like Paul Spencer and Allan Doyle. My concern was that the amount of alteration required to WMS in order to support the arbitrary projections and scales was such that there was not much benefit remaining in using the WMS standard in the first place – existing servers wouldn’t be able to implement it, and existing clients wouldn’t be able to benefit.

On top of that, a number of the client writers wanted something a little more “tiley” in their specification than WMS. Rather than requests in coordinate space, they wanted requests in tile space: “give me tile [4,5]!”
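The difference between the two styles of request is small but real. A minimal Python sketch of the mapping from tile space back to coordinate space follows; the function name and parameters are illustrative and not taken from any specification, and it assumes tiles are counted up and to the right from a lower-left origin with square pixels.

```python
def tile_bounds(x, y, origin, units_per_pixel, tile_size=(256, 256)):
    """Bounding box (minx, miny, maxx, maxy) covered by tile [x, y].

    Assumes tile [0, 0] has its lower-left corner at `origin` and that
    every tile is `tile_size` pixels at `units_per_pixel` map units each.
    """
    origin_x, origin_y = origin
    width = tile_size[0] * units_per_pixel
    height = tile_size[1] * units_per_pixel
    minx = origin_x + x * width
    miny = origin_y + y * height
    return (minx, miny, minx + width, miny + height)


# "Give me tile [4,5]!" on a hypothetical lon/lat scale level where one
# 256-pixel tile spans 11.25 degrees.
print(tile_bounds(4, 5, (-180.0, -90.0), 0.0439453125))
```

With a coordinate-space request the client does this arithmetic itself and sends the resulting box; with a tile-space request it only sends the two indices, and the server can go straight to a pre-rendered tile.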

So, I originally set off to write either a GetTile in WMS or a Tile Server using the Open Web Services baseline from the Open Geospatial Consortium.

But then I had an Intellectual Experience, which came from reading Sean Gillies’ blog on REST web services, and his thoughts on how Web Feature Server (WFS) could have been implemented more attractively as a REST interface. I was drawn in by the Abstract Beauty of the whole concept.

So I threw away the half-page of OWS boiler-plate I had started with and began anew, thinking about the tiling problem as a problem of exposing “resources” à la REST.

The result is the Tile Map Service specification, and no, it is not really all that RESTful. That’s because tiles themselves are really boring resources, and completely cataloguing a discovery path from root resource to individual tile would add a lot of cruft to the specification that client writers would never use. So I didn’t.

That was the general guiding principle I tried to apply during the process – what information can client writers actually use? Rather than writing for an abstract entity, I tried to think of the poor schmuck who would have to write a client for the thing and aim the content at him.

I have put up a reference server at http://mapserver.refractions.net/cgi-bin/tms and there are other servers referenced in the document. My colleague Jody Garnett is working on a client implementation in Java for the GeoTools library, for exposure in the uDig interface. Folks from OpenLayers and WorldKit have already built reference clients. It has been great fun!