Not-so-free Client Libraries

While at the FOSS4G event this fall I had the opportunity to talk at length with Xavier Lopez, the product manager for Oracle Spatial.

He was at the event to talk up more open source use of Oracle’s products, and he asked me what Oracle could do to help open source, short of shovelling money at it.

“Make your client libraries freely redistributable,” I answered, citing the pains that Geotools (and by extension Geoserver and uDig) have to go to in supporting Oracle servers while not distributing the actual Oracle JDBC JAR files. Because we can’t include Oracle libraries with our builds, end users have to independently download the JARs and copy them into the right places themselves – it is not a user-friendly situation at all.

“But,” Xavier replied, “they are redistributable.”

This did not match my understanding of the situation, but he is the expert. So I said I would look into it and get back to him.

The source of our mutual misunderstanding turned out to be (surprise!) legalese. As Xavier said, the client JDBC JARs are freely redistributable… but there is a catch. You can only get the JARs in the first place by accepting a license, and the license says that you are allowed to redistribute them only if you do so under the same terms as those given in the original Oracle license.

Among the restrictions which any open source project would have to place on its users if they wanted to redistribute the Oracle client libraries are:

  • You are not a citizen, national, or resident of, and are not under control of, the government of Cuba, Iran, Sudan, Libya, North Korea, Syria, nor any country to which the United States has prohibited export.
  • You will not download or otherwise export or re-export the Programs, directly or indirectly, to the above mentioned countries nor to citizens, nationals or residents of those countries.
  • You are not listed on the United States Department of Treasury lists of Specially Designated Nationals, Specially Designated Terrorists, and Specially Designated Narcotic Traffickers, nor are you listed on the United States Department of Commerce Table of Denial Orders.
  • You will not download or otherwise export or re-export the Programs, directly or indirectly, to persons on the above mentioned lists.

To abide by the license, not only would we have to (somehow) ensure that none of the people downloading our builds met the conditions above, but we would also have to get them to promise in turn not to distribute to such people. Forcing the users to copy the JARs by hand starts to look a whole lot better all of a sudden.

(Stop. Take a moment to contemplate the lush absurdity of the idea that placing a piece of software behind such a click-through license materially assists the enforcement of export controls in any way. Breathe in. Breathe out. Ahhhh.)

Open source cannot place restrictions on use or redistribution. It is the ethos. It is the way things work. It is not because the restrictions in this case are oddball notions from the national security state. Any such restrictions are anathema. My favourite case of this was a piece of Geotools, which had to be re-written when it turned out that a component was licensed under a “public domain except for military use” clause. Everyone gets to download, use and redistribute open source. The US Department of Defense, Cuba, Peace Now, the Marine Corps, the lunatic down the street, even me.

A Busy Day

Had a great day today!

Started by talking up Victoria’s bid to host the FOSS4G conference in 2007, on an IRC meeting. Doing the budgeting and planning just to put together the bid has been very educational. It has been a good reminder of the value of working with professionals – the conference center staff and organizer I am working with pulled together the background information and base budget with magical ease. Everyone has their area of expertise, and it is sure nice to have experts on your side when entering a domain in which you know nothing at all!

Then I talked with Mark about pushing PostGIS 1.2.0 out the door, and we did! Curve support started, some performance hooks I’ve been wanting for a while, and function signatures to allow easy integration with other SQL/MM database software (this means you, ArcSDE!).

The performance stuff is fun, because I know there will be more – for every generic spatial predicate (Contains(), for example), there are lots of special cases that don’t have to be handled with full computation. If the bounding boxes of the inputs are disjoint, for example, you know Contains() is false without even starting anything involved. If both arguments are points – false, unless they are the very same point! If one argument is a point and the other is a polygon, you can do a point-in-polygon test instead of the full graph calculation needed for two more complex types. And on and on and on.
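
To make the idea concrete, here is a rough sketch of that kind of short-circuiting. It is written in Python purely for illustration (PostGIS itself is C), and the geometry representation and every function name are my own assumptions, not the PostGIS implementation. Two points are treated as contained only when their coordinates are identical, and the point-in-polygon test is a plain ray cast over a single ring with no holes.

```python
# Toy geometries: ("point", (x, y)) or ("polygon", [(x, y), ...]).
# Hypothetical representation for illustration only, not PostGIS internals.

def bbox(geom):
    """Return (minx, miny, maxx, maxy) for a toy geometry."""
    kind, coords = geom
    pts = [coords] if kind == "point" else coords
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_disjoint(a, b):
    """True when two (minx, miny, maxx, maxy) boxes do not touch or overlap."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


def point_in_ring(pt, ring):
    """Ray casting: toggle 'inside' at each edge crossed by a ray heading +x.

    Points exactly on the boundary are not handled rigorously here.
    """
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contains(a, b):
    """Does a contain b? Try the cheap answers before the expensive one."""
    # Cheapest test first: disjoint boxes can never satisfy Contains().
    if bboxes_disjoint(bbox(a), bbox(b)):
        return False
    kinds = (a[0], b[0])
    if kinds == ("point", "point"):
        return a[1] == b[1]  # simple coordinate comparison
    if kinds == ("polygon", "point"):
        return point_in_ring(b[1], a[1])
    # Everything else would need the full (expensive) computation.
    raise NotImplementedError("fall back to the full computation here")


triangle = ("polygon", [(0, 0), (4, 0), (0, 4)])
print(contains(triangle, ("point", (1, 1))))    # inside the triangle
print(contains(triangle, ("point", (3, 3))))    # inside the box, outside the triangle
print(contains(triangle, ("point", (10, 10))))  # rejected by the box test alone
```

The ordering is the whole point: each test is cheaper than the one after it, so most calls never reach the expensive fallback.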

Finally, I did a “Friday talk” about the history of the FME, my favourite non-open source GIS tool. Friday talks are something we try to do every week at Refractions – have someone talk about a topic of interest to them that hopefully has some relevance to other folks in the company. A programming technique, a project summary, a quirk of geography. Grist for the mill. Even in a company of fewer than 30, cross-pollination of ideas needs to be promoted; it doesn’t happen on its own the way it does in a company of fewer than 10.

PostGIS for SDE

One of the interesting nuggets to come out of the ESRI User Conference this year was the news that ESRI was going to support ArcSDE on PostgreSQL “sometime soon”. Which, to PostGIS people like ourselves, suggests the question: “implemented how?”

  • One possibility would be basically a cut’n’paste of their existing SQLServer code, with the SQLServer quirks replaced with PostgreSQL quirks, using SDEBINARY as the spatial type.
  • Another possibility would be to use the PostGIS spatial objects as the underlying storage mechanism, in the same way ArcSDE supports using SDO_GEOMETRY in Oracle.
  • A third possibility would be ESRI implementing their own spatial type in PostgreSQL and then using that.

Sounds strange, doesn’t it? Writing a whole new spatial type, when one already exists. Ordinarily I would dismiss the idea – except that ESRI has already done it for Oracle!

The ST_GEOMETRY type in ArcSDE 9.1 and up is a native Oracle type (built using the Oracle type-extension mechanism) provided, and recommended, by ESRI for use by ArcSDE.

Why would ESRI do this?

The cynical explanation (get this out of the way first) is that it helps break the growing Oracle momentum in tools supporting SDO_GEOMETRY, and confuses the marketplace further about what the “right type” to use is in Oracle for spatial work.

The practical explanation is that ESRI’s ST_GEOMETRY for Oracle implements the same semantics and function signatures as the ST_GEOMETRY objects in DB2 and Informix (coincidentally, also implemented in part by ESRI). This allows ArcSDE to expose a uniform “raw spatial SQL” to clients while still maintaining its position as the man-in-the-middle of client/server interaction. Adding ST_GEOMETRY further reinforces the “database neutral” aspect of ArcSDE by allowing spatial SQL without exposing the differences between the SDO_GEOMETRY function signatures and the ST_GEOMETRY ones.

So where does that leave PostGIS? Removing the practical excuses for not using PostGIS as the underlying geometry type as fast as possible. We have looked up the function signatures used by ArcSDE and implemented them for the 1.1.7 release.

If anyone on the ArcSDE team reads this and wants to talk about what else is needed to make PostGIS the default geometry type for ArcSDE-on-PostgreSQL, get in touch. We aim to please.

Can WFS Really Work?

Of all the standards that have come out of the OGC in the last few years, few have had the promise of the Web Feature Server standard.

  • View and edit features over the web
  • Client independent
  • Server independent
  • Format independent
  • Database independent

What is not to like? Nothing!

One of the promises of uDig is to be an “internet GIS”, by which we mean a thick client system capable of consuming and integrating web services in a transparent and low-friction way. The GIS equivalent of a web browser. Web browsers use HTTP and HTML and CSS and Javascript to create a rich and compelling client/server interaction, regardless of the client/server pairing. An internet GIS should use WMS and WFS and SLD to do the same thing, independent of vendor.

So, we have been working long and hard on a true WFS client, one that can connect to any WFS and read/write the features therein without modification. And here’s the thing – it is waaaaaaay harder than it should be.

Here is why:

  • First off, generic GML support is hard. Every WFS exposes its own schema which in turn embeds GML, so a “GML parser” is actually a “generic XML parser that happens to also notice embedded GML”, and the client has to be able to turn whatever odd feature collection the server exposes into its internal model to actually render and process it. However, it is only a hard problem, not an impossible one, and we have solved it.
  • The solution to supporting generic GML is to read the schema advertised by the WFS, and use that to build a parser for the document on the fly. And this is where things get even harder: lots of servers advertise schemas that differ from the instance documents they actually produce.

    • The difference between schema and instance probably traces back to point #1 above. Because GML and XML schema are “hard”, the developers make minor mistakes, and because there have not been generic clients around to heavily test the servers, the mistakes get out into the wild as deployed services.

So, once you have cracked the GML parsing problem (congratulations!) you run headlong into the next problem. Many of the servers have bugs and don’t obey the schema/instance contract – they do not serve up the GML that they say they do.
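
A toy illustration of what a schema/instance mismatch looks like, using only the Python standard library. The schema, the feature, the `http://example.com/ex` namespace and the function names are all invented for this sketch; a real WFS schema extends GML types and pulls in imports that this deliberately ignores. The check simply compares the element names a type declares with the element names a feature actually carries.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical, heavily simplified schema a server might advertise.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://example.com/ex" targetNamespace="http://example.com/ex">
  <xs:complexType name="RoadType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="lanes" type="xs:int"/>
      <xs:element name="geom" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

# Hypothetical feature the same server actually returns: note "LANES".
FEATURE = """<ex:Road xmlns:ex="http://example.com/ex">
  <ex:name>Main St</ex:name>
  <ex:LANES>2</ex:LANES>
  <ex:geom>LINESTRING(0 0, 1 1)</ex:geom>
</ex:Road>"""


def declared_names(schema_xml, type_name):
    """Element names declared inside the named complexType, in order."""
    root = ET.fromstring(schema_xml)
    for ctype in root.iter(XS + "complexType"):
        if ctype.get("name") == type_name:
            return [el.get("name") for el in ctype.iter(XS + "element")]
    raise KeyError(type_name)


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def mismatches(schema_xml, type_name, feature_xml):
    """Return (unexpected, missing) element names for one feature."""
    declared = declared_names(schema_xml, type_name)
    found = [local_name(child.tag) for child in ET.fromstring(feature_xml)]
    unexpected = [name for name in found if name not in declared]
    missing = [name for name in declared if name not in found]
    return unexpected, missing


print(mismatches(SCHEMA, "RoadType", FEATURE))
```

A parser built strictly from the advertised schema would choke on (or silently drop) the `LANES` element, and that is exactly the kind of thing a client ends up special-casing.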

And now, if you aren’t just building a university research project, you have a difficult decision. If you want to interoperate with the existing servers, you have to code exceptions around all the previously-deployed bugs.

Unfortunately, our much loved UMN Mapserver is both (a) one of the most widely deployed WFS programs and (b) the one with the most cases of schema/instance mismatch. Mapserver is not the only law-breaker though; we have found breakages even in proprietary products that passed the CITE tests.

All this before you even start editing features!

The relative complexity of WFS (compared to, say, WMS) means that the scope of ways implementors can “get it wrong” is much much wider, which in turn radically widens the field of “special cases to handle” that any client must write.

In some ways, this situation evokes the good old days of web browsers, when HTML purists argued that when encountering illegal HTML (like an unclosed tag) browsers should stop and spit up an error, while the browser writers themselves just powered through and tried to do a “best rendering” based on whatever crap HTML they happened to be provided with.

Flame Bait

Why end the evening on a high note, when I can end it rancorously and full of bile!

On the postgis-users mailing list, Stephen Woodbridge writes:

Can you describe what dynamic segmentation is? What is the algorithm? I guess I can google for it …

As with many things, the terminological environment has been muddied by the conflation of specific ESRI terms for particular features with generic terms for similar things. Call it the “Chesterfield effect”.

  • ESRI “Dynamic segmentation” is really just “linear referencing of vectors and associated attributes”.
  • ESRI “Geodatabase” is “a database with a bunch of extra tables defined by and understood almost exclusively by ESRI”
  • ESRI “Coverage” is a “vector topology that covers an area” (ever wonder why the OGC Web Coverage Server specification is about delivering raster data, not vector topologies? because most people have a different understanding of the word than us GIS brainwashees).
  • ESRI “Topology” is a “middleware enforcement of spatial relationship rules”
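
Since the question was “what is the algorithm?”, here is a minimal sketch of the generic idea behind the first item: find the point at a given distance (a “measure”) along a line, and look up attributes stored against ranges of measures rather than against separate geometries. It is Python for illustration only; the function names and the little surface table are hypothetical, and this is neither ESRI’s nor PostGIS’s implementation.

```python
import math


def locate_along(line, measure):
    """Return the (x, y) point `measure` units from the start of `line`.

    `line` is a list of (x, y) vertices; distances are planar.
    """
    if measure < 0:
        raise ValueError("measure must be non-negative")
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if seg > 0 and travelled + seg >= measure:
            # The target falls on this segment: interpolate along it.
            t = (measure - travelled) / seg
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        travelled += seg
    raise ValueError("measure is beyond the end of the line")


def attribute_at(events, measure):
    """Look up the attribute whose [start, end) measure range holds `measure`."""
    for start, end, value in events:
        if start <= measure < end:
            return value
    return None


# One road geometry, with surface type stored by measure range (made-up data).
road = [(0, 0), (3, 0), (3, 4)]
surface = [(0, 3, "gravel"), (3, 7, "paved")]

print(locate_along(road, 5))     # a point part-way up the second segment
print(attribute_at(surface, 5))  # the surface type at that measure
```

The “dynamic” part is just that the road is never physically split where the surface changes; the segments are derived on demand from the measures.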

ESRI rules the intellectual world of GIS people so thoroughly that they define the very limits of the possible. Just last week someone told me “oh, editing features over the web? the only way to do that is with ArcServer”.

The only way, and said with complete certainty. You don’t want to argue with people like that; it seems almost rude, like arguing with people about religion.