Tuesday, February 17, 2009

WKT Itch

Originally in GeoSpeil.

Eric Raymond says that open source software comes from developers “scratching their itch” — an unpleasant bodily metaphor for “solving their own problems” — a side effect of which is software being released and usable by everyone.

My current itch is attacking the “well-known text” (WKT) representation of spatial reference systems. In theory, an Open Geospatial Consortium standard spatial database like PostGIS stores spatial reference information in the SPATIAL_REF_SYS table, and the actual information about reference system parameters is serialized in a “well-known text” string, stored in the SRTEXT column of that table. In practice, what PostGIS really uses for coordinate transformations is a PROJ4TEXT string in a spare column of the table, and the SRTEXT is just window dressing — we carry it, but we don’t use it.

Using the WKT representation directly is attractive, because it drops needless duplication of information and allows more direct interoperation with things like ESRI “prj” files, which are themselves just WKT serializations. Unfortunately WKT is not as “well-known” as the name would have us believe. Every vendor has used slightly different naming for things like projection operations, parameters, datum names, and so on.

So my itch is multi-fold: I want to be able to parse WKT, I want to learn the technologies necessary to parse WKT (bison and flex), I want to be able to standardize WKT (to strip out the vendor-specific bits) and I want to be able to turn my parsed form into PROJ4 projection objects, because I’ll still be using the PROJ4 engine for transformations at the end of the day. I’m an itchy guy.
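To make the parsing goal concrete, here is a minimal sketch of a hand-written recursive-descent WKT reader. It is in Python, not the bison and flex grammar described above, and it is not the code from the PostGIS spike. The function names (`tokenize`, `parse`) and the nested `(KEYWORD, [children])` tuple layout are assumptions chosen for illustration.

```python
import re

# One token per match: quoted string, number, bare keyword, or punctuation.
TOKEN_RE = re.compile(r'''
    \s*(?:
        "(?P<string>[^"]*)"
      | (?P<number>[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)
      | (?P<keyword>[A-Za-z_][A-Za-z0-9_]*)
      | (?P<punct>[\[\](),])
    )''', re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) pairs for a WKT string."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        yield kind, match.group(kind)


def _parse_node(tokens, pos):
    """Parse one value starting at tokens[pos]; return (value, next_pos)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    kind, value = tokens[pos]
    if kind == "string":
        return value, pos + 1
    if kind == "number":
        return float(value), pos + 1
    if kind != "keyword":
        raise ValueError(f"unexpected token {value!r}")
    pos += 1
    # A bare keyword (for example an axis direction) has no bracketed children.
    if pos >= len(tokens) or tokens[pos] not in (("punct", "["), ("punct", "(")):
        return (value.upper(), []), pos
    # Some vendors write KEYWORD(...) instead of KEYWORD[...]; accept both.
    closer = "]" if tokens[pos][1] == "[" else ")"
    pos += 1
    children = []
    while True:
        child, pos = _parse_node(tokens, pos)
        children.append(child)
        if pos >= len(tokens):
            raise ValueError("unterminated bracket")
        separator = tokens[pos]
        pos += 1
        if separator == ("punct", closer):
            return (value.upper(), children), pos
        if separator != ("punct", ","):
            raise ValueError(f"expected ',' or {closer!r}, got {separator[1]!r}")


def parse(text):
    """Parse a WKT string into nested (KEYWORD, [children]) tuples."""
    tokens = list(tokenize(text))
    node, pos = _parse_node(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after the top-level node")
    return node
```

Calling `parse('PRIMEM["Greenwich",0]')` gives a tuple whose first element is the keyword and whose second is the list of children, so later steps (standardizing names, building projection objects) can walk the tree without touching the raw string again.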

So far, I have achieved the parsing and learning-how-to-parse goals, and placed my results in a spike in the PostGIS SVN repository. Next up is standardization, and finally creating PROJ4 objects. Then I’ll try to hook the whole thing into PostGIS.
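As a rough illustration of the last step, the sketch below turns an already-extracted projection name and parameter set into a PROJ4-style definition string. The two lookup tables are small, hand-picked assumptions covering a few common OGC-style names. They are nowhere near complete, they ignore datum, ellipsoid and unit handling entirely, and `to_proj4` is a hypothetical helper, not a PROJ4 or PostGIS function.

```python
# Illustrative subset only: OGC-style WKT names mapped to PROJ4 keywords.
PROJECTION_NAMES = {
    "Transverse_Mercator": "tmerc",
    "Mercator_1SP": "merc",
    "Lambert_Conformal_Conic_2SP": "lcc",
}

PARAMETER_NAMES = {
    "latitude_of_origin": "lat_0",
    "central_meridian": "lon_0",
    "scale_factor": "k_0",
    "false_easting": "x_0",
    "false_northing": "y_0",
    "standard_parallel_1": "lat_1",
    "standard_parallel_2": "lat_2",
}


def to_proj4(projection, parameters):
    """Build a PROJ4-style string from a WKT projection name and parameters.

    Raises KeyError for names missing from the tables above, which is
    where vendor-specific spellings would need standardizing first.
    """
    parts = ["+proj=" + PROJECTION_NAMES[projection]]
    for name, value in parameters.items():
        key = PARAMETER_NAMES[name.lower()]
        # .15g keeps full double precision without trailing zeros.
        parts.append(f"+{key}={format(value, '.15g')}")
    return " ".join(parts)


utm_like = {
    "latitude_of_origin": 0,
    "central_meridian": -123,
    "scale_factor": 0.9996,
    "false_easting": 500000,
    "false_northing": 0,
}
print(to_proj4("Transverse_Mercator", utm_like))
```

The `KeyError` on an unknown name is deliberate in this sketch: it marks the point where a vendor spelling has to be mapped to a standard one before conversion can proceed.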
 

2 comments:

jasonbirch.com said...

Hey Paul,

To a non-developer, the CSMap WKT parser seems fairly comprehensive, and the comments (in the header file anyway) can be humorous:

http://tinyurl.com/c3jbgf

Not sure if this is too tied to CSMap for your use, but may be worth looking at.

As an aside, Hugues went through an interesting exercise during the RFC process for switching MapGuide to CSMap. MapGuide relies on WKT internally, and it was possible that users not running MapGuide Studio had used strings that CSMap would not recognise. All of the Proj.4 WKT strings were parsed by the CSMap parser. Over half "round-tripped" to CSMap codes (like "LL84"), the majority of the rest were usable as "custom" codes without user intervention, and only 84 could not be converted into valid CS.

http://tinyurl.com/akb8k7

MapGuide's libraries to interface with CSMap are here:

http://tinyurl.com/cqlcbq

Jason

Paul Ramsey said...

@jason, if I wanted another C++ hair-ball dependency, I'd just use the OGR SRS class. :) That said, CSMap would have the advantage of at least swapping one dependency (PROJ4) for another (CSMap) rather than adding one.
