Open Source GIS Fights the Three-Horned Monster [2002]

This article (PDF) was published in the August 2002 edition of GeoWorld. I came across my hard copy while cleaning my desk and thought it might be of historical interest.

They say you can have your software good, cheap or soon, but you can’t have all three. Information technology (IT) project managers have assumed since the dawn of microchips that any improvement in one measure of software quality must inevitably be accompanied by a reduction in others.

Last year, Ross Searle of the Department of Natural Resources and Mines, Queensland, Australia, faced this “three-horned” IT dilemma. He had a problem, and the solution had to be cheap, quick and good. Searle wanted to create an online permitting application that allowed resource officers in his department to quickly evaluate the environmental consequences of tree-clearing permits.

“In Queensland, our state government has legislation controlling the clearing of trees,” says Searle. “If a landholder wishes to clear trees, then he or she has to apply for a permit. A permit will only be issued if the clearing does not cause environmental degradation. The state government has the role of assessing the permits. To do this, officers need access to a broad range of datasets all generally held in GIS form.”

The three-horned dilemma loomed. The application had to be cheap—budgets for the department were shrinking, no discretionary funding was available, and all the existing licenses for proprietary Web mapping software were tied up in the departmental head office. The application had to be good—data volumes were huge, encompassing several spatial coverages of more than 500MB apiece, so lightweight solutions weren’t going to work. The application had to be ready soon—Searle didn’t have the time or money to program a complex system from scratch.

Searle slew his three-horned dilemma with a combination of “open-source” tools, using the University of Minnesota (UMN) MapServer to provide Web mapping capabilities and PostGIS/PostgreSQL as the spatial database backend.

“We are using PostGIS to deliver large amounts of natural resource information via a MapServer interface,” adds Searle. “The MapServer/PostGIS application allows users to quickly search for a parcel of land and bring up all the relevant information in a standard format.

“Apart from the cost factors, I believe that the open-source software for this particular purpose is every bit as good—if not better—than the solutions offered by commercial vendors. We have found that the developers of open-source software are responsive to bugs/suggestions/inquiries, etc., more than commercial vendors. In fact, one of the biggest problems we have found in using open source is keeping up with all the improvements.”

What’s Open Source?

Unlike “freeware” or “shareware,” open-source software provides users with more than just a program and some documentation. As defined by the Open Source Initiative, open-source software “must be distributed under a license that guarantees the right to read, redistribute, modify and use the software freely.”

DM Solutions used MapServer to create GIS-based Web sites for applications such as cancer research (left) and finding hiking trails (right).

Open-source programs are distributed along with their “source code,” i.e., the programming instructions that control how the software works. Using open-source software is like eating at a restaurant where the recipes are served alongside the meals—you can simply enjoy the food, but you also have the option of taking the recipe home, changing the seasonings and serving the result to your friends.

Successful open-source projects attract developers interested in improving the software. Sometimes their motives are personal, but often they’re professional—the software helps solve a problem, and improvements to the software make doing their job easier. Through time, success breeds success. The projects with the most development activity attract more developers and become more active, improving and adding features at a rapid rate.

After users become accustomed to having complete access to the inner workings of the software they use, proprietary software begins to feel a little limiting, even unnatural. Bob Young, co-founder of the successful open-source company Red Hat, likes to compare purchasing proprietary software to “buying a car with the hood welded shut.”

“We demand the ability to open the hood of our cars, because it gives us, the consumer, control over the product we’ve bought and takes it away from the vendor,” notes Young.

In Young’s view, the software market should be one in which consumers don’t purchase software per se, but instead purchase whatever services they need to effectively use the software they choose.

Rather than purchase a proprietary database system and then purchase support from the proprietary database company, customers instead choose an open-source database and purchase support from an array of support companies with expertise in the chosen database. The net effect is the same—customers have functioning and supported products—but the balance of power is shifted in favor of customers.

An Open-Source Economy

In a healthy open-source economy, every successful open-source software project should have an accompanying set of companies prepared to offer support and consulting to customers who choose to implement systems with the software. DM Solutions Group, for example, is one of the companies supporting the UMN MapServer—the open-source Web mapping application Searle used to implement his online permitting application. DM Solutions started as a traditional systems integrator, providing consulting services that implement proprietary software packages.

“We were frustrated with the fact that we were dealing with ‘closed boxes’ that magically did all the work for us,” says Dave McIlhagga, president of DM Solutions. “If it didn’t do it the way we wanted it to, we couldn’t change it or would have to depend and wait on a third party to take care of any problems.

“Now that we are using open-source software, we’re in full control of the situation and can offer not only consulting services, but also free and open software to base it on. We then can guarantee that if there are any problems in the base software, we can fix them. The word ‘workaround’ no longer is part of our vocabulary. If it doesn’t work, we fix it.”

In addition to providing MapServer consulting services, DM Solutions soon became actively involved in MapServer development, adding new features like OpenGIS Web Map Server, Macromedia Flash and GML support. McIlhagga notes that effort spent on development actually promotes consulting skills, demonstrating that “we are the industry leaders in use of the product, have a high level of expertise and can therefore offer a premium service to our clients.”

A new open-source map-building tool, MapLab, was created by DM Solutions using MapServer

Refractions Research occupies a similar position with respect to PostGIS/PostgreSQL, Searle’s other key application component. As the original developers and current maintainers of PostGIS, Refractions Research occupies a market niche and can offer special expertise to the growing community of PostGIS users. Dave Blasby, principal developer of PostGIS, wonders where it’s all leading.

“If you had told me last year that we would be working for clients in Germany, Florida, Montreal and Los Angeles during the next 12 months, I would have thought you were crazy,” says Blasby.

An Unlocked Hood

Products like UMN MapServer and PostGIS/PostgreSQL gain leverage by building on the prior efforts of other open-source projects. For example, by building on the capabilities of PostgreSQL, PostGIS gains all the strengths of an existing industrial-strength database: transactional integrity, write-ahead logs, Structured Query Language (SQL) and standard application programming interfaces.

Similarly, MapServer garners much of its GIS file format compatibility by using OGR and Geospatial Data Abstraction Library (GDAL) file format libraries. Shared code such as software libraries is extremely important to open-source projects, because it allows all projects to improve together in lockstep as enhancements are made to base libraries.

The author of the OGR and GDAL libraries is Frank Warmerdam, an independent contractor in Ontario, Canada. Warmerdam provides customizations of his many geospatial libraries to software companies and system integrators, who bundle the libraries in their products.

“In many cases, the clients gain substantial leverage from building on an existing open-source library, only needing to pay for the specific improvements they require,” explains Warmerdam. “Clients funding initial work on libraries often gain from testing and improvements provided by later users.”

Ironically, Warmerdam’s open-source TIFF image-format library is the basis for TIFF support in several well-known proprietary GIS products as well as many open-source projects. Even proprietary software vendors can take advantage of what open-source libraries provide.

Warmerdam has been working in the software industry for many years and says one of the reasons he now works mainly on open-source contracts “is the sense that I’m building something that will outlast the commercial decisions and market success of any one company.”

Future Solutions

As the foundation of several open-source projects, Warmerdam’s libraries will be used for many years. The sheer quantity of open-source GIS projects available can be appreciated by browsing the entries at FreeGIS.org, a clearinghouse Web site for project information.

Unlike Linus Torvalds, the author of Linux, open-source GIS promoters don’t talk of “world domination”—not even in jest as Torvalds did. Instead, they point to the flexibility that’s the hallmark of open-source software and predict increasing ubiquity behind the scenes.

Dave McIlhagga points to MapServer as an example of open-source GIS infrastructure with strong momentum, and notes the importance of providing a viable alternative to the status quo.

“There’s a reason why MapServer’s user base has been growing,” notes McIlhagga. “With the availability of user-friendly tools complementing some technically robust technology, there’s reason to believe it can play a substantial role in this business. If open-source alternatives can continue to improve and at least keep commercial vendors honest, they will be successful.”

Regardless of whether open-source GIS ever lands on desktops and workstations, it likely will play an increasing role in meeting the specialized needs of the GIS community. Wherever people have problems to solve and a willingness to share their solutions with others, open source will continue to flourish.

A PostGIS/MapServer parcel-status application was created by Ross Searle and the Department of Natural Resources and Mines, Queensland, Australia.

BC IT Outsourcing 2020/21

Public accounts came out at the end of summer, so I have finally gotten around to entering in the central government data (the Health Authorities put out their reports much later) and the results show … the first decline since 2015?

Historically there have been growth pauses every 5 years or so, but the last two were associated with brief phases of government austerity, though that may have been a mirage, since the other thing the pauses all have in common is IBM taking it on the chin.

The loss of a book of Ministry of Health business (twice), then the loss of the desktop support contract, seem to tell the tale of IBM’s decline as an outsourcing juggernaut.

Maximus remains curiously resilient, even as the MSP program phases out.

And the big shark, the back-office systems contract, which has migrated through a series of corporate names, from EDS (does anyone remember Ross Perot?) to HP Advanced Solutions to ESIT Advanced Solutions is now just … “Advanced Solutions, a DXC Technology Company”.

Looking over the sweep of the curve (which is now 23 years long!) you can see a bunch of interesting moments:

  • The initial major privatization around 2005 under the Campbell Liberals, which starts off the whole process.
  • The entry of Maximus into the Ministry of Health at the same time, in 2005, which may help explain some of the decline of IBM, as a more nimble competitor gobbles up prospects.
  • The entry of Deloitte in 2011, for the Social Services ICM project, which leads to solid growth, but is also the high-water mark, as the planned successor project in Natural Resources around 2015 ends up going more to CGI than Deloitte.

What might the future hold? I have no idea, but unless and until the major backend contract held by “Advanced Solutions” changes (was supposed to run out this year?) I expect the total outsourcing spend to remain in the “holy cow that is a lot of money” range.

Coda: It’s waaaay too early to make any serious statements, but one thing I did notice eyeballing the vendors chart is that while most vendors have been treading water since the NDP took government in 2017, one “vendor” that has notched up 50% growth is my aggregate “local vendors” vendor, which I batch all the smaller companies into so they show up on the chart.

Since one of the NDP platform promises in 2017 was a preference to local companies, this could be a sign of progress. Or it could be random! But local companies have gone from about $50M/yr to $75M/yr since 2017, and that’s not nothing. Something to keep an eye on for next year.

GeoMob Podcast - PostGIS

I neglected to post about this at the time, which I guess is a testament to the power of twitter to suck up energy that might otherwise be used in blogging, but for posterity I am going to call out here:

Have a listen.

PostGIS at 20, The Beginning

Twenty years ago today, the first email on the postgis users mailing list (at that time hosted on yahoogroups.com) was sent, announcing the first numbered release of PostGIS.

The early history of PostGIS was tightly bound to a consulting company I had started a few years prior, Refractions Research. My first contracts ended up being with British Columbia (BC) provincial government managers who, for their own idiosyncratic reasons, did not want to work with ESRI software, and as a result our company accrued skills and experience beyond what most “GIS companies” in the business had.

We got good at databases, and the FME. We got good at Perl, and eventually Java. We were the local experts in a locally developed (and now defunct) data analysis tool called Facet, which was the meat of our business for the first four years or so.

That Facet tool was a key part of a “watershed analysis atlas” the BC government commissioned from Facet in the late 1990s. We worked as sub-contractors, building the analytical routines that would suck in dozens of environmental layers, chop them up by watershed, and spit out neat tables and maps, one for each watershed. Given the computational power of the era, we had to use multiple Sun workstations to run the final analysis province-wide. To manage the job queue and keep track of intermediate results, we placed them all into tables in PostgreSQL.

Putting the chopped up pieces of spatial data as blobs into PostgreSQL was what inspired PostGIS. It seemed really obvious that we had the makings of an interactive real-time analysis engine, with all this processed data in the database, if we could just do more with the blobs than only stuff them in and pull them out.

Maybe We Should do Spatial Databases?

Reading about spatial databases circa 2000 you would find that:

This led to two initiatives on our part, one of which succeeded and the other of which did not.

First, I started exploring whether there was an opportunity in the BC government for a consulting company that had skill with Oracle’s spatial features. BC was actually standardized on Oracle as the official database for all things governmental. But despite working with the local sales rep and looking for places where spatial might be of interest, we came up dry.

The existing big Oracle ministries (Finance, Justice) didn’t do spatial, and the heavily spatial natural resource ministries (Forests, Environment) were still deeply embedded in a “GIS is special” head space, and didn’t see any use for a “spatial database”. This was all probably a good thing, as it turned out.

Our second spatial database initiative was to explore whether any of the spatial models described in the OpenGIS Simple Features for SQL specification were actually practical. In addition to describing the spatial types and functions, the specification described three ways to store the spatial part of a table.

  • In a set of side tables (scheme 1a), where each feature was broken down into x’s and y’s stored in rows and columns in a table of numbers.
  • In a “binary large object” (BLOB) (scheme 1b).
  • In a “geometry type” (scheme 2).
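To make the difference between the first two schemes concrete, here is a minimal Python sketch. It is not the layout the specification defines, and not what we actually tested: the row shape `(feature_id, seq, x, y)`, the blob format (a vertex count followed by packed doubles) and all the function names are assumptions made up for illustration.

```python
import struct

# Hypothetical feature: a triangle stored as a closed ring of (x, y) pairs.
ring = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]

def shred(feature_id, coords):
    """Side-table style: one row per vertex (feature id, sequence, x, y)."""
    return [(feature_id, seq, x, y) for seq, (x, y) in enumerate(coords)]

def unshred(rows, feature_id):
    """Rebuild the vertex list by filtering the rows and re-ordering by sequence."""
    mine = sorted(r for r in rows if r[0] == feature_id)
    return [(x, y) for _, _, x, y in mine]

def to_blob(coords):
    """Blob style: a vertex count followed by packed little-endian doubles."""
    flat = [v for xy in coords for v in xy]
    return struct.pack("<I%dd" % len(flat), len(coords), *flat)

def from_blob(blob):
    """Decode the whole blob back into (x, y) pairs in one pass."""
    (n,) = struct.unpack_from("<I", blob, 0)
    flat = struct.unpack_from("<%dd" % (2 * n), blob, 4)
    return list(zip(flat[0::2], flat[1::2]))

rows = shred(1, ring)   # four rows, one per vertex
blob = to_blob(ring)    # one opaque value per feature
```

The side-table form needs a filter and a sort across many rows to rebuild each feature, while the blob form reads one value per feature. The blob is opaque to the database, though, so nothing inside it can be queried or indexed directly.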

Since the watershed work had given us experience with PostgreSQL, we carried out the testing with that database, to answer one question: could we store spatial data in the database and pull it out efficiently enough to make a database-backed spatial viewer?

For the viewer part of the equation, we ran all the experiments using a Java applet called JShape. I was quite fond of JShape and had built a few little map viewer web pages for clients using it, so hooking it up to a dynamic data source rather than files was a rather exciting prospect.

All the development was done on the trusty Sun Ultra 10 I had taken out a $10,000 loan to purchase when starting up the company. (At the time, we were still making a big chunk of our revenue from programming against the Facet software, which only ran on Sun hardware.)

  • The first experiment, shredding the data into side tables, and then re-constituting it for display was very disappointing. It was just too slow to be usable.
  • The second experiment, using the PostgreSQL BLOB interface to store the objects, was much faster, but still a little disappointing. And there was no obvious way to add an index to the data.

Breakthrough

At this point we almost stopped: we’d tried all the stuff explained in the user-level documentation for PostgreSQL. But our most sophisticated developer, Dave Blasby, who had actually studied computer science (most of us had mathematics and physics degrees), and was unafraid of low-level languages, looked through the PostgreSQL code and contrib section and said he could probably do a custom type, given some time.

So he took several days and gave it a try. He succeeded!

When Dave had a working prototype, we hooked it up to our little applet and the thing sang. It was wonderfully quick, even when we loaded up quite large tables, zooming around the spatial data and drawing our maps. This is something we’d only seen on fancy XWindows displays on UNIX workstations and now we were doing it in an applet on an ordinary PC. It was quite amazing.

We had gotten a lot of very good use out of the PostgreSQL database, but there was no commercial ecosystem for PostgreSQL extensions, so it seemed like the best business use of PostGIS was to put it “out there” as open source and see if it generated some in-bound customer traffic.

At the time, Refractions had perhaps 6 staff (it’s hard to remember precisely) and many of them contributed, both to the initial release and over time.

  • Dave Blasby continued polishing the code, adding some extra functions that seemed to make sense.
  • Jeff Lounsbury, the only other staffer who could write C, took up the task of a utility to convert Shape files into SQL, to make loading spatial data easier.
  • I took on the work of setting up a Makefile for the code, moving it into a CVS repository, writing the documentation, and getting things ready for open sourcing.
  • Graeme Leeming and Phil Kayal, my business partners, put up with this apparently non-commercial distraction. Chris Hodgson, an extremely clever developer, must have been busy elsewhere or perhaps had not joined us just yet, but he shows up in later commit logs.

Release

Finally, on May 31, Dave sent out the initial release announcement. It was PostGIS 0.1, and you can still download it, if you like. This first release had a “geometry” type, a spatial index using the PostgreSQL GIST API, and these functions:

  • npoints(GEOMETRY)
  • nrings(GEOMETRY)
  • mem_size(GEOMETRY)
  • numb_sub_objs(GEOMETRY)
  • summary(GEOMETRY)
  • length3d(GEOMETRY)
  • length2d(GEOMETRY)
  • area2d(GEOMETRY)
  • perimeter3d(GEOMETRY)
  • perimeter2d(GEOMETRY)
  • truly_inside(GEOMETRY, GEOMETRY)
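Going by the names alone, most of these are simple measurements over a geometry's vertices. As a rough illustration of what a 2D length and a 2D area calculation involve, here is a small Python sketch. The function names `line_length_2d` and `ring_area_2d` are hypothetical, and this is not the PostGIS 0.1 code (which was written in C), just the textbook arithmetic.

```python
import math

def line_length_2d(line):
    """Sum of straight-line distances between consecutive vertices."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(line, line[1:])
    )

def ring_area_2d(ring):
    """Unsigned area of a closed ring (first vertex repeated last),
    computed with the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2.0

# A 3-4-5 right triangle as a closed ring.
triangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]
perimeter = line_length_2d(triangle)  # 4 + 3 + 5
area = ring_area_2d(triangle)         # half of 4 * 3
```

The sketch ignores holes, multi-part geometries and the third dimension.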

The only analytical function, “truly_inside()” just tested if a point was inside a polygon. (For a history of how PostGIS got many of the other analytical functions it now has, see History of JTS and GEOS on Martin Davis’ blog.)
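For readers who have not met the point-in-polygon test before, the usual textbook approach is ray casting: shoot a ray from the point and count how many polygon edges it crosses. The Python sketch below shows that idea only. It is not the `truly_inside()` source, the function name `point_in_ring` is hypothetical, and it handles a single closed ring with no holes.

```python
def point_in_ring(point, ring):
    """Ray casting: count how many ring edges a ray heading right from
    the point crosses; an odd count means the point is inside.
    The ring must be closed (first vertex repeated as the last)."""
    px, py = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the
        # point can be crossed by the ray.
        if (y1 > py) != (y2 > py):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]
centre_is_inside = point_in_ring((5.0, 5.0), square)
far_point_is_inside = point_in_ring((15.0, 5.0), square)
```

Points that fall exactly on the boundary are a known weak spot of this method, and the sketch makes no promise about them.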

Reading through those early mailing list posts from 2001, it’s amazing how fast PostGIS integrated into the wider open source geospatial ecosystem. There are posts from Frank Warmerdam of GDAL and Daniel Morissette of MapServer within the first month of release. Developers from the Java GeoTools/GeoServer ecosystem show up early on as well.

There was a huge demand for an open source spatial database, and we just happened to show up at the right time.

Where are they Now?

  • Graeme, Phil, Jeff and Chris are still doing geospatial consulting at Refractions Research.
  • Dave maintained and improved PostGIS for the first couple years. He left Refractions for other work, but still works in open source geospatial from time to time, mostly in the world of GeoServer and other Java projects.
  • I found participating in the growth of PostGIS very exciting, and much of my consulting work… less exciting. In 2008, I left Refractions and learned enough C to join the PostGIS development community as a contributor, which I’ve been doing ever since, currently as an Executive Geospatial Engineer at Crunchy Data.

MapScaping Podcast - GDAL

Yesterday I talked about all-things-GDAL (or at least all the things that fit in 30 minutes) with MapScaping Podcast’s Daniel O’Donohue.

In the same way that Linux is the under-appreciated substrate of modern computing, GDAL is the under-appreciated substrate of modern geospatial data management. If the compute is running in the cloud, it’s probably running on Linux; if the geospatial data are flowing through the cloud, they’re probably flowing through GDAL.

At the same time as it has risen to being the number one spatial data processing tool in the world (by volume anyways), GDAL has maintained an economic support model from the last century. One maintainer (currently Even Rouault), earning a living with new feature development, and doing all the work of code quality, integration, testing, documentation, and promotion as a loss leader. This model burns out maintainers, and it doesn’t ask the organizations that gain the most value from GDAL (the ones pushing terabytes of pixels through the cloud) to contribute commensurate with the value they receive.

With the new GDAL sponsor model, the organizations who receive the most value are stepping up to do their share. If your organization uses GDAL, and especially if it uses it in volume, consider joining the other sponsors in making sure GDAL remains high quality and cutting edge by sponsoring.

Thanks Daniel, for having me on!