Surveyor's Sermon

Last month I had an opportunity to give a 5-minute “Ignite” talk at Where 2.0 in San Jose. I chose my topic because I fear one of the things missing in the technologist enthusiasm for geolocation, particularly in the population of technically astute but non-geo people, is a respect for how locations are actually derived, and knowledge of the provenance of the data that undergirds our new mapping tools.

FOSS4G 2010 Going to be Big

The presentation abstract deadline has passed for FOSS4G 2010 and there have been 354 abstracts submitted! I’m not sure of the exact number of session slots in the planned program, but for comparison 2007 had 120 slots and subsequent events have been similar – that’s about how many 30-minute talks you can fit into two days in a five-track format. In 2007, the biggest FOSS4G so far, we had around 240 submissions for our 120 slots. So, with roughly three submissions for every slot, 2010 is looking like it is going to be gangbusters! Everyone wants to go to Barcelona; I guess they heard about the escudella i carn d’olla.

On the road to Damascus... GPL to BSD

In response to a Twitter comment I made about my change in attitude towards the BSD and GPL open source licenses, Martin Daly asks:

Blog post on the Damascene (or otherwise) conversion from GPL-ish to BSD-ish please. In your own time.

The reason I was even mentioning the license issue is that I was watching a panel discussion at Where 2.0 in which Steve Coast, in defending the GPL-ish license used for OpenStreetMap, said (paraphrase) that the rationale was keeping a third party from taking the open data, closing it, and working from there. And that rationale pretty closely mirrors how we were thinking when choosing the GPL license for PostGIS, back in 2001.

I have experienced no spectral presence or flashing lights.

However, over the years, I have spent far too much time talking to various corporate folks about how the PostGIS GPL license wouldn’t affect their plans to use PostGIS as a database component in their systems. For everyone who came and asked, I am sure many didn’t. But in general the license was an impediment to some organizations engaging with the project. So there was a downside to the GPL. And was there an upside? Did the license protect us from privatized forks?

Well, yes, insofar as it legally prevented them from happening. But it is worth questioning the implicit assumption that such forks are actually harmful to the project.

And here my experience watching the MapServer community (working with an MIT/BSD-style license) was useful. Because over those same years, I watched the MapServer project chug healthily along, even as some third parties did in fact take the code and work on closed forks from time to time.

The lesson I have taken from my observation is that the strength of open source projects resides not in the licenses or the code, but in the communities arrayed about them. Copying the code of MapServer and doing your own thing with it does not stop MapServer developers from working, and in the long run you’ll probably be better off working within the community than outside it, to gain from the efforts of others. Trying to maintain a private parallel branch and patch in the changes from the open development will quickly become more effort than it is worth. At the same time, because the BSD license is so permissive, there are no legal impediments for companies to engage with the community.

Look at any healthy open source project and ask yourself: what is more valuable, the code or the community? You could take all the code away from a healthy project, and it would start right up again from scratch and probably do a better job the second time. The value is in the human relationships and the aggregation of cooperating talent.

The GPL tends to keep corporate actors out of the community (for good reasons or bad, that’s a separate discussion, but it does). That slows down the development of the project by reducing the size of the development pool. Which, counterintuitively, makes forking or ignoring the open project more attractive. Because a slow-moving project is easy to beat. A fast-moving project is risky to fork, because it is likely that your (relatively understaffed) fork will be left behind by the open project.

So, back to what originally made me delve into license wankery, the question of the OpenStreetMap license: the value of the OSM community and philosophy and infrastructure is way higher than the value of the data at this point. And bringing corporate actors into the OSM community would only increase the relative advantage of the open project over any equivalent closed effort.

But changing licenses is a brutally hard thing to do, and it gets harder the longer a license is in place. PostGIS will never change licenses, for example. There are too many developers who contributed in the past under the GPL, and changing the license would require the assent of all of them.

But if I ever start a new project, it will definitely be under the BSD.

FOSS4G 2011 Decision Process

The 2011 FOSS4G siting process has begun. After the 2010 process, where there were too many good proposals and not enough slots, we decided to do a couple of things to lower the effort level required by bidders in aggregate.

First, we wanted to be more definitive about our regional siting plan, which is to rotate between Europe (2010), North America (2011) and elsewhere (Asia for 2012, in the current rotation plan). For 2011 the regional arrow points to North America, since 2010 is in Europe.

Second, we wanted to lower the bar to submitting a proposal. In 2010 we got four full proposals, and could choose only one. So 75% of the bidders did a large amount of preliminary research and spade-work for no purpose. For the 2011 process, we will start with a “letter of intent” phase, a short document outlining expressions of interest. Hopefully having all the interested parties publicly declared will promote locations coming together and forming joint bids.

The OSGeo conference committee will vote on the letters and only the top two will be asked to prepare full bid documents. That would still leave one potentially disappointed party at the end of the process, but is a big improvement over leaving three, as happened in the 2008 and 2010 processes. And hopefully the letters phase will encourage groups to consolidate bids so we won’t have to even run the final round.

The winning site will still have to submit a proposal and budget, so the Board and Conference Committee have confidence that the local team has thought things through, but the kind of research necessary to prepare a full proposal is a necessary precursor to putting on a conference, so the effort will be put to good purpose.

I’m looking forward to seeing the letters!

How to Get Your Bug Fixed

Mike Leahy is providing a textbook demonstration of bug-dogging on the PostGIS users list this week, and anyone interested in learning how to interact with a development community to get something done would do well to study it.

Some key points:

He does as much of the work in diagnosing the problem as possible, including combing through Google for references, cutting the test data down as small as possible, and trying to find smaller cases that exercise the issue.

He responds very quickly (you’ll have to read the timestamps) to questions and suggestions for gathering more information. Since the problem appears initially to manifest only on his machine, any delay on his part risks disengaging the folks helping him.

He prepares a sample database and query to allow the development team to easily replicate the situation on their machines.

And he also gets lucky. The problem is replicable, and the discussion catches the attention of Greg Stark, who recalls and digs up some changes Tom Lane made to PostgreSQL, which in turn leads me to find the one-argument change that removes the problem. Very lucky, really: it’s unlikely I would have been able to debug it by brute force.