Davy Stevenson has a great post (everyone should write more, more often) on a small Twitter storm she precipitated and that I participated in. Like all sound-and-fury-signifying-nothing it was mostly about misunderstandings, so I’d like to add my own information about mental state to help clarify where I come from as a maintainer.
First, PostGIS is full of shortcomings. Have a look at the (never shrinking) ticket backlog. Sure, a lot of those are feature ideas and stuff in “future”, but there’s also lots of bugs. Fortunately, most of those bugs affect almost nobody, and are easily avoided (so people report them, then avoid them).
When we first come out with a “major” release (2.0 to 2.1 for example) I expect lots of bugs to shake out in the early days, as people try the new release in ways that are not anticipated by our regression tests. (It’s worth noting that the ever-growing collection of regression tests provides a huge safety net under our work, allowing us to add features and speed without breaking things… for cases we test.)
My expectation is that the relative severity of bugs reported decreases as the time from initial release increases. Basically, people will always be finding bugs, but they will be for narrower and narrower use cases, and for data situations that come up extremely infrequently.
The bug Davy’s team ran across broke my usual rules. It took quite a while for users to find and report, and yet was broad enough to affect moderately common use cases. However, it was also something the vast majority of PostGIS users would not run across:
If you were using the Geography type, and
If you were storing polygons in your Geography column, and
If you queried your column with another Polygon, and
If the query polygon was fully contained in one of the column polygons, then
The distance reported between the polygons would be non-zero (when it should be zero!)
It happens! It happened to Davy’s team, and it happened to other folks (the ones who originally filed #2556) — I was actually working on the bug on the plane a couple weeks before. It was a tricky one to both find and to diagnose, because it was related to caching behaviour: you could not reproduce it using a query that returned a single record, it had to return more than one record.
If I was prickly about the report from Davy:
And pricklier still about the less nuanced report of her colleague Jerry:
That prickliness arose because, on the basis of a very particular and narrow (but real!) use case, they were tarring the whole release, which had been out and functioning perfectly well for thousands and thousands of users for months.
Also, I was feeling guilty for not addressing it earlier. PostGIS has gotten a lot bigger than me, and I don’t even try to address raster or topology bugs, but in the vector space I take pride in knocking down real issues quickly. But this issue had dragged on for a couple months without resolution, despite the diligent sleuthing of Regina Obe, and a perfect reproduction case from “gekorob”, the original reporter.
That’s where I’m coming from.
I can also empathize with Jerry and others who ran across this issue. It’s slippery, and isolating it would eat up a non-trivial amount of time. Having had the time eaten, a normal emotional response would be “goddamn it, PostGIS, you’ve screwed me, I won’t let you screw others!” Also, having eaten many of your personal hours, the bug would appear big not narrow, worthy of a broadcast condemnation, not a modest warning.
Anyways, that’s the tempest and teapot. I’m going to finish my morning by putting this case into the regression suite, so it’ll never recur. That’s the best part of fixing a bug in PostGIS: locking the door behind you so it can never come out again.
Has it come to this? Has government decided that it is completely incapable of recruiting and training its own technical staff?
Elections BC is not a big shop, and frankly, its technical problems are not that challenging (with the interesting exception of the quadrennial Failure Is Not An Option election event). And yet, they are now building a list of 5 bidders (to be winnowed down to one in the end) to outsource all their systems work to. Maintenance, new builds, the lot of it.
Does government no longer have an HR department? This is a very very very expensive version of Kelly Services except instead of placing secretaries we’re now placing IT staff, and at markups that would make old Kelly blush.
Place your bets now: who will win? HP? CGI? Accenture? Deloitte? The important thing is that the shareholders get a good return. Which reminds me, time to rebalance the portfolio…
Last week I was driving home from the NDP convention with a semi-retired traffic engineer in the backseat (and he was not backseat driving) when we passed under the Fraser River via the Massey Tunnel.
“This tunnel provides 3 lanes in the direction of peak flow during rush hour”, he said, “and if they build the 8-lane Massey Bridge that will add one lane, which is 2000 cars per hour, which over a 3 hour rush hour is 6000 new commuters. Assuming half the new households they put in here have a commuter the new bridge will support the development of 12,000 new houses.”
Now, 12,000 new houses in south Delta/Ladner/Tsawwassen will certainly eat up a lot of nice farm land, and that’s a tragedy in its own right. But given that number of households, I immediately divided it into the $3B probable cost of a bridge-plus-Richmond-highway-expansion, and was blown away.
Three billion dollars divided by twelve thousand homes is $250,000 per home. Be generous if you like, and assume a much lower number of Fraser crossing commuters in the new developments. It’s still hard to get the number lower than $100,000 per new home.
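The back-of-the-envelope arithmetic is easy to replay with different assumptions. The sketch below is only an illustration: the function names are made up, and the "one commuter per four homes" case is just one assumption that produces a more generous household count.

```python
def homes_supported(extra_cars_per_hour, rush_hours, commuters_per_home):
    """Homes whose bridge commuters would fill the added peak-direction capacity."""
    new_commuters = extra_cars_per_hour * rush_hours
    return new_commuters / commuters_per_home


def cost_per_home(project_cost, homes):
    """Project cost spread evenly over the homes it enables."""
    return project_cost / homes


PROJECT_COST = 3_000_000_000  # the "about" $3B figure quoted in the post

# The traffic engineer's case: one extra lane (2000 cars/hour), a 3 hour rush,
# and a bridge commuter in half of the new households.
homes_half = homes_supported(2000, 3, 0.5)
cost_half = cost_per_home(PROJECT_COST, homes_half)

# A more generous assumption (hypothetical): a bridge commuter in one home of four.
homes_quarter = homes_supported(2000, 3, 0.25)
cost_quarter = cost_per_home(PROJECT_COST, homes_quarter)

print(f"{homes_half:,.0f} homes -> ${cost_half:,.0f} per home")
print(f"{homes_quarter:,.0f} homes -> ${cost_quarter:,.0f} per home")
```

Halving the share of homes with a bridge commuter doubles the number of homes the lane supports and halves the cost per home, which is why the number is so hard to push below six figures.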
So new development is the only reason to build an 8-lane monster bridge over the river.
The extra lanes on the bridge will increase capacity by 2,000 cars per hour, which can be extrapolated to 12,000 to 24,000 new homes, depending on your assumptions.
The last major bridge/highway project cost 3 billion dollars, and everyone seems to think building a new bridge and widening the freeway north of it will also cost “about” 3 billion dollars.
3 billion dollars divided by 24,000 homes is $125,000 per home.
Which leads me to the inescapable conclusion that BC taxpayers are going to be subsidizing “affordable” new housing south of the Fraser to the tune of about $125K per new home.
Can someone stop the ride now? I’m feeling a bit queasy and I want to get off.
In the comments of my last post I was asked to provide a take on the healthcare.gov fiasco. Since I cannot improve a whit on Clay Shirky’s perfect post today, I will settle for just quoting my favorite paragraph.
The vision of “technology” as something you can buy according to a plan, then have delivered as if it were coming off a truck, flatters and relieves managers who have no idea and no interest in how this stuff works, but it’s also a breeding ground for disaster. The mismatch between technical competence and executive authority is at least as bad in government now as it was in media companies in the 1990s, but with much more at stake.
I’ve been asked by my technical readers to stop writing so much about politics, but I cannot help myself, and this week I have the perfect opportunity to apply my technical skills to a local political topic.
History and Background
Like Britain and the USA (and very few other jurisdictions anymore), British Columbia has a first-past-the-post representative electoral system. The province is divided up into “electoral districts” (also known as “ridings” in the British tradition) and each district returns a single member to the Legislature. In each district, the candidate with the most votes wins the district, even if they only obtain a plurality. In the Legislature, the party with most seats forms government.
In such a system, the geographical layout of the districts has a great deal of importance, because it is possible for a party to win a majority of votes in the province, but a minority of seats in the legislature, if the votes are concentrated in particular seats. This actually happened in British Columbia in 1996. It also happened in the USA in the 2000 Presidential election, since their presidential Electoral College effectively acts like a weighted version of a first-past-the-post Legislature.
In a representative democracy, it’s important that everyone’s vote have the same weight, which means ensuring that each district has approximately the same number of people in it. As relative populations grow in some regions and shrink in others, districts can become unbalanced, and districts need to be redrawn. In the USA, this “redistricting” process is often driven by partisan considerations, and can lead to districts like this (try out this awesome gerrymandered districts puzzle game):
Fortunately, since 1989 British Columbia’s districts have been drawn every 10 years by non-partisan “Electoral Boundaries Commissions”, and the primary consideration has been creating districts that are as equal in population as possible while allowing for effective representation.
Effective Representation
BC is a big place and a place of extremes. The smallest district in BC is in downtown Vancouver (Vancouver-West End), with an area of under 500 hectares: it takes less than 30 minutes to walk across it and 48,500 people live there.
The largest district is in the north-west of the province (Stikine), with an area of almost 20,000,000 hectares: about the size of Ireland, Switzerland, Denmark and the Netherlands, combined. Just over 20,000 people live in it. When you’re dealing with areas this sparsely populated “effectiveness of representation” begins to have some concrete meaning.
On the other hand, the principle of “one person, one vote” is the corner-stone of democracy, and the goal of an electoral boundary re-distribution is to try to achieve it, as far as possible. There is a tension inherent in the process.
2008 Commission Catastrophe
Population growth in BC over the last generation has been concentrated in the south: mostly in Vancouver and its suburbs, with some in Vancouver Island and Kelowna. Commissions have dealt with this growth through a combination of increasing the number of seats in the Legislature (to suppress the growth of the average seat size), and slowly increasing the size of the rural districts (to keep defensibly close to the average).
In 2008, this process reached a tipping point, as the Commission recommended two new seats, and the transfer of three seats from rural areas to urban areas. Rural BC exploded in anger, and the government of the day rushed in legislation directing the Commission to add more seats than recommended and to avoid removing seats from certain rural areas.
At this point, though the process remained non-partisan (both parties in the Legislature supported the new plan), it had become thoroughly politicized (the carefully considered deliberations of the Commission had been hastily overturned by politicians for public relations purposes).
Formalization of Politicization
No doubt remembering the tumult of the 2008 experience, the current government of BC has released a proposal for the rules governing the next Electoral Boundary commission. The proposal aims to avoid a messy politicization of the process at the end, by quietly politicizing it in advance:
The Commission may not recommend adding any further seats to the Legislature
The Commission may not remove seats from three protected regions: North, Columbia-Kootenay, and Cariboo-Thompson
The protected regions look like this.
Note that the Okanagan region is isolated from the rest of the “unprotected” areas of the province, making it impossible to juggle population into or out of the region. That means the Okanagan can either gain or lose a whole seat, but never lose a “half” a seat by having population juggled in or out via boundary changes.
Anyone with a passing familiarity with BC electoral geography will recognize that this proposal entrenches an already large and growing deviation from the principle of one-person-one-vote, but I want to calculate just how large, and also to measure the “fairness” of this particular proposal.
Population
The electoral district boundaries of BC are available online as GIS files, but do not have population information attached (and would be out of date if they did, since they pre-date the most recent census).
Similarly, the StatsCan boundary files can be downloaded, and the attribute file giving the census 2011 population in each block is also available. There are about 500-800 blocks in each electoral district, making for a very fine-grained profile of where people are concentrated in each district.
I loaded the GIS files into a PostGIS spatial database for analysis. Once the electoral districts (ed) and dissemination blocks (db) were loaded, calculating the electoral district population in PostGIS was a simple spatial join query:
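The query itself has not survived in this copy of the post. As a stand-in, here is a minimal Python sketch of what a spatial join of this kind does: assign each block to the district that contains it, then sum block populations per district. In PostGIS the containment test would typically be a predicate such as ST_Contains or ST_Intersects; the toy rectangles, the block points and the function names below are all assumptions made for illustration, not the real data or the original SQL.

```python
from collections import defaultdict


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def district_populations(districts, blocks):
    """Sum block populations by the district containing each block point."""
    totals = defaultdict(int)
    for bx, by, pop in blocks:
        for name, ring in districts.items():
            if point_in_polygon(bx, by, ring):
                totals[name] += pop
                break  # each block is counted in one district only
    return dict(totals)


# Toy data: two square "districts" and four block centroids with populations.
districts = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
blocks = [(2, 3, 100), (8, 8, 250), (12, 5, 400), (19, 1, 50)]

populations = district_populations(districts, blocks)
print(populations)
```

A real database would use a spatial index rather than testing every block against every district, but the grouping and summing logic is the same.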
The results of this calculation and others in this article can be seen in the spreadsheet I’ve placed online.
A quick summary of the population results shows that, among other things:
The current distribution is extremely lopsided, with the most heavily populated riding (Surrey-Cloverdale, 73042) having well over 3 times the population of the least populated (Stikine, 20238)
The current provincial average population is 51765 per riding
The average population in the 17 “protected” ridings is 35609, 31% less than the provincial average
The average population in the 68 “unprotected” ridings is 55804, 8% higher than the provincial average
A vote in the protected regions will be over 1.5 times more “powerful” than one in the unprotected regions
Of the 85 ridings, 26 are below average and 59 are above average, indicating that the problem of underpopulation is concentrated in a minority of ridings
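The headline numbers above hang together, which is easy to confirm from the figures quoted in the list. This is just a cross-check of the arithmetic using those published figures; the variable names are mine.

```python
# Figures quoted in the summary above.
protected_n, protected_avg = 17, 35609
unprotected_n, unprotected_avg = 68, 55804
largest, smallest = 73042, 20238  # Surrey-Cloverdale, Stikine

total_ridings = protected_n + unprotected_n

# The provincial average is the riding-weighted mean of the two group averages.
provincial_avg = (
    protected_n * protected_avg + unprotected_n * unprotected_avg
) / total_ridings

# Percentage difference of each group from the provincial average.
protected_pct = 100 * (protected_avg - provincial_avg) / provincial_avg
unprotected_pct = 100 * (unprotected_avg - provincial_avg) / provincial_avg

# Relative weight of a protected vote, and the spread between the extremes.
vote_weight = unprotected_avg / protected_avg
extreme_ratio = largest / smallest

print(f"ridings: {total_ridings}, provincial average: {provincial_avg:.0f}")
print(f"protected: {protected_pct:+.0f}%, unprotected: {unprotected_pct:+.0f}%")
print(f"vote weight: {vote_weight:.2f}x, largest/smallest: {extreme_ratio:.2f}x")
```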
There is no doubt that the government proposal will enshrine the regional imbalance in representation, and further worsen it as continued migration into the south pushes the balance even further out of line.
“Fair” Imbalances
Legal challenges to imbalanced representation have resulted in court decisions that indicate that it is constitutional within limits, and with reasonable justification. The limits generally accepted by the courts are +/- 25% of the provincial average, and the starting point of this proposal already exceeds that on average; some individual ridings (like Stikine) will be much worse. Political commentators in BC are already musing about whether ridings built under this scheme would survive a court challenge.
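To see how individual ridings sit against the +/- 25% band, a small check like the following is enough. It uses only the two riding populations and the provincial average quoted earlier in the post; the helper names are hypothetical.

```python
PROVINCIAL_AVG = 51765
LIMIT = 0.25  # the +/- 25% band generally accepted by the courts


def deviation(population, average=PROVINCIAL_AVG):
    """Signed fractional deviation from the provincial average."""
    return (population - average) / average


def within_limit(population, limit=LIMIT):
    """True if the riding falls inside the accepted band."""
    return abs(deviation(population)) <= limit


ridings = {"Stikine": 20238, "Surrey-Cloverdale": 73042}

for name, population in ridings.items():
    flag = "ok" if within_limit(population) else "outside band"
    print(f"{name}: {deviation(population):+.1%} ({flag})")
```

Both extremes fall well outside the band, one far below the average and the other far above it.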
Of more analytical interest is whether the scheme of selecting “protected regions” is a good one for choosing which ridings should receive preferential treatment.
“Representing” a riding involves being available to your constituents, meeting with other orders of government in your riding (cities, school boards), and attending local events. Representation is very much tied up with being where the people are.
If the people are all in one place, near together, then representing them is easy.
If the people are spread out, in many different localities, then representing them is hard.
Can we measure the “spreadoutness” of people? Yes, we can!
Calculating Dispersion
Each riding contains several hundred census dissemination blocks, each of which has a population associated with it. Imagine measuring the distance between each block, and all the other blocks in the riding, and weighting that distance by the population at each end.
For Vancouver-Fairview, the picture looks like this.
The blocks are fairly regular, the population is all very close together, and the dispersion is not very high.
For Skeena, the picture looks like this.
The population is concentrated in two centers (Terrace and Kitimat) reasonably far apart, giving a much higher dispersion than the urban ridings.
In mathematical terms, the formula for “dispersion” looks like this.
In the database, after creating a table of census blocks that are coded by riding, the calculation looks like this.
Taking the ratio of the distance scaled populations against the unscaled populations allows populations that are far apart to dominate ones that are close together. Scaling the final result down by 1000 just makes the numbers more readable.
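Neither the formula nor the SQL survives in this copy of the post, so the following is a reconstruction from the description above rather than the original calculation: sum the distance between every pair of blocks weighted by the product of their populations, divide by the same sum without the distance, and scale down by 1000. The use of unordered pairs, planar coordinates in metres, and the function name are my assumptions.

```python
from itertools import combinations
from math import hypot


def dispersion(blocks, scale=1000.0):
    """Population-weighted mean distance between blocks, divided by `scale`.

    blocks: list of (x, y, population), with x and y in metres (assumed planar).
    """
    weighted = 0.0    # sum of p_i * p_j * d_ij over pairs of blocks
    unweighted = 0.0  # sum of p_i * p_j over the same pairs
    for (x1, y1, p1), (x2, y2, p2) in combinations(blocks, 2):
        w = p1 * p2
        weighted += w * hypot(x2 - x1, y2 - y1)
        unweighted += w
    if unweighted == 0:
        return 0.0  # fewer than two populated blocks: nothing to measure
    return weighted / unweighted / scale


# Toy urban riding: three equal blocks within about a kilometre of each other.
compact = dispersion([(0, 0, 100), (1000, 0, 100), (0, 1000, 100)])

# Toy rural riding: two towns 50 km apart, plus a small settlement near one.
two_centres = dispersion([(0, 0, 500), (50_000, 0, 500), (1000, 0, 10)])

print(f"compact: {compact:.2f}, two centres: {two_centres:.2f}")
```

With distances in metres, the scaled score reads roughly as a typical separation in kilometres between two randomly chosen residents of different blocks, so the two-town riding scores far higher than the compact one.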
As before, the results of this calculation and others in this article can be seen in the spreadsheet.
Is Regional Protection “Fair”
Using the measure of dispersion allows us to evaluate the government proposal on its merits: does protecting the North, Kootenays, Cariboo and Thompson protect those ridings that are most difficult to represent?
In short, no.
The regional scheme protects some difficult ridings (Stikine) but leaves others (North Island) unprotected. It also protects ridings that are not particularly dispersed at all (Kamloops-South Thompson), while leaving more dispersed ridings (Powell River-Sunshine Coast, Boundary-Similkameen) unprotected.
Among the larger ridings, Skeena is notable because even though it is the 10th largest riding by area, and 10th sparsest by population density, it’s only the 17th most dispersed. There are many smaller ridings with more dispersion (Powell River-Sunshine Coast, Nelson-Creston, Boundary-Similkameen). This is because most of the people in Skeena live in Terrace and Kitimat, making it much easier to represent than, say, North Island. Despite that, Skeena’s population is 43% below the average, while North Island’s is 5% above.
Kamloops-South Thompson is the least dispersed (score 15.2) protected riding, and it’s worth comparing it to the similar, yet unprotected, Nanaimo-North Cowichan (score 16.2).
Kamloops-South Thompson (protected) consists of a hunk of Kamloops, and a string of smaller communities laid out to the east for 50 km along Highway 1.
Nanaimo-North Cowichan (unprotected) consists of a hunk of Nanaimo, and a string of smaller communities laid out to the south for 45 km along Highway 1 (and some settled islands).
What is it about Kamloops-South Thompson that recommends it for protected status along with truly dispersed difficult ridings like Stikine? Nothing that I can determine.
Let the Commission Work
The intent of the government’s proposal to amend the Redistribution Act is clearly to avoid the firestorm of protest that accompanied the 2008 Commission report, and it’s good they are thinking ahead.
They need to think even further ahead: the consequences of having the boundaries enacted, then reversed in court, will be far more disruptive than allowing the Commission to proceed with the necessary work of redistributing BC’s districts to more fairly reflect our actual population distribution.
The end result of an unconstrained Commission will be fair boundaries that still reflect the representation needs of dispersed ridings by giving them lower populations within the limits already acknowledged by the courts: +/- 25% with a handful of exceptions (I’m looking at you, Stikine).
I encourage you to explore the data on dispersion, and how it relates to the regional “protection” scheme, in the spreadsheet.