Counting Votes

This is orthogonal to geospatial, but c’est la vie. Apparently the United States’ quadrennial meltdown over how to count things is firing up again.

A big part of the problem appears to be the decentralization of the US institutions responsible for conducting elections (in some cases each county gets to decide how to manage the electoral process), and another part is that the people responsible for running elections are themselves elected partisans (the state “secretaries of state”, for example Katherine Harris and Kenneth Blackwell).

Rather than address these structural issues at the root (have elections run by a single federal organization with leadership acceptable to both parties), the USA keeps trying to fix the problem with new and better machines.

For reference, here is how we do it in the civilized world.

British Columbia provincial elections are managed by Elections BC, an independent agency of the government, the head of which is chosen by an all-party committee of the legislature. Federal elections are managed by Elections Canada, ditto.

Each polling station handles a number of polls, each of which has a few hundred people in it. Each poll has its own vote box. Each ballot is numbered, and comes from a book of ballots, so that the number of ballots that end up in a box can be reconciled to the number of voters in a poll, and to the books which were assigned to each poll. The polling stations are staffed by temporary paid workers, two workers to each box, so the box is never unattended. Parties are allowed to have “scrutineers” in the station, who may observe the process. Parties often keep an independent count of how many people have voted, and who has voted (this information is fed back to the central campaign and used to drive get-out-the-vote efforts, by calling known supporters who have not shown up to vote earlier in the day).

Paper ballot in a sealed cardboard box. It's that easy.
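That reconciliation step is just a three-way invariant check. Here is a toy sketch of the idea; the function and the figures are invented for illustration, not taken from any actual Elections BC procedure:

```python
# Toy sketch of the three-way ballot reconciliation described above.
# The figures are invented for illustration.

def poll_reconciles(ballots_issued_from_book: int,
                    voters_crossed_off_list: int,
                    ballots_found_in_box: int) -> bool:
    """A poll reconciles only when all three tallies agree."""
    return (ballots_issued_from_book
            == voters_crossed_off_list
            == ballots_found_in_box)

assert poll_reconciles(312, 312, 312)      # clean poll
assert not poll_reconciles(312, 312, 311)  # a missing ballot forces a closer look
```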

At the end of the day, each box is audibly counted, under the eyes of the scrutineers and the poll supervisor. It takes about an hour to an hour and a half to count all the ballots in a poll. Because the scrutineers are keeping tallies during the count, boxes will sometimes get re-counted if the tallies don’t match.

Having an open process (everyone gets to watch, right down to the individual vote counts for each poll) increases both confidence in the process and the accuracy of the result, because many eyes are working on the problem at once. The poll supervisor reports the totals to his superior in the district, who in turn reports to the central electoral authorities. And the scrutineers independently report the poll numbers to their party headquarters, where they are totaled up to provide the politicians with an early snapshot of the race. Multiple eyes, multiple paths for the data to flow, physical ballots for post-facto processing.

In the event of a really close election, things slow down a lot, because a panel of judges has to sit down in a room and hand-count the complete set of votes for the whole riding. This takes about a week. It’s the difference between parallel processing (each box counted simultaneously on election day) and sequential processing.
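The scaling difference is easy to see in a toy sketch. The per-box counting time is the one quoted above; the box counts are invented for illustration:

```python
# Sketch of the parallel-vs-sequential scaling described above.
# Only the per-box time comes from the post; the box counts are assumptions.
hours_per_box = 1.25  # the post says 1 to 1.5 hours per box

for boxes in (10, 100, 200):
    parallel_hours = hours_per_box        # election night: every box at once
    serial_hours = boxes * hours_per_box  # recount: one panel, box after box
    print(f"{boxes:>3} boxes: parallel {parallel_hours:.2f} h, "
          f"serial {serial_hours:.0f} h")
```

Parallel wall-clock time stays constant no matter how many boxes there are; serial time grows linearly, which is why the recount takes so much longer than election night.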

The reason such a primitive system can work better than all the fancy US computers is that the standards for things like ballots, ballot handling, ballot counting, box security, and so on are all set and managed centrally, by an independent agency (not an elections equipment vendor). It is basic logistics. Standard processes, mandated and followed consistently, make logistically difficult problems (like accurately gathering and counting 2 million individually marked ballots) achievable.

Enough with the machines, USA, get a decent organization and return to the pen and paper!

Open Source on a GSA Schedule

After much wailing and gnashing of teeth, Refractions secured a GSA schedule today (it’s not on their web site yet, but we just got the “contract is in the mail” notification)! For those not in the know, it’s worth an explanation.

GSA

GSA is the US General Services Administration, a catch-all purchasing authority for the US federal government. Government purchasing is difficult, because spending taxpayers’ money requires a good deal more transparency and fairness than needs to be exercised in the private sector. Requests for proposals (RFPs) are a lot of work to write and to evaluate, and many jurisdictions have routed around them by creating the idea of a “standing order”, a pre-negotiated contract for specific products or services. GSA creates standing orders for the entire federal government, so having a GSA contract allows you to sell to a lot of clients with a lot less paperwork.

So this is a big deal for Refractions, and now opens the door to the question: can we sell enough open source (and other) geospatial services to keep our GSA contract? GSA contracts are a use-it-or-lose-it affair, so we have to hit minimum sales targets or they rip it up.

They Called It Off!

Good news from down Redmond way. The SQL Server spatial team has decided to make the coordinate order returned by their STAsText and STAsBinary functions consistent with the existing industry practice: (easting, northing) or (longitude, latitude) or (x, y), depending on how you look at it.

That kind of community responsiveness practically cries out for… a five pound box of chocolates! Note to Isaac and company: don’t eat them all in one sitting, no matter how tempting it might be!

Let's Call the Whole Thing Off...

You say “potato”, I say “potato”; you say “long/lat”, I say “lat/long”. Long/lat, lat/long, lat/long, long/lat, let’s call the whole thing off!

This thread at the MSDN forums is an interesting read, if you are a complete loser. (Disclosure: I am a complete loser.)

Y/X World

In a nutshell, Microsoft thinks the world looks like this. OK, OK, I’m being unfair, what they think is that it makes sense to use a latitude/longitude (y/x) order for ordinates in the Well-Known Text (WKT) and Well-Known Binary (WKB) that their STAsText() and STAsBinary() methods return, respectively. (Pls. see above re: my complete loserness.)
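To see what is at stake, here is a toy sketch of how the same WKT text lands in two different places under the two readings. The parser and the coordinates are invented for illustration; this is not any vendor’s actual API:

```python
# Toy WKT point parser, invented for illustration; not any product's API.

def parse_point(wkt: str) -> tuple[float, float]:
    """Return the two numbers in 'POINT (a b)' in their written order."""
    a, b = wkt.replace("POINT", "").strip(" ()").split()
    return float(a), float(b)

wkt = "POINT (30.0 10.0)"

# De facto consensus reading: written order is (x y) == (longitude latitude),
# which puts this point in South Sudan.
lon, lat = parse_point(wkt)
assert (lon, lat) == (30.0, 10.0)

# The (latitude longitude) reading flips the same text to a point in the
# Sahara, thousands of kilometres away.
lat2, lon2 = parse_point(wkt)
assert (lon2, lat2) == (10.0, 30.0)
```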

This is what the SQL Server spatial team had to say to a user wondering at this behavior:

This is the expected behavior, but as you found there is not a standard industry consensus on the ordering of latitude / longitude coordinates in formats such as WKT and WKB. The OGC SFS document does not cover geographic coordinates, only planar data, so it is not clear that the same ordering is necessary. However, the EPSG definition itself for 4326 defines the axis order as latitude / longitude, and that is what we use and what is defined by other formats such as GML / GeoRSS.

Here is a thread from the OpenGeoSpatial mailing list defining this behavior.

The place where I (surprise!) violently disagree with the Microsoft team is the assertion that “there is not a standard industry consensus on the ordering of … coordinates in … WKT and WKB”. There is, in fact, a massive industry consensus on WKT and WKB coordinate order. If the Microsoft team can find a shipping product that deliberately creates WKT or WKB in lat/long order, I’ll send them a 5lb box of Roger’s Chocolates.

The comment goes on to muddy the waters by talking about GML and GeoRSS and the OGC discussions on the topic of axis order, but it is totally wrong about the core issue: the industry standard order for WKT and WKB. In this case Microsoft is late to the game; they don’t get to set the de facto standard, because there is already a de facto standard, and it is long/lat.

If Microsoft wants to interoperate easily with the standards-based products already in the marketplace, they will implement the de facto standard for their STAsBinary() and STAsText() functions. If they are merely paying lip service to interoperability (“we implemented the standard but for some reason absolutely everybody else is doing it different! who knew?”), they’ll do something else.

The de jure standard is, as the comment correctly notes, well nigh impossible to divine, because the OGC guidance on the subject has been scattered through so many areas, and because there is no explicit guidance on the topic for WKB and WKT. But the de facto standard, the “standard industry consensus” is clear: long/lat.

Update: To clarify the chocolate challenge, products that produce backwards WKB and WKT to satisfy SQL Server (FME, Manifold) don’t count. This is about the industry standard that pre-existed SQL Server.

Drived Products

The news that a third center-line road data provider is starting up is very interesting. In an interview with Directions they are relatively tight-lipped about why they think they can compete with the big boys, but a good guess can be had by looking at their earlier news releases. This is not a road mapping company, this is a computer vision company. Presumably the liberal application of computer vision allows them to combine their GPS and inertial data with road sign readings to build out full navigation-restricted data without a heavy manual data post-processing step.

That still leaves the tricky aspect of actually driving their cars down every street in the country, and I wonder what mapping source they used to ensure they weren’t missing any. The mathematics of driving every road in the country are also daunting: 4 million miles of roads in the USA, at an average speed of 20 miles per hour, works out to 200,000 hours behind the wheel, or 25,000 days of driving at 8 hours a day! That is $2M using $10/hour drivers, plus the capital cost (at least $5M) of the 70+ vehicles needed to do the thing in about a year.
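The arithmetic, worked through explicitly (all figures are the ones in the paragraph above):

```python
# Back-of-the-envelope numbers from the paragraph above.
miles_of_road = 4_000_000
average_mph = 20
hours_per_day = 8
driver_wage = 10  # dollars per hour
days_per_year = 365

driving_hours = miles_of_road / average_mph   # 200,000 hours
driving_days = driving_hours / hours_per_day  # 25,000 days
labour_cost = driving_hours * driver_wage     # $2,000,000
fleet_size = driving_days / days_per_year     # ~68 vehicles for a one-year survey

print(f"{driving_days:,.0f} days, ${labour_cost:,.0f} in wages, "
      f"{fleet_size:.0f} vehicles to finish in a year")
```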

On the other hand, given that Nokia just paid $8B for NavTeq, the cost of entry into this marketplace is incredibly low — $20M maybe? I think I’m in the wrong business, time to hit the road!