Tokenization and Your Private Data (2)

So, as we saw in Day 1, the BC government’s vendors (and thus, by extension, the BC government) are hot to trot to use the Salesforce.com cloud CRM to store the personal data of BC citizens. But BC privacy law does not allow that. Whatever will the government do?

Enter stage left: “tokenization”. The CIO has recommended tokenization technology for Ministries looking to use Salesforce.com and other cloud services to manage private information:

Using tokenization – a method of substituting specified data fields for arbitrary values – these solutions allow for the use of foreign-based services while remaining within the residency-based restrictions of FOIPPA.
Bette-Jo Hughes, Oct 2, 2013

Tokenization is a strategy that takes every word in an input text, and replaces it with a random substitution “token”, and keeps track of the relationship between words and tokens. So, the input to a tokenization process would be N words, and the output would be N random numbers, and an N-entry dictionary matching the words to the numbers that replaced them.

Cryptography buffs will note that this is just a one-time pad, an old but unbreakable scheme for encoding messages, only operating word-by-word instead of letter-by-letter.

This seems like a nice trick!

Input          Dictionary       Output
Paul Ramsey    Paul   = rtah    rtah hgat
Paul Jones     Ramsey = hgat    fasp nasd
Tim Jones      Paul   = fasp    yhav imfa
               Jones  = nasd
               Tim    = yhav
               Jones  = imfa
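The scheme the table illustrates is simple enough to sketch in a few lines of Python (my own illustration of the idea, not any vendor’s implementation): each occurrence of a word gets a fresh random token, and the dictionary mapping tokens back to words stays local.

```python
import secrets

def tokenize(text):
    """Replace every word with a fresh random token, one-time-pad style.

    Each occurrence of a word gets its own token (as in the table above,
    where 'Paul' maps to both 'rtah' and 'fasp'), so repeated words leak
    nothing. Returns the tokenized text plus the dictionary needed to
    reverse the substitution.
    """
    dictionary = {}  # token -> original word, stored locally
    output = []
    for word in text.split():
        token = secrets.token_hex(4)  # arbitrary random value
        dictionary[token] = word
        output.append(token)
    return " ".join(output), dictionary

def detokenize(tokenized, dictionary):
    """Reverse the substitution using the locally stored dictionary."""
    return " ".join(dictionary[t] for t in tokenized.split())
```

Round-tripping works: `detokenize(*reversed(tokenize("Tim Jones")))` recovers the original. But note that searching the tokenized output for “Jones” finds nothing, since each occurrence got a different token.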

If you are clever, you can put a tokenizing filter between your users and American web sites like Salesforce.com, and have the tokenizer replace the words you send to Salesforce with tokens, and replace the tokens Salesforce sends you with words. So the data at Salesforce will be gobbledegook, but what you see on your screen will be words. Magic!

If all we wanted to do was just store data securely somewhere outside of Canada, and then get it back, “tokenization” would be a grand idea, but there’s a hitch.

  • First, storing tokenized data means storing three times the volume of the original: one copy of tokens stored at Salesforce, plus a locally stored dictionary that contains both the original words and the tokens. You get no benefit from the cloud from a storage standpoint (in fact it’s worse: you’re storing twice as much local data), and you get no redundancy benefit, since if you lose your local copy of the dictionary the cloud data becomes meaningless.
  • Second, and most importantly, this whole exercise isn’t about storing data, it’s about making use of a customer relationship management (CRM) system, Salesforce.com, and secure tokenization, as described above, is not consistent with using Salesforce effectively.

Tomorrow, we’ll discuss why this most excellent “tokenization” magic doesn’t work if you want to use it inside a CRM (or any other system that expects its data to have meaning).

Tokenization and Your Private Data (1)

One morning this winter, while I was sipping my coffee at the cafe below our office, a well-dressed man and woman sat down at the table next to me, and started talking. Turns out, they were my favourite kind of people — IT people! They were going to bid on the Integrated Decision Making project, and were talking about my favourite systems integrator, Deloitte.

“Is Deloitte trying to bring ICM and Siebel into this project?” she asked.

“No, not anymore,” he replied. “Now they are really pushing Salesforce.”

Now this was interesting! Chastened by their failure to shoehorn social services case management into a CRM, Deloitte has adroitly pivoted and is trying to shoehorn natural resource permitting into … a cloud CRM.

(I should parenthetically point out that, unsurprisingly, the SALES people in our company find Salesforce very useful in coordinating and tracking their SALES activities.)

Certainly pushing a platform that is actually growing in usage makes more sense than pushing one that end-of-lifed a decade ago, but still, again with the CRM?

Deloitte isn’t being coy with their plans; they are selling them to the highest levels of the government. On October 7, 2013, the BC CIO spent two and a half hours enjoying the hospitality of Deloitte and Salesforce.com at a “BC government executive luncheon” on the topic “Innovation, Transformation and Cloud Computing in the Public Sector”.

And there’s another wrinkle. Salesforce.com is a US-based cloud service provider, and our Freedom of Information and Protection of Privacy Act (FOIPPA) says that personal data must be stored in Canada. Salesforce.com is also a US legal entity, which means it is subject to the PATRIOT Act, which allows authorities to access personal data without notifying the subject of the search. That is also not allowed by BC’s FOIPPA.

What is an ambitious system integrator with a hammer suitable for every nail to do? Not change hammers! That would be silly. Far better to try and get an exemption or figure out a workaround. Workarounds add nice juicy extra complexity to the hammer, which can only help billable hours.

More on the workaround, tomorrow.

Keynote @ FME User Conference

FME was one of the first geospatial tools I learned at the start of my career, back in the mid-90s, and getting invited to keynote the quintennial FME Users Conference this year was quite an honour, so I wrote up a special keynote just for them.

When is an IT project just an IT project?

And when is it something more?

Every year, I report on the progress of IT outsourcing in BC (news flash: it keeps going up, 2011, 2012, 2013) and marvel at the sums we lavish on international consultancies, fees that largely march offshore, generating no local innovation or economic growth.

Last fall, I came across a news release from the Ministry of Health describing an $842 MILLION “Clinical and Systems Transformation Project”. I now realize I’ve not been tracking a significant seam of IT spending: the systems being commissioned by the five regional health authorities and their central services arm, the Provincial Health Services Authority.

Indeed, a quick perusal of the 2012/13 PHSA suppliers list shows a $50M spend on IBM, and an $11M spend on HP in just one year. That’s enough to change my annual spending tracker quite a bit!

So, IBM won the new “Clinical and Systems Transformation Project”, worth $842 MILLION over 10 years. I wondered what that RFP looked like, so I asked for it, and was refused, so I FOI’ed it, and it came back. It’s 500 pages long. Have a look.

Fun sidebar: On page 186, in the “economic model” of the RFP, they direct that “proponents are to include 4% growth per year in infrastructure (e.g. storage capacity, network bandwidth, processing capacity, etc.) needs over the Term.” Any readers see a problem modelling IT capacity requirements at 4% growth per year over 10 years? Hint: A 2003 iMac shipped with 256 MB of RAM; a 2013 iMac ships with 8 GB of RAM: that’s 32 times more capacity. By contrast, 4% compounding over 10 years generates only about a 50% increase in capacity over the decade. Think those terms will need to be renegotiated?
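The arithmetic is easy to check with a quick back-of-the-envelope calculation:

```python
# 4% annual growth, compounded over the 10-year Term of the RFP
capacity = 1.04 ** 10
print(f"{capacity:.2f}x")  # roughly 1.48x, i.e. about a 50% increase

# versus the hardware growth cited above: 256 MB -> 8 GB of RAM
ram_growth = (8 * 1024) / 256
print(f"{ram_growth:.0f}x")  # 32x over the same decade
```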

It’s a long read, but fortunately there’s a really interesting bit right away, in the Mandatory Requirements:

Proponent is willing and able to transition any Public Sector union agreements relevant to the Managed Services to their organization, if required

Whoa! This isn’t just an IT systems agreement after all, it’s an outsourcing deal.

The government seems to have learned little from the experience of BC Hydro outsourcing to Accenture or Medical Services Plan to Maximus, or from reports by the Auditor General, or even their own consultants who reviewed outsourcing from 2001-2010 and noted that:

  • Contracts were structured towards a specific solution or specific outputs rather than a desired outcome
  • Contracts negotiated in isolation gave the same scope of services to multiple vendors
  • The procurement process resulted in contracts that while defined, are no longer what is required
  • Risk transfer objectives were not met
  • There was no consolidated vendor management
  • There was no central management of the deals or the benefits achieved

The “Alternative Service Delivery Secretariat” wound down in 2010, but the government is still hard at it, now quietly preparing to outsource the clinical systems of three health authorities to IBM, for $84M a year over 10 years. Significant portions of critical government operations are being transferred beyond direct government control for very long periods of time.

Perhaps the managers who pushed this solution didn’t trust their own staff, or themselves, to successfully bring an ambitious project to conclusion. They didn’t want to “take the risk” so they took the “safe” option. They need to spend some time behind the velvet curtain in organizations like IBM or Accenture: the only results that matter to those organizations are the quarterly results.

There will be some good people in them, and some bad ones, but the level of competence or capability won’t be orders of magnitude better than you could build yourself in-house. And as organizations, as corporations, they have only one bottom line, and it’s theirs, not ours.

Examples are not Normative

Once, when I still had the energy, I was reading an Open Geospatial Consortium specification document, and found an inconsistency between a directive stated in the text, and the examples provided to illustrate the directive. This seemed pretty important, since most “Humans” use the examples and ignore the text, so I raised it with the author, who replied to me:

“Examples are not normative”

To me, this seemed to summarize in four words everything there was to dislike about the standards community: dismissive, self-referential (“normative”? really?), and unconcerned with real-world practice. One of the reasons I no longer have the energy.