COTS uber alles?

I continue to follow the ongoing $180M-and-counting IT debacle of “integrated case management” in British Columbia with interest, perhaps because it’s a local catastrophe and not some far-flung disaster.

The latest tidbit is an independent consultant’s review of ICM from the perspective of the Ministry of Children and Families (MCF). I hadn’t previously appreciated the extent to which the hydra-headed nature of ICM as a project of both MCF and the Ministry of Social Development (MSD) was contributing to dysfunction.

In politic terms, the consultants point out that MCF was the weaker partner in the partnership, and therefore got steamrolled during the design phase. MSD had a stronger team, was better prepared, and got what they wanted. MCF was the red-headed step-child.

Another issue the consultants noted was the way the “COTS” requirement and “no customization” made a bad situation worse. No matter what the clients wanted, there was an arbitrary external rule always in play:

The ICM solution [should] not be based on a customized system, but as much as possible rely on an “out of the box” product (with necessary configuration to meet the respective needs of MCFD and MSD).

If you knew the magic incantations, you could sometimes get an exemption, but the MCF team didn’t know how to play the game:

It became clear in our interview process that while the project principles did support exceptions to the “no customization” rule where there was a solid business rationale, current MCFD leadership did not recognize they could make this appeal, and so remained fully constrained by the design and implementation of a single instance untailored to MCFD needs.

And so the final product rolled out was a design disaster for child protection:

The decision to implement a single instance of the Siebel product required that all users (regardless of ministry or program) share a common set of forms and data attributes, even though they touch on completely different topics and clients, and speak very different practice “languages”. This has also led to many forms being developed with redundant fields and or labels that are meaningless to the majority of users.

In fact, the “phase 2” rollout was such a disaster that the whole COTS “no customization” rule seems to be up for review! Tucked into the consultant report is this little gem of a status update:

The ICM/Deloitte Project Team has also been working on the design and development of an external service provider portal which will utilize a custom user interface (i.e. leveraging the information within Siebel but presented through a completely separate web application). While the service provider portal does not have any direct relationship to MCFD’s Child Protection Services, it does demonstrate that the project team can design and develop a solution that builds on the Siebel platform with a customized user interface and significantly improved usability. This approach will be of value as we consider the revised child protection solution and we see it as a very positive step.

Plan B is now officially under way! The generic Siebel interfaces are being ditched in favour of (gasp!) custom designed user interfaces fit to purpose. Will the Siebel interfaces all be eventually phased out? That’s my prediction.

The only question left is how long MSD and MCF will have to pay Siebel licensing for a system that has been delivered over-budget and with insufficient functionality largely to cater to the predictable limitations of the Siebel software and the “no customization” COTS philosophy. A long, long time, I fear.

Is building an enterprise system a capital expense?

First, terminology: operating versus capital expenses. Bah! Accounting! However, it’s important stuff. Wikipedia provides a good example of the difference:

For example, the purchase of a photocopier involves capital expenditures, and the annual paper, toner, power and maintenance costs represents operating expenses.

The key thing to remember is that a capital expense is supposed to convert cash into an asset.

Second, application: most enterprise information system builds these days are funded as capital expenses. Money is spent, and at the end of the day the organization places an entry on the balance sheet saying “System Z is worth X million dollars”.

Third, contention: this common IT accounting practice is bullshit.

The reason it is bullshit is that the asset has to be placed on the books with a value, some dollar figure that represents what it is worth. This isn’t just some made up number, it’s important. This number is a component of the total asset value of the organization, and if you are adding up the value of the organization, it will be added in along with the cash, the real estate, the fixtures, all the other things that we know do have value.

Does the enterprise information system have value? How much? Where does that number come from?

Is the information system value a re-sale value? No. Unlike the photocopier from the Wikipedia example, the enterprise information system has no value outside the organization that built it: it’s sui generis, custom-built for the purposes of one organization.

Is the information system value a replacement cost? No. Governments build things like bridges and highways that don’t necessarily have a re-sale value (who is going to buy them?) but still have a provable value in terms of their replacement cost. If it takes $100M to build a bridge, it’s a fair bet that it’ll take $100M to build a second one just like it. Is this true of enterprise information systems? If I build a billing system and it costs me $1M, will it cost you the same thing to build a second one? If you have access to my source code and specifications, you could probably build an even nicer one for 1/10 the cost or less, since you wouldn’t have to spend any time at all discovering the requirements or doing data cleansing. There’s no expense in materials and relatively little of the labor value ends up embodied in the final product–the system value is not the replacement cost.

Is the information system value durable? No. Long-lived public infrastructure may not be re-sellable, but it often has a useful life span reckoned in multiple decades. Given that, it’s fair to say that the cash involved in building it has not been spent, but has been converted into a fixed asset. Do enterprise information systems have that kind of durability? Do I even have to ask?

Does the information system value reflect a one-time acquisition cost? No. The $3B (gulp) Port Mann bridge newly opened on BC’s Lower Mainland will have an annual maintenance cost of perhaps $30M a year, 1% of the capital cost. The asset is built, and we get to keep using it almost for free (except for the tolls to cover the loan, ha ha!). Is this true of enterprise information systems? In my experience, information systems budget 10-20% of build cost for expected annual maintenance. So, it’s very different again.
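To put rough numbers on that contrast, here is a minimal sketch of the arithmetic. The bridge figures and the 10-20% range come from the paragraph above; the $100M system build is only an illustrative figure, and the function names are made up for this example.

```python
def annual_maintenance(build_cost_m, rate_pct):
    """Annual maintenance in $M, given a build cost in $M and a percent rate."""
    return build_cost_m * rate_pct / 100


def years_until_maintenance_equals_build(rate_pct):
    """Years of maintenance spending needed to match the original build cost."""
    return 100 / rate_pct


# Bridge: $3B build at roughly 1% a year.
print(annual_maintenance(3000, 1))               # $M per year
print(years_until_maintenance_equals_build(1))   # years

# Hypothetical $100M information system at 10% and 20% a year.
for rate in (10, 20):
    print(rate, annual_maintenance(100, rate),
          years_until_maintenance_equals_build(rate))
```

At 1% a year it takes a century of maintenance to equal the build cost; at 10-20% it takes five to ten years, which is roughly the useful life of the system itself.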

Is there any case at all to be made that enterprise information systems can be treated as assets, and hence budgeted as capital expenses? Yes. But it requires that the asset value be assessed very conservatively (the whole build budget is not indicative of final value), and that the value depreciate very quickly (the system has a relatively short lifespan, years not decades).

But rapid asset depreciation is just as hard on a balance sheet as operating spending is! Build a $100M system and depreciate it at 10% a year, all you’ve done is concentrate $100M of IT spending into a very short period of time and spread out the depreciation over a decade.

So, skill testing question for all you IT practitioners out there. Who will get the better results:

  • the manager who spends $100M in one year on a system build and depreciates his asset over the ensuing decade? or,
  • the manager who spends $10M a year over a decade in incremental system enhancements and improvements?

Note that both approaches have exactly the same effect on the organizational balance sheet. Take your time, don’t rush your answers.
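For anyone who wants to check the arithmetic, here is a minimal sketch of the two spending patterns. It assumes straight-line depreciation (10% of the original cost each year) and works in $M; the function names are invented for the example.

```python
def big_bang(build_cost, years):
    """Spend everything in year 1, then depreciate straight-line."""
    cash = [build_cost] + [0] * (years - 1)
    expense = [build_cost // years] * years
    # Undepreciated "asset" left on the books at the end of each year.
    book_value = [build_cost - sum(expense[:y + 1]) for y in range(years)]
    return cash, expense, book_value


def incremental(annual_spend, years):
    """Spend the same amount every year and expense it as you go."""
    cash = [annual_spend] * years
    expense = [annual_spend] * years
    book_value = [0] * years
    return cash, expense, book_value


bb_cash, bb_expense, bb_book = big_bang(100, 10)
inc_cash, inc_expense, inc_book = incremental(10, 10)

print("year  big-bang cash/expense/book   incremental cash/expense/book")
for y in range(10):
    print(f"{y + 1:>4}  {bb_cash[y]:>4} {bb_expense[y]:>4} {bb_book[y]:>4}"
          f"            {inc_cash[y]:>4} {inc_expense[y]:>4} {inc_book[y]:>4}")
```

The recognized expense is the same $10M in every year under both patterns. What differs is when the cash leaves the building, and how much not-yet-depreciated "asset" the first manager has to carry (and defend) along the way.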

Accounting for enterprise information systems as capital expenses is a mistake. It’s dubious from an accounting perspective, because the “asset” on the books isn’t re-sellable, doesn’t hold its value, and doesn’t cost nearly its book value to replace. And it’s dubious from a practical perspective because it forces system development and maintenance into an incredibly risky and inefficient spending pattern.

Don’t do it if you can help it.

What's so hard about that download?

Twitter is a real accountability tool. This post is penance for moaning about things in public.

Soooo…. last Friday, while cooling my heels in Denver on the way home, I took another stab at Chad Dickerson’s electoral district clipping problem, and came up with this version.

I ended up taking the old 1:50K “3rd order watershed” layer, using ST_Union to generate a maximal provincial outline, using ST_Dump and ST_ExteriorRing to get out just the land boundaries (no lakes or wide rivers), used ST_Buffer and ST_Simplify to get a reduced-yet-still-attractive version, differenced this land polygon from an outline polygon to get an “ocean” polygon, then (as I did previously) differenced that ocean from the electoral districts to get the clipped results. Phew.
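For the curious, the chain of steps looks roughly like the following. This is a sketch, not the query I actually ran: the table names (watersheds, outline, districts), the geom and id columns, and the buffer and simplify distances are all placeholders, and it assumes the outline table holds a single polygon. The SQL is built in a small Python function so the placeholders are easy to swap; the result can be handed to any PostgreSQL driver.

```python
def build_clip_sql(watersheds="watersheds", outline="outline",
                   districts="districts", buffer_m=100, tolerance_m=100):
    """Return one PostGIS statement that clips districts to the land outline.

    All table names and distances are hypothetical placeholders.
    """
    return f"""
WITH land_union AS (
    -- Merge every watershed polygon into one maximal outline.
    SELECT ST_Union(geom) AS geom FROM {watersheds}
),
land_parts AS (
    -- Split the merged multipolygon into its component polygons.
    SELECT (ST_Dump(geom)).geom AS geom FROM land_union
),
land AS (
    -- Rebuild each part from its exterior ring only (dropping lakes and
    -- wide rivers), then buffer and simplify to cut the vertex count.
    SELECT ST_Simplify(
             ST_Buffer(ST_Union(ST_MakePolygon(ST_ExteriorRing(geom))), {buffer_m}),
             {tolerance_m}) AS geom
    FROM land_parts
),
ocean AS (
    -- Everything inside the outline polygon that is not land.
    SELECT ST_Difference(o.geom, land.geom) AS geom
    FROM {outline} AS o, land
)
-- Remove the ocean from each electoral district.
SELECT d.id, ST_Difference(d.geom, ocean.geom) AS geom
FROM {districts} AS d, ocean;
"""


print(build_clip_sql())
```

The table names are interpolated directly into the string, which is fine for a one-off script but not something to expose to untrusted input.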

And then I complained on the Twitter about the webstacle that now exists for anyone like me who wants to access those old 1:50K GIS files.

And the OpenData folks in BC, to their credit, wonder what I’m on about.

So, first of all, caveats:

  • The obstacles to access to this data were constructed years before open data existed as an explicit government initiative in BC. This is not a problem with the work the open data folks have done.
  • It could certainly be a whole lot harder to access, it is still theoretically available for download, I don’t need to file an FOI or go to court or anything like that to get this data.

This is a story of contrasts and “progress”.

Back when I actually downloaded these GIS files, in the early 2000s, I was able to access the whole dataset like this (the fact that I can still type out the process from memory should be indicative of how useful I found the old regime):

ftp ftp.env.gov.bc.ca
cd /dist/arcwhse/watersheds/
cd wsd3
mget *.gz

Here’s how it works now.

I don’t know where this data is anymore, so I go to data.gov.bc.ca. This is an improvement, I don’t have to (a) troll through Ministry sites first trying to figure out which one holds the data or (b) not troll through anything because I have no idea the data exists.

Due to the magic of inflexible design standards, the data.gov.bc.ca site has two search boxes, one that does what I want (the smaller one, below), and one that just does a google search of all the gov.bc.ca sites (the larger one, at the top). Ask me how I figured that out.

So, I type “watersheds” into the search box and get 10 results. Here I have to lean on my domain knowledge and go to #10, which is the old 3rd order watersheds layer.

The dataset record is pretty good, my only complaint would be that unlike the old FTP filesystem there’s no obvious indication that there are other related data sets that together form a collection of related data, the watershed atlas. The keywords field gets towards that intent, but a breadcrumb trail or something else might be clearer. I think the idea of a data collection made of parts is common to a lot of data domains, and might help people organically discover things more easily.

Anyhow, here’s where things get “fun”, because here we leave the domain of open data and enter the domain of the “GIS data warehouse”. I click on the “SHP” download link:

The difference between hosting data on an FTP site and hosting it in a big ArcSDE warehouse is that the former has very few moving parts, is really simple, and practically never goes down, while the latter is the opposite of that.

Let’s just skip the convenient direct open data link, and try to download the data directly from the warehouse. Go to the warehouse distribution service entry page:

I like the ad for Internet Explorer, that’s super stuff. It’s almost like these pages are put up and never looked at again. We’ll enter as a guest.

Two search boxes again, but at least this time the one we’re supposed to use is the big one. Thanks to our trip through the data.gov.bc.ca system, we know that typing “WSA” is the thing most likely to get us the “watershed atlas”.

Boohyah, it’s even the top set of entries. Let’s compare the metadata for fun (click on the “Info” (i)).

Pretty voluminous, and there’s a tempting purple download button up there… hey, this one works!

Hm, it wants my email address and for me to assent to a license… I wonder what the license is?

Why make people explicitly assent to a license that is only implicitly defined? Fun. Ok, fine, have my email address, and I assent to something or other. I Submit (in both senses)!

And now I wait for my e-mail to arrive…

Hey presto, it’s alive! (Sunday 11:27AM) But no data yet.

W00t! Data is ready! (Sunday 11:30AM)

Uh, oh, something is wrong here. My browser perhaps? Let’s try wget.

wget ftp://slkftp.env.gov.bc.ca/outgoing/apps/lrdw/dwds/LRDW-1235441-Public.zip
--2012-12-02 11:33:10--  ftp://slkftp.env.gov.bc.ca/outgoing/apps/lrdw/dwds/LRDW-1235441-Public.zip
           => ‘LRDW-1235441-Public.zip’
Resolving slkftp.env.gov.bc.ca (slkftp.env.gov.bc.ca)... 142.36.245.171
Connecting to slkftp.env.gov.bc.ca (slkftp.env.gov.bc.ca)|142.36.245.171|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /outgoing/apps/lrdw/dwds ... 
No such directory ‘outgoing/apps/lrdw/dwds’.

This is awesome. OK, back to the FTP client!

Connected to slkftp.env.gov.bc.ca.
220-Microsoft FTP Service
220 This server slkftp.env.gov.bc.ca is for British Columbia Government business use only.
500 'AUTH GSSAPI': command not understood
Name (slkftp.env.gov.bc.ca:pramsey): ftp
331 Anonymous access allowed, send identity (e-mail name) as password.
Password:
230 Anonymous user logged in.
Remote system type is Windows_NT.
ftp> dir
200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
dr-xr-xr-x   1 owner    group               0 Jul 23  2010 MidaFTP
dr-xr-xr-x   1 owner    group               0 Aug 25  2010 outgoing
226 Transfer complete.
ftp> cd outgoing
250 CWD command successful.
ftp> dir
200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
dr-xr-xr-x   1 owner    group               0 Jul  2  2010 apps
226 Transfer complete.
ftp> cd apps
250 CWD command successful.
ftp> dir
200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
d---------   1 owner    group               0 Jul  2  2010 lrdw
d---------   1 owner    group               0 Jul  2  2010 mtec
226 Transfer complete.
ftp> cd lrdw
550 lrdw: Access is denied.

So, the anonymous FTP directory where the jobs are landing is not readable (by anyone). Oh, and serious demerits for running an FTP server on Windows (NT!).

The whole data warehouse/data distribution thing substantially pre-dates open data, and actually one of the reasons it (a) exists and (b) is so f***ing terrible is because at the time it was conceived and designed BC was still trying to sell GIS data, so the distribution system has crazy layers of security and differentiation between free and non-free data (even though it still forces you to go through “checkout” for free data (which all data now is)).

My request was for only 50MB of data, and the system is (theoretically) willing to give it to me in one chunk. If I had wanted to access all of TRIM (the 1:20K BC planimetric base map product) I would be, as the French say, “up sh** creek”.

The current process is also, clearly, not amenable to automation. If I wanted to regularly download a volatile data set, I would also be, as in the German proverb, FUBAR.
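By contrast, the old FTP layout could be scripted in a dozen lines. A minimal sketch using Python’s standard ftplib, assuming the old host and directory were still live (they may well not be); the function names and the local wsd3 directory are made up for the example.

```python
import fnmatch
import os
from ftplib import FTP


def fetch_matching(ftp, remote_dir, pattern, dest_dir):
    """Download every file in remote_dir whose name matches pattern.

    `ftp` is a logged-in ftplib.FTP connection (or anything with the same
    cwd/nlst/retrbinary methods). Returns the list of file names fetched.
    """
    ftp.cwd(remote_dir)
    names = [n for n in ftp.nlst() if fnmatch.fnmatch(n, pattern)]
    os.makedirs(dest_dir, exist_ok=True)
    for name in names:
        with open(os.path.join(dest_dir, name), "wb") as out:
            ftp.retrbinary("RETR " + name, out.write)
    return names


def download_wsd3(dest_dir="wsd3"):
    """The scripted equivalent of the old `mget *.gz` session."""
    with FTP("ftp.env.gov.bc.ca") as ftp:
        ftp.login()  # anonymous login
        return fetch_matching(ftp, "/dist/arcwhse/watersheds/wsd3",
                              "*.gz", dest_dir)
```

Calling download_wsd3() from a nightly cron job is all the “automation” a volatile data set would have needed. There is no way to write the equivalent against an e-mail-me-a-link checkout process.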

So, there you go, open data folks. I am fully cognizant that the problem is 100% Not Of Your Design or Doing, I watched it happen in real time (and even won a contract to maintain the system after it was built! gah!). But it is also, still, now many years on, a Problem.

Remember, I originally got the data like this.

ftp ftp.env.gov.bc.ca
cd /dist/arcwhse/watersheds/
cd wsd3
mget *.gz

It ain’t Rocket Science, we’ve just made it seem like it is.

ICM and ExaData

I went to an Oracle Users Group meeting yesterday afternoon, to see a presentation by Marcin Zaranski on the Integrated Case Management system’s use of Oracle ExaData hardware.

Disclosure: I went expecting to be shocked, shocked at yet another criminal waste of money on the part of ICM.

ExaData is Oracle’s fairly successful attempt to turn the hardware engineering skills they acquired with Sun Microsystems into something with unique marketability. I’d say they’ve succeeded, which is good news for the excellent engineers at Sun. By combining the kinds of hardware database optimizations pioneered at Netezza and Teradata with Sun’s overall server engineering prowess and their industry-leading sales and marketing team, Oracle has a winner. It might not be the best, but they are going to sell more of them than anyone else.

ExaData (I’m going to speak in the singular, about the ExaData Database Machine even though the “Exa” prefix has now been splashed across a wide range of Oracle “engineered systems”) is basically an enterprise appliance. A database in a box, where the “box” is a server rack. It ships with the database pre-installed and configured for the underlying hardware. The underlying hardware not only includes the kind of monitoring, reliability and redundancy that Sun fanboys like myself have come to expect, but also includes custom storage modules that can push portions of SQL queries down to just above the disk heads, dramatically improving query performance, particularly for OLAP workloads.
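The query push-down idea is easier to see with a toy. The sketch below is only an illustration of the general principle of filtering next to the storage instead of in the database server; it is not how Oracle implements it, and the data and function names are invented.

```python
# A toy "table" of 1000 rows sitting on the storage side.
ROWS = [
    {"id": i, "region": "north" if i % 4 == 0 else "south", "amount": i * 10}
    for i in range(1000)
]


def scan_without_pushdown(rows, predicate):
    """Storage ships every row; the database server does the filtering."""
    shipped = list(rows)
    result = [r for r in shipped if predicate(r)]
    return result, len(shipped)


def scan_with_pushdown(rows, predicate):
    """Storage applies the filter itself and ships only matching rows."""
    shipped = [r for r in rows if predicate(r)]
    return shipped, len(shipped)


def is_north(row):
    return row["region"] == "north"


plain_result, plain_shipped = scan_without_pushdown(ROWS, is_north)
pushed_result, pushed_shipped = scan_with_pushdown(ROWS, is_north)

print("rows shipped without push-down:", plain_shipped)
print("rows shipped with push-down:   ", pushed_shipped)
```

Both scans return the same answer; the difference is how many rows have to cross the wire between storage and the database server, which is why the benefit is biggest for scan-heavy analytic queries.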

It’s really clever technology, but the true cleverness is the sales pitch, which Zaranski touched on and Oracle rep Dev Dhindsa hammered us over the head with during his talk: because ExaData is basically an appliance, all the coordination costs of getting systems and network administrators to interface with database and application administrators go away. It’s a technology fix for the organizational problem of IT silos, and the pitch to the database and apps departments is simple: “buy this product so you don’t have to talk to those f***ers in system admin and networking anymore.”

And any pitch to IT that involves talking to people less is a stone cold winner.

So, back to ICM. The justification for ICM buying ExaData was to alleviate performance problems experienced in their phase 1 rollout to about 1800 users before they rolled out to 8000 users in phase 2. The result: success! They didn’t have any complaints in phase 2… about performance.

After his presentation, I asked Zaranski how much ExaData cost the ICM project, and he would not provide a number, presumably thanks to the magic of “secret contracts” with Oracle (pricing is a “trade secret” and thus protected from FOI disclosure, one of the many counterproductive consequences of the “third party confidential” exception in the BC FOI law).

However, later the Oracle hardware rep was nice enough to tell me the list price for ExaData: $250K for a “quarter rack”. ICM purchased two of those (one for production, one for fail-over), presumably for less than list. It’s a lot of money for servers, but within the context of a $200M project I find it hard to get worked up. It makes their system run faster, which makes it less awful for the users. And the cost of the ExaData hardware will look small next to the cost of the Oracle and Siebel software licenses that are going to run on it.

Way before the project was forced to buy top-end hardware to coax reasonable performance out of their application, the clusterf*** that is ICM was already baked in: by the decision to simultaneously integrate so many systems; by the decision to use the “COTS” Siebel solution; and by the decision to outsource to expensive international consultancies.

So, enjoy your cool hardware ICM, it’s pretty boss.

Best moment: In his presentation, Zaranski repeated the ICM mantra: that one of the big wins is replacing the 30-year-old “legacy” systems previously doing the social services records keeping. “Legacy” is a favourite put-down of all IT presenters. “Legacy” software is crufty old stuff, in the process of being phased out, unlike the cool software you work with. So I got a perverse kick when Dev Dhindsa, in praising Oracle’s new “cloud enabled” Fusion Middleware contrasted it favorably with their suite of “legacy” applications, such as PeopleSoft, JD Edwards, and … Siebel, the software ICM is using to replace the “legacy” social services systems.

Spatial IT vs GIS

So, Stephen Mather has taken a crack at analyzing my Spatial IT meme.

And why then the artificial distinction between GIS and Planning? If GIS is Planning technically embodied, should they not be conflated? Two reasons why not. One: The efficacy of GIS can be hindered by slavishly tying it to Planning in large part because there is wider and deeper applicability to GIS than to Planning’s typical functions. Lemma: Paul is partially right.

Stephen’s found a weak seam in my argument, and it’s around the planning aspects of GIS. There’s a place where GIS provides the interface between raw data and planning decisions, which remains:

  • high touch and interpersonal;
  • qualitative and presentational;
  • ad hoc and unpredictable.

This is the GIS that is taught in schools, because it’s the “interesting” GIS, the place where decision meets data.

However, as we know, GIS courses are just the bait in the trap, to suck naïve students into a career where 90% of the activity is actually in data creation (digitization monkey!) and publication (map monkey!), not in analysis. The trap that “GIS” has fallen into is to assume that these low-skill, repetitive tasks are (a) worth defending and (b) should be done with specific “GIS technology”. They aren’t, and they shouldn’t; they should (and, pace Brian Timoney, will) be folded into generic IT workflows, automated, and systematized.

That will leave the old core of “real GIS” behind, and that’s probably a good thing, because training people for analysis and then turning them into map monkeys and digitization monkeys (and image color-balancing monkeys, and change detection monkeys) is a cruel bait-and-switch.