PgSQL on EC2

The theory behind putting a PostgreSQL (and PostGIS) instance on an Amazon EC2 instance with an Elastic Block Store (EBS) file system underneath is pretty straightforward, even for big databases. But when you want those databases to show the kind of properties we have come to expect from our systems, like durability, throughput, and reliability, things get much harder.

This thread on pgsql-general was very illuminating to me. Among the tidbits:

Let’s be clear here, physical I/O is at times terrible. :)

There’s no way we could run this database on a single EBS volume.

We had to fail over to one of our spares twice in the last 1.5 years. Not fun. Both times were due to instance failure.

Basically the assumptions of AWS architecture (virtual instances will be less reliable than real world computers, but that doesn’t matter because getting a new one is really easy) don’t map well with the requirements of running a classic production database.

There are probably some engineering solutions around for this (GlusterFS, for example), but the core PgSQL would need some serious work and would end up looking a lot more like Oracle RAC than the current single-machine set-up.

More Stories from the Future of Computing

From the Jobs keynote of yesterday, a slide with a quote from Theo Gray of Wolfram, regarding the popular “Elements” iPad application:

I earned more on sales of The Elements for iPad in the first day than from the past 5 years of Google ads on periodictable.com.

Quoth Jobs, “That’s what I like to hear from you guys.” Audience whoops.

Right now the walled garden is kicking the jungle’s ass, but for how long? It’s incredibly interesting that, for the moment, the old school revenue model of application sales is actually besting the new school free-with-strings (ads) model that we were told was the Future. Perhaps once HTML5 application quality gets up to the level of fit and finish that the current crop of native apps is providing, we will flip back again.

I think, for example, of a stock market application. Would people pay a buck for a really excellent application that “just works” in a clean and uncluttered way for displaying current information, research, blah blah blah, instead of just going to Yahoo! Finance? The information is available for free (just like the periodic table!) but a really excellent encapsulation of that information might be compelling enough to pay for. The walled garden starts to fall apart where it interfaces with the jungle… once the application has to link out to things like company reports and other unstructured pieces of the raw internet, it regains the clunkiness of the old browser experience. So why not start with the browser?

For geo, I think that sites like GeoCommons, which have applied a baseline level of structure to a wide swath of data, are fertile grounds for the “app treatment”. An application that provides superior interactive access to their data archives would be an alternative monetization path for leveraging their growing holdings of structured GIS data.

Interesting times!

Finding Corrupt PostgreSQL Data Files

While PostgreSQL itself will never create corrupt data files, that doesn’t stop other processes or hardware failures from corrupting the files underneath the database, which can cause database crashes. Josh Williams of End Point provides a super rundown of how to track down and repair a file corruption.

Who's Your Dealer?

The mapbutcher thinks it’s OK to get hooked on proprietary software:

People are interested in the ‘express’ editions because on the surface of it the marketing works on them, they’re familiar with the brand and are attracted by the ‘free’ carrot dangling from the end of the stick. Open Source software starts from that position – it’s already free so open source projects need another carrot to get us hooked.

To which I can only say “Simon! Look at yourself in the mirror, man! Do you want to end up whoring yourself down on Dalgety Street to pay for your ESRI habit?”

OK, metaphor is tricky and fun, because there are so many ways to approach a metaphor. And for this one, it’s easy to get distracted by the “drug” side, but the point of my metaphor is not that addictive drugs put you in a subordinate relationship to the drug (though some do) but that they put you in a subordinate relationship to the dealer.

The reason we can all survive and function in society despite our crippling addiction to oxygen is because oxygen is free and plentiful. The pusherman isn’t trying to hook new customers because he believes that drugs are wonderful, he’s doing it because he wants their money.

Software is “addictive” to organizations. Once you choose a piece of software and implement it, you’re going to be “addicted”. It’s going to be hard to change. There will be withdrawal symptoms. Given that fact, what kind of software do you want to use? Software that is as free as the air you breathe? Or software that is only available on terms dictated by someone else?

Your choice. Your future. Your life.

Free like...

A favorite bon mot of open source critics is that open source software is “free like a free puppy”. Tee hee! Open source advocates should remember to keep the rejoinder handy, that proprietary software is “free like a free hit of crack”. Oracle “Express”, SQL Server “Express”, ESRI educational copies, yes I am looking at you.