PostGIS Back-up / Restore

A very common question on PostGIS in production is “how do I upgrade”, which is actually a variant on “how do I backup and restore”?

First, for patch version increases (e.g. X.Y.Z -> X.Y.(Z+1)) in PostgreSQL and PostGIS you do not need to do anything at all other than install the new software. The data can remain in place and everything will Just Work.

For minor version increases in PostgreSQL (e.g. X.Y.Z -> X.(Y+1).Z) you need to dump and restore. For minor version increases in PostGIS, you need to do a “soft upgrade”, which means leaving the data in place, but running the upgrade scripts (e.g. postgis_upgrade_14_to_15.sql) after you install the software update.

Finally, for major version increases in PostgreSQL and PostGIS (e.g. X.Y.Z -> (X+1).Y.Z) you need to dump and restore.

Which brings me to the actual point I want to make: you can ensure the greatest ease in doing dump and restore of PostGIS data if you ensure that you store no data in the “public” schema.

The “public” schema is where the PostGIS functions and system tables get installed, so if you dump that schema you get all those definitions in your dump. If those definitions are mixed in amongst your data, loading them into a fresh database gets tricky: are the paths to the libraries the same? are there function name clashes? (The utils/ script attempts manfully to strip out PostGIS components from a dump file to allow a clean restore, but it is hard to get 100% performance.)

If, on the other hand, all your data is neatly separated into its own schema, you can neatly backup just that schema and avoid having PostGIS system information mixed in with your data. That means you can easily restore your data into any version of PostGIS and PostgreSQL that you like. So upgrades are easy easy easy.

Remember: Store your data in a schema other than “public”.

“But Paul,” you say, “I already have a full database dump, does that mean I’m SOL?” No, but you will need a strong stomach. First, set up your new PostgreSQL. Create a blank database, load PostGIS into it. Now, load your backup file into that database. You will see lots of errors. However, these errors will be caused by old PostGIS function and type definitions from your dump file conflicting with the existing type definitions in your database. And since you want the new definitions, not the old ones, that’s OK. Your load, despite all the noise and errors, should actually work. Once it’s done, you can move your data into a nice separate schema, so that next time you can do a clean, error-free restore.

WPS Smackdown

I was a little surprised last week when Jody Garnett told me that the most contentious session of FOSS4G turned out to be the “WPS compliance testing” talk. Really? “Really”, Jody insisted, “Voices were raised.”

Over web processing service (WPS) compliance? Well, we are all knee-jerk nerds, so I guess I shouldn’t be surprised. The results were presented by just one WPS project, the 52North one, but tested a broad swathe of them. I think the lesson to be learned is that you should be careful testing other people’s software, and even more careful when brandishing the results of those tests.

The WMS benchmarking exercise, where each team runs their own service, seems like a better model, though demonstrably hard to coordinate in its own right.

The ZooWPS team has now published their response, so if you are interested in WPS, here’s a chance to dig deep into XML response payloads. Yum.

PostGIS 1.5.2rc1

The PostGIS web site is temporarily down, but the project team is pushing ahead (belatedly) with the 1.5.2 release series in the meanwhile. The reason we are not waiting is the same reason we are belated: 1.5.2 is required to run with PostgreSQL 9.0, which was freshly released recently (we should have released 1.5.2 a couple months ago, really).

You can get the 1.5.2rc1 source code from the PostGIS issue tracking page and wiki:

Please test and post any issues as tickets, we are aiming to drop a final release pretty quickly. Try to confine yourself to posting deal-breaker tickets only.

ST_Intersects and ST_Buffer: No

If you find yourself writing a query like this:

... WHERE ST_Intersects(ST_Buffer(g1, r), g2)

Stop. Take a cleansing breath. Do this:

... WHERE ST_DWithin(g1, g2, r)

With the carbon emissions you save doing it the efficient way, you can afford to drive to the ice cream store for a well-deserved reward.

Calling all Technoweenies...

Did you know that OpenGeo is hiring a software engineer? It’s true. Let’s build something beautiful together.