Mass Shape File Load into PostGIS

I needed some test data for performance investigations, and had to load 235 shape files, all with an identical schema. Here’s what I did.

First, get the table schema into the database by loading a small file and then deleting the data. We delete the data so we can loop through all the files later without worrying about duplicating the rows from the initial file.

shp2pgsql -s 3005 -i -D lwssvict.shp lwss | psql mydatabase
psql -c "delete from lwss" mydatabase

Then use the shell to loop through all the shape files and append them into the table. (This is csh syntax; the “foreach?” below is just the shell’s continuation prompt.)

foreach f (*.shp)
foreach? shp2pgsql -s 3005 -i -D $f -a lwss | psql mydatabase
end
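For those not running csh, the same loop can be sketched in Python. This is a rough sketch, not anything from the original workflow: it assumes shp2pgsql and psql are on the PATH, and reuses the lwss table, SRID 3005, and the mydatabase name from above.

```python
import glob
import subprocess

def load_command(shapefile, table="lwss", srid="3005"):
    # Build the append-mode shp2pgsql command for one shape file.
    return ["shp2pgsql", "-s", srid, "-i", "-D", shapefile, "-a", table]

def load_all(database="mydatabase"):
    # Pipe each shape file's SQL into psql, appending to the table.
    for shapefile in sorted(glob.glob("*.shp")):
        loader = subprocess.Popen(load_command(shapefile), stdout=subprocess.PIPE)
        subprocess.run(["psql", database], stdin=loader.stdout, check=True)
        loader.stdout.close()
        loader.wait()
```

Calling load_all() replays the foreach loop above, one shape file at a time.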

Note the “-a” switch to tell shp2pgsql we are in append mode, rather than the default create mode. Add a spatial index, and we’re done.

psql -c "create index lwss_gix on lwss using gist (the_geom)" mydatabase

Seven hundred thousand line segments, ready to play!

psql -c "select count(*) from lwss" mydatabase

count
--------
 755373
(1 row)

Titanic Exhibit @ Royal BC Museum

For our reception at FOSS4G 2007 we have rented several galleries at the Royal BC Museum, for a stand-up reception amongst the exhibits. Should be very cool.

The Museum is currently hosting a “special traveling exhibit” of artifacts from the Titanic, but the cost of renting the Titanic gallery was exorbitant (more than the rest of the galleries put together), so we did not rent it. Thank goodness!

This weekend was rainy, so I took my daughter to the Museum on Sunday, and the price of getting in included the Titanic exhibit. So in we went. In fairness to the exhibitors, they did their best with what they had to work with. But let’s remember what those things are: artifacts just 100 years old, and pretty ordinary ones at that; the limited selection of things they could drag up from two miles under the ocean that had not been completely pulped by time.

“Wow, what an attractive chamber pot.” “Goodness, who knew they had taps back then.” “Hm, a bottle full of an unknown fluid.”

If you want a visceral experience of the Titanic… watch James Cameron’s blockbuster movie. There is also an IMAX movie with underwater shots of the wreck showing at the theater attached to the Museum, which might be a bit more impressive.

While the Titanic exhibit is garnering all the hoopla, upstairs in the First Nations gallery is where the really good stuff is (we’ve got that one rented)! 100-year-old totem poles, exquisite art, great historical presentation. And, especially good right now, they have the Dundas collection of Tsimshian art, collected 150 years ago. Really magnificent stuff, very museum-worthy and worth seeing. Sadly, the Dundas is also a traveling exhibit, and by the time FOSS4G rolls into town it will have already moved on.

REST Feature Server

Piling on this meme! Pile!

Chris Holmes has taken a stab at some of the semantics of a REST feature server, and of course, Chris Schmidt has already written one. (!!!!!)

I would like to take issue with one of Chris Holmes’ design points:

The results of a query can then just return those urls, which the client may already be caching

http://sigma.openplans.org/geoserver/major_roads?bbox=0,0,10,10

returns something like

<html>
<a href='http://sigma.openplans.org/geoserver/major_roads/5'>5</a>
<a href='http://sigma.openplans.org/geoserver/major_roads/1'>1</a>
<a href='http://sigma.openplans.org/geoserver/major_roads/3'>3</a>
<a href='http://sigma.openplans.org/geoserver/major_roads/8'>8</a>
</html>

Ouch! Resolving a query that returns 100 features would require traversing 100 URLs to pull in the resources. What if we include the features themselves in the response instead? Then we are potentially sending the client objects it already has. What is the solution?

It is already here! Tiling redux! Break the spatial plane up into squares, and assign features to every square they touch. You get nice big chunks of data that are relatively stable, so they can be cached. Each feature can also be referenced singly by URL, just as Chris suggests, but the tiled access allows you to pull more than one at a time.
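The assignment step could look something like this sketch (the grid indexing scheme and the 10-unit tile size are my own assumptions, not anything from the proposals above):

```python
import math

def tiles_touched(bbox, tile_size=10.0):
    # Return the (col, row) index of every tile square that a feature's
    # bounding box (xmin, ymin, xmax, ymax) touches.
    xmin, ymin, xmax, ymax = bbox
    cols = range(int(math.floor(xmin / tile_size)), int(math.floor(xmax / tile_size)) + 1)
    rows = range(int(math.floor(ymin / tile_size)), int(math.floor(ymax / tile_size)) + 1)
    return [(col, row) for col in cols for row in rows]

# A feature straddling a tile edge lands in both tiles.
tiles_touched((8, 2, 12, 8))  # [(0, 0), (1, 0)]
```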

What about duplication? Some features will fall in more than one tile. What about it? Given the choice between pulling 1000 features individually and removing a few edge duplicates as new tiles come in, I know which option I would choose. Because each feature in the tile includes the URL of the feature resource, it is easy to identify dupes and drop them as the tiles are loaded into the local cache.
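A minimal sketch of that cache merge, keyed on the per-feature URL (the feature dicts and the relative URLs here are hypothetical):

```python
def merge_tile(cache, tile_features):
    # Merge a newly fetched tile into the local cache, dropping any
    # feature whose URL is already present (an edge duplicate).
    for feature in tile_features:
        cache.setdefault(feature["url"], feature)
    return cache

# The same road segment arrives in two adjacent tiles; it is kept once.
tile_a = [{"url": "/major_roads/5"}, {"url": "/major_roads/1"}]
tile_b = [{"url": "/major_roads/1"}, {"url": "/major_roads/8"}]
cache = {}
merge_tile(cache, tile_a)
merge_tile(cache, tile_b)
# cache now holds three features: /major_roads/1, /major_roads/5, /major_roads/8
```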

Tiling on the brain, tiling on the brain…

Everybody Loves Metadata

Or, more precisely, everybody loves <Metadata>.

Hardly has KML entered the OGC standards process, and already folks are getting ready to standardize what goes into the anonymous <Metadata> block.

Ron Lake thinks that … wait for it … wait for it … GML would be an “ideal” encoding to use in <Metadata>.

Chris Goad at Platial thinks that we should be doing content attribution (who made this, who owns this) in <Metadata>.

Even Google is getting into the game. The explanations of how to integrate your application schema for <Metadata> extensions into the KML schema are a nice reminder of the sort of eye-glazing details that have made life so hard for GML. Doing things right is hard.

It is particularly delicious that the very thing that makes adding information to <Metadata> fiddly is the preparation of schemas: you need metadata about the metadata you are adding to <Metadata>.

Where will this all end? I think it will end with the Google Team picking one or a few <Metadata> encodings to expose in their user interfaces (Earth and Maps). At that point all content will converge rapidly on that encoding, and the flexibility of <Metadata> will be rapidly ignored.

OGC Apologies

I owe an apology to OGC, and the participants of the Mass Market working group. The post below on KML includes information about preliminary decisions that are not the policy of the OGC until they are officially approved by the higher governing bodies of the OGC. So to Mark and Carl and anyone else who has more stress to deal with today than they did yesterday because of me, a heartfelt “I am very very sorry”.