Tuesday, April 28, 2009

OpenStreetMap moves to PostgreSQL

Recently, the OpenStreetMap project put out a very successful call for donations to upgrade their physical database infrastructure, from a dual-core Athlon with 8Gb of RAM and lots (~1Tb) of disk, to a quad-core Xeon with 32Gb of RAM and heaps (4Tb) of (15K RPM) disk.

The speedy success of the hardware appeal (target reached in less than three days) was pretty impressive, but what really perked my (PostgreSQL fanboi) ears up was the news that the new hardware was going to run PostgreSQL, instead of the MySQL database OSM has used from the start. As of April 19, OSM is running their new API live on PostgreSQL.

So, why has OSM abandoned the worlds most popular open source database? I asked the OSM folks, and this is what Tom Hughes of OSM told me:
Personally I've been very frustrated with MySQL from when I first got involved with running things. Some of the problem was of our own making in that we had a mix of MyISAM and InnoDB tables (originally everything was in MyISAM) and some tables were using MyISAM features that meant they couldn't be easily moved to InnoDB.

On top of that it seemed that virtually any non-trivial query would completely defeat MySQL's optimiser.
The comment about a mix of tables really hits home, since so many MySQL features are split across table types. Want transactions? InnoDB! Want full text search? MyISAM! Want spatial? MyISAM! Want spatial or full-text and transactions? Tough. The devil is in the details. When asked: does MySQL support spatial, transactions, full-text? the MySQL answer is "yes", "yes", "yes", but the reality in production is not nearly so clear-cut.

Note that OSM is not using PostGIS for the main database at this time (their current data model of nodes and ways wouldn't get much leverage from it) but it is used for other processes like OSM tile generation. And a growing number of people on the PostGIS users list seem to be using osm2pgsql to extract data from the OSM production server for rendering / analysis in PostGIS.

So, welcome OSM, to the PostgreSQL community!


נמרוד כנען said...

News like this make it ease for us to convince for the use of OS tools

Peter said...

>> (their current data model of nodes and ways wouldn't get much leverage from it)

Is there a way that PostGIS could provide the features they need? Through pgRouting. Or by trying to raise some funds to go on with topology in PostGIS?



Paul Ramsey said...


Building a working OSM-like model involves building a good deal of model smarts into your middle-ware, which takes a lot of effort, and once you've done that it becomes hard to convince you that there are benefits in migrating that smarts down into the database tier. In part because most of the benefits are avoiding work (that you have already expended). They have a working production system, which is golden. Now that they have PostgreSQL as the foundation, they have the option of experimenting with full spatial and perhaps they'll get there eventually.

About Me

My Photo
Victoria, British Columbia, Canada

Blog Archive


bc (44) it (35) postgis (24) video (15) enterprise IT (11) icm (11) gis (9) sprint (9) foi (8) open source (8) osgeo (8) enterprise (7) cio (6) foippa (6) foss4g (6) management (6) politics (6) spatial it (6) outsourcing (5) mapserver (4) bcesis (3) boundless (3) email (3) opengeo (3) oracle (3) rant (3) COTS (2) architecture (2) cartodb (2) deloitte (2) esri (2) hp (2) idm (2) javascript (2) natural resources (2) ogc (2) open data (2) openstudent (2) oss (2) postgresql (2) technology (2) vendor (2) web (2) 1.4.0 (1) HR (1) access to information (1) accounting (1) agile (1) archive (1) aspen (1) bcpoli (1) benchmark (1) buffer (1) build vs buy (1) business (1) business process (1) c (1) career (1) cathedral (1) client (1) cloud (1) code (1) common sense (1) consulting (1) contracting (1) core review (1) crm (1) crockofshit (1) cunit (1) custom (1) data science (1) data warehouse (1) design (1) development (1) digital (1) environment (1) essentials (1) evil (1) exadata (1) fcuk (1) fgdb (1) fme (1) foocamp (1) foss4g2007 (1) ftp (1) gdal (1) gds (1) geocortex (1) geometry (1) geoserver (1) geotiff (1) google (1) google earth (1) government (1) grass (1) hadoop (1) iaas (1) icio (1) imagery (1) industry (1) innovation (1) integrated case management (1) introversion (1) iso (1) isss (1) isvalid (1) jpeg (1) jts (1) lawyers (1) mapping (1) mcfd (1) media (1) microsoft (1) money (1) mysql (1) new it (1) nosql (1) nrs transformation (1) oipc (1) opengis (1) openlayers (1) paas (1) pgconfsv (1) pirates (1) policy (1) portal (1) proprietary software (1) public accounts (1) qgis (1) r (1) rdbms (1) recursion (1) redistribution (1) regression (1) rfc (1) right to information (1) saas (1) salesforce (1) sardonic (1) scandal (1) seibel (1) sermon (1) server (1) siebel (1) snark (1) spatial (1) standards (1) statistics (1) svr (1) taxi (1) tempest (1) texas (1) tired (1) transit (1) tripledelete (1) twitter (1) uber (1) udig (1) uk (1) uk gds (1) verbal culture (1) victoria (1) waterfall (1) wfs (1) where (1) with recursive (1) wkb (1)