Tuesday, April 08, 2008

That's Billion with a "B"

This article on scaling PostgreSQL to support Skype's operations is well worth a read for anyone running a high-end PostgreSQL installation.
PostgreSQL is used "as the main DB for most of [Skype's] business needs." Their approach is to use a traditional stored procedure interface for accessing data and on top of that layer proxy servers which hash SQL requests to a set of database servers that actually carry out queries. The result is a horizontally partitioned system that they think will scale to handle 1 billion users.

1 comment:

Jonathan said...

Paul,

The article describes how database queries are routed by a proxy across a set of database servers and that the proxy creates partitions based on a field value, typically a primary key: "For example, you could partition users across a cluster by hashing based on user name. Each user is slotted into a shard based on the hash."

This is great for an application such as Skype's that probably does partition on users, but what about large databases designed exclusively for spatial queries, where the only constraint in the WHERE clause of queries is a test for Intersects() or some other spatial relationship?

Is there some clever way, perhaps, to partition based on the bounding box of each geometry?

- Jonathan

About Me

My Photo
Victoria, British Columbia, Canada

Followers

Blog Archive

Labels

bc (31) it (26) postgis (17) icm (10) sprint (9) enterprise IT (8) open source (8) osgeo (8) video (8) management (6) cio (5) enterprise (5) foippa (5) gis (5) spatial it (5) foi (4) mapserver (4) outsourcing (4) bcesis (3) foss4g (3) oracle (3) politics (3) architecture (2) boundless (2) esri (2) idm (2) natural resources (2) ogc (2) open data (2) opengeo (2) openstudent (2) postgresql (2) rant (2) technology (2) vendor (2) web (2) 1.4.0 (1) COTS (1) HR (1) access to information (1) accounting (1) agile (1) aspen (1) benchmark (1) buffer (1) build vs buy (1) business (1) business process (1) cathedral (1) cloud (1) code (1) common sense (1) consulting (1) contracting (1) core review (1) crm (1) custom (1) data warehouse (1) deloitte (1) design (1) digital (1) email (1) essentials (1) evil (1) exadata (1) fcuk (1) fgdb (1) fme (1) foocamp (1) foss4g2007 (1) ftp (1) gds (1) geocortex (1) geometry (1) geoserver (1) google (1) google earth (1) government (1) grass (1) hp (1) iaas (1) icio (1) industry (1) innovation (1) integrated case management (1) introversion (1) iso (1) isss (1) isvalid (1) javascript (1) jts (1) lawyers (1) mapping (1) mcfd (1) microsoft (1) mysql (1) new it (1) nosql (1) opengis (1) openlayers (1) oss (1) paas (1) pirates (1) policy (1) portal (1) proprietary software (1) qgis (1) rdbms (1) recursion (1) regression (1) rfc (1) right to information (1) saas (1) salesforce (1) sardonic (1) seibel (1) sermon (1) siebel (1) snark (1) spatial (1) standards (1) svr (1) tempest (1) texas (1) tired (1) transit (1) twitter (1) udig (1) uk (1) uk gds (1) verbal culture (1) victoria (1) waterfall (1) wfs (1) where (1) with recursive (1) wkb (1)