Tuesday, April 08, 2008

That's Billion with a "B"

This article on scaling PostgreSQL to support Skype's operations is well worth a read for anyone running a high-end PostgreSQL installation.
PostgreSQL is used "as the main DB for most of [Skype's] business needs." Their approach is to access data through a traditional stored procedure interface and, on top of that, to layer proxy servers that hash SQL requests across a set of database servers, which actually carry out the queries. The result is a horizontally partitioned system that they think will scale to handle 1 billion users.
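
To make the routing idea concrete, here is a minimal sketch in Python of how a proxy might hash a partition key to pick a database server. It is an illustration only, not Skype's implementation: the shard names, the choice of MD5, and the function names `shard_for` and `route` are all assumptions made for the example.

```python
import hashlib

# Hypothetical shard names; the article does not list the real servers.
SHARDS = ["db0", "db1", "db2", "db3"]


def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the partition key (MD5 is an assumed choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer bucket.
    bucket = int.from_bytes(digest[:4], "big") % len(shards)
    return shards[bucket]


def route(proc_name, key):
    """Return (shard, call text) for a stored-procedure call.

    Nothing is executed here; a real proxy would forward the call
    to the chosen server and pass the key as a bound parameter.
    """
    return shard_for(key), "SELECT * FROM {}(%s)".format(proc_name)
```

The same key always lands on the same shard, so a lookup only has to touch one server. The trade-off with a plain modulo is that changing the number of shards remaps most keys.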

1 comment:

Jonathan said...


The article describes how a proxy routes database queries across a set of database servers, partitioning on a field value, typically a primary key: "For example, you could partition users across a cluster by hashing based on user name. Each user is slotted into a shard based on the hash."

This is great for an application such as Skype's that probably does partition on users, but what about large databases designed exclusively for spatial queries, where the only constraint in the WHERE clause of queries is a test for Intersects() or some other spatial relationship?

Is there some clever way, perhaps, to partition based on the bounding box of each geometry?

- Jonathan
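
One possible way to approach the bounding-box question is a fixed grid: store each geometry on every shard that owns a grid cell its bounding box touches, and send a query only to the shards whose cells the query box touches. The Python sketch below is an assumption-laden illustration, not something the article describes; the cell size, the shard count, the hash constants, and all function names are hypothetical.

```python
import math

CELL = 10.0      # hypothetical grid cell size, in coordinate units
NUM_SHARDS = 4   # hypothetical number of database servers


def cells_for_bbox(minx, miny, maxx, maxy, cell=CELL):
    """Return the set of grid cells (ix, iy) that a bounding box touches."""
    x0, x1 = math.floor(minx / cell), math.floor(maxx / cell)
    y0, y1 = math.floor(miny / cell), math.floor(maxy / cell)
    return {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}


def shard_for_cell(cell_xy, num_shards=NUM_SHARDS):
    """Map a grid cell to a shard number with a simple deterministic hash."""
    x, y = cell_xy
    return ((x * 73856093) ^ (y * 19349663)) % num_shards


def shards_for_bbox(minx, miny, maxx, maxy):
    """Shards that must store (or be queried for) this bounding box."""
    return {shard_for_cell(c) for c in cells_for_bbox(minx, miny, maxx, maxy)}
```

Two boxes that intersect always share at least one grid cell, so the shard owning that cell both holds the geometry and receives the query. The costs are that a large geometry is stored on several shards, results need de-duplicating by ID, and each shard still has to run the exact Intersects() test, since the grid only narrows down the candidates.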
