Saturday, August 16, 2008

Valgrinding PostGIS

So, you want to be a super-hero? How about tracking down memory leaks and otherwise using valgrind to make PostGIS a cleaner and better system? However, getting PostGIS into valgrind can be a bit tricky.

First of all, what is valgrind? It's a tool for finding memory leaks and other memory issues in C/C++ code. It only runs under Linux, so you do need to have sufficiently portable code to run it there. Many memory checking tools rely on "static code analysis", basically looks at what your code says it does and seeing if you have made any mistakes. These kinds of tools have to be very clever, since they not only need to understand the language, they have to understand the structure of your code. Valgrind takes the opposite tack – rather than inspect your code for what it says it does, it runs your code inside an emulator, and sees what it actually does. Running inside valgrind, every memory allocation and deallocation can be tracked and associated with a particular code block, making valgrind a very effective memory debugger.

In order to get the most useful reports, you have to compile your code with minimal optimization flags, and with debugging turned on. To grind both GEOS and PostGIS simultaneously, compile GEOS and PostGIS with the correct flags:
Make GEOS:
CXXFLAGS="-g -O0" ./configure
make clean
make
make install

Make PostGIS:
CFLAGS="-g -O0" ./configure --with-pgconfig=/usr/local/pgsql/bin/pg_config
make clean
make
make install

Once you have your debug-enabled code in place, you are ready to run valgrind. Here things get interesting! Usually, PostgreSQL is run in multi-user mode, with each back-end process spawned automatically as connections come in. But, in order to use valgrind, we have to run our process inside the valgrind emulator. How to do this?

Fortunately, PostgreSQL supports a "single user mode". Shut down your database instance (pg_ctl -D /path/to/database stop) first. Then invoke a postgres backend in single-user mode:
echo "select * from a, b where st_intersects(a.the_geom, b.the_geom)" | valgrind --leak-check=yes --log-file=valgrindlog /usr/local/pgsql/bin/postgres -D /usr/local/pgdata --single postgis
So, here we are echoing the SQL statement we want tested, and piping it into a valgrind-wrapped instance of single-user PostgreSQL. Everything will run much slower than usual, and valgrind will output a report to the valgrindlog file detailing where memory blocks are orphaned by the code.
 

2 comments:

Regina Obe said...

Nit pick,
Shouldn't that be st_intersects or was that intentional to stress test valgrind.

Found any interesting leaks.

Paul Ramsey said...

Nit picked. Lots of interesting leaks, but the most important one is the gusher in the prepared geometry

About Me

My Photo
Victoria, British Columbia, Canada

Followers

Blog Archive

Labels

bc (35) it (27) postgis (19) icm (11) enterprise IT (10) video (10) sprint (9) open source (8) osgeo (8) cio (6) management (6) enterprise (5) foippa (5) foss4g (5) gis (5) spatial it (5) foi (4) mapserver (4) outsourcing (4) politics (4) bcesis (3) oracle (3) COTS (2) architecture (2) boundless (2) esri (2) idm (2) natural resources (2) ogc (2) open data (2) opengeo (2) openstudent (2) postgresql (2) rant (2) technology (2) vendor (2) web (2) 1.4.0 (1) HR (1) access to information (1) accounting (1) agile (1) aspen (1) benchmark (1) buffer (1) build vs buy (1) business (1) business process (1) cathedral (1) cloud (1) code (1) common sense (1) consulting (1) contracting (1) core review (1) crm (1) custom (1) data warehouse (1) deloitte (1) design (1) digital (1) email (1) essentials (1) evil (1) exadata (1) fcuk (1) fgdb (1) fme (1) foocamp (1) foss4g2007 (1) ftp (1) gds (1) geocortex (1) geometry (1) geoserver (1) google (1) google earth (1) government (1) grass (1) hp (1) iaas (1) icio (1) industry (1) innovation (1) integrated case management (1) introversion (1) iso (1) isss (1) isvalid (1) javascript (1) jts (1) lawyers (1) mapping (1) mcfd (1) microsoft (1) mysql (1) new it (1) nosql (1) opengis (1) openlayers (1) oss (1) paas (1) pirates (1) policy (1) portal (1) proprietary software (1) qgis (1) rdbms (1) recursion (1) redistribution (1) regression (1) rfc (1) right to information (1) saas (1) salesforce (1) sardonic (1) seibel (1) sermon (1) siebel (1) snark (1) spatial (1) standards (1) svr (1) tempest (1) texas (1) tired (1) transit (1) twitter (1) udig (1) uk (1) uk gds (1) verbal culture (1) victoria (1) waterfall (1) wfs (1) where (1) with recursive (1) wkb (1)