Because MySQL has so few functions that actually do anything (see the note here) there's not a great deal to test beyond raw performance. The early throughput results seem to indicate it's comparable for simple CRUD on one table, but for anything non-trivial it falls down.
Here's a basic spatial join: pull 23 roads from a 3.4M row line table and spatially join to a 66K row tract polygons table, calculating the sum of the areas of tract polygons found. There are spatial indexes on both tables.
mysql> select sum(area(t.geom))
from tiger_roads_texas r, tiger_tracts t
where
mbrintersects(r.geom, GeomFromText('LINESTRING(453084 -1650742,452384 -1650442)'))
and
mbrintersects(r.geom,t.geom);
+-------------------+
| sum(area(t.geom)) |
+-------------------+
| 1260394420.00453 |
+-------------------+
1 row in set (9.43 sec)
And in PostGIS:
tiger=# select sum(area(t.geom))
from tiger_roads_texas r, tiger_tracts t
where r.geom && GeomFromText('LINESTRING(453084 -1650742,452384 -1650442)',2163)
and r.geom && t.geom;
sum
------------------
1260394420.00684
(1 row)
Time: 5.574 ms
Those are both "hot cache" results, after running them a couple times each.

4 comments:
Did you do an explain on both to verify both are using the spatial indexes?
Did you run ANALYZE TABLE on the MySQL table and vacuum analyze on the postgresql one?
No change in time after analyzing everything. Explain on MySQL I can't read, but here it is:
mysql> explain select sum(area(t.geom)) from tiger_roads_texas r, tiger_tracts t where mbrintersects(r.geom, GeomFromText('LINESTRING(453084 -1650742,452384 -1650442)')) and mbrintersects(r.geom,t.geom);
+----+-------------+-------+-------+----------------------------+------+---------+------+-------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+----------------------------+------+---------+------+-------+--------------------------------+
| 1 | SIMPLE | r | range | geom,tiger_roads_texas_gix | geom | 34 | NULL | 1 | Using where |
| 1 | SIMPLE | t | ALL | geom,tiger_tracts_gix | NULL | NULL | NULL | 66437 | Using where; Using join buffer |
+----+-------------+-------+-------+----------------------------+------+---------+------+-------+--------------------------------+
2 rows in set (0.02 sec)
It's certainly a bad plan, but that's the point innit? :)
Yap from the plan looks like its using the spatial index on tiger_roads and a table scan on tiger_tracts.
Just wanted to be fair. From experience MySQL does poorly at optimizing any query that is marginally complicated.
Though 6.0 (or is it 5.3 I forget) is supposed to be better. Which version are you running?
I meant to say 5.4 -- but doesn't look like the planner optimization changes are mentioned in 5.4 release notes so assume its still the same sucky planner
http://dev.mysql.com/doc/mysql-5.4-features/en/index.html
I think its in 6.0
http://dev.mysql.com/doc/refman/6.0/en/mysql-nutshell.html
See the part about
"Optimizer enhancements for faster subqueries and joins, including batched index access of table rows for sequences of disjoint ranges by the MyISAM and InnoDB storage engines. "
Ooh and I read somewhere that Foreign keys might actually do something in MyISAM in 6.0. What a brilliant idea. Though can't remember where I read that.
Post a Comment