Sunday 3 October 2010

OS Vector Map District with PostGIS - Take 3

To recap:

  • The entire UK's VectorMapDistrict dataset has been loaded into my postgresql/postgis database using my vmd2pgsql script, which is based around shp2pgsql.
  • I created some stylesheets for mapnik rendering of the data based on the mapnik tutorial, but they came out with a transparent background and black lines and text, which is not what I wanted.
The rendering problem turned out to be quite easy - it is just that the mapnik tutorial is based on Mapnik 0.7, not mapnik2.  To make it work with mapnik2 you need to:
  1. Replace the bgcolor parameter of the map element with background-color.
  2. Remove all of the CSSParameter business in the style definitions with simple name="xyz" constructs.
It is a shame that mapnik did not complain about these issues, even in debug mode, but never mind!
Adding the different linear features works ok - you see something of an improvement from the black and white version to something resembling a map:
So now all I need to do is add the areas (water, woodland towns etc.) to make it look extra pretty, and sort out what to plot at different zoom levels so it is reasonable.

Here I hit a problem.   When I tried to add the settlement_area or naturalfeature_area items, mapnik crashed with an "invalid geometry" error, after taking quite a long time thinking about it!   It is something to do with polygons not being closed properly.   I suspect it is rounding errors, but am not too sure.  One possibility is that I should have used the shp2pgsql option to just use integers, which would have avoided rounding, or I could try to fix the database.  As it took 9 hours to import, and I don't know if integers will work, I'll try to fix it.

With much internet searching I discovered that you can check the validity of each geometric feature with the postgis st_isvalid() function.  Running a simple select statement on the database like:
select gid from naturalfeature_area where not st_isvalid(the_geom);
takes a long time (about an hour I think) and gives me a list of invalid geometries, after pages and pages of warnings about things crossing over themselves.
It seems that one trick that people use to force gemetries to be valid is to use the st_buffer() command with the buffer radius set to zero.   I checked it worked by doing:
select st_isvalid(st_buffer(the_geom,0.0)) from naturalfeature_area where gid=2342149;
(2342149 was the first entry in my list of invalid geometries) - Success - it returns 't' for true meaning the st_buffer trick fixes it.
Now to update the database.  To correct the error I had to do:
update naturalfeature_area set the_geom=st_multi(st_buffer(the_geom,0.0)) where gid=2342149;
Note that I do not really know what the st_multi thing does, but I got errors about geometry constraint violations without it - I think that st_buffer returns the simplest type of geometry it can, and st_multi forces it to be a multipolygon, but I could be wrong!.

Now all I need to do is remember enough about subqueries to write a bit of SQL that does the correction for every invalid geometry - back to that well known search engine....

Well, I have set it going using the following to try to repair the geometries:
update naturalfeature_area set the_geom=st_multi(st_buffer(the_geom,0.0)) where gid in (select gid from naturalfeature_area where not st_isvalid(the_geom));
I think I'll leave it for a few hours....

No comments: