[HN Gopher] Serving Dynamic Vector Tiles from PostGIS
___________________________________________________________________

Serving Dynamic Vector Tiles from PostGIS

Author : liotier
Score  : 104 points
Date   : 2020-01-02 16:36 UTC (6 hours ago)

(HTM) web link (info.crunchydata.com)
(TXT) w3m dump (info.crunchydata.com)

| crmrc114 wrote:
| I have a pal who does GIS work in the oil and gas industry - I
| think it's crazy how much influence ESRI has on that market. Would
| love to learn more about interacting with map data like this.
|
| For a non-GIS person this was a fun read. So thanks for the post!
| bransonf wrote:
| I have the same sentiment on ESRI. It's basically all I was
| taught in my university courses, but it's not what I ever want
| to use.
|
| It's crazy that a privately held company holds about a third of
| the market share.
|
| And personally, I don't think their software is that good. I
| find their documentation lacking and their solutions rigid.
|
| Case in point: the geodatabase (gdb) standard is purposefully
| meant to obfuscate the data within. No one has ever been able
| to explain to me why, even though the standard has since been
| open sourced.
|
| Not to mention the number of times I've had ArcMap crash
| without any helpful information as to why it crashed...
|
| That said, ArcMap is the Excel of GIS. It captured market share
| (especially government contracts) two or three decades ago and
| no one has disrupted the desktop GIS platform. On the web
| front, however, I see companies like Mapbox far outpacing
| anything ESRI is capable of yet.
|
| And to anyone looking to learn GIS: PostGIS, GDAL and any
| scripting language will make you more powerful than most of the
| people I know within the field.
| trynewideas wrote:
| oblig.
mention for others outside the ESRI sphere of
| influence that QGIS exists, is still FOSS, and is still actively
| developed: https://qgis.org/en/site/
| sleavey wrote:
| > And to anyone looking to learn GIS: PostGIS, GDAL and any
| scripting language will make you more powerful than most of
| the people I know within the field.
|
| Funny this article was posted today, because yesterday I was
| looking into rendering a custom map of a ~100x100 km area
| from OpenStreetMap data for a particular application. I've
| got basically no experience making maps, but I've dabbled with
| GDAL and Rasterio. I was thinking of using Mapnik with a dump
| of (part of) the OpenStreetMap database into a local PostGIS
| instance. Ideally the rendered tiles should be in a vector
| format. Do you think this approach seems reasonable, or am I
| missing a potentially simpler way?
| bransonf wrote:
| Are you trying to render a static map or make something
| interactive?
|
| If this is static, Mapnik is a good call. It has some extra
| anti-aliasing under the hood and it's exceptionally fast.
| sleavey wrote:
| Static; thanks for the info. I would ideally like to dump
| a bunch of SVG tiles for various zoom levels so I can
| store them in a static directory on my server rather than
| serve them dynamically. I take it that Mapnik is capable
| of dumps like this? Also, I would like to use the Python
| bindings, but they seem rather poorly documented. Would
| you suggest a newbie like me use the C or XML interfaces
| instead, if they are better documented?
| bransonf wrote:
| I've only used the Python library, and I think I
| survived. I only used it for some visualization, however.
|
| And I'm still unclear: are you trying to serve these
| tiles to another application? Or are you trying to make a
| digital/print map?
|
| Also, SVG is probably excessive for static tiles. It won't
| have the size reduction of a raster tile nor the benefit
| of a true vector solution.
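Dumping a fixed set of tiles for an area, as discussed above, mostly comes down to enumerating slippy-map tile coordinates for a bounding box at each zoom level. A minimal sketch of that bookkeeping using the standard OSM tile-numbering formulas (the bounding box values below are illustrative):

```python
import math

def deg2num(lat: float, lon: float, zoom: int) -> tuple:
    """Convert WGS84 lat/lon to slippy-map tile coordinates (OSM scheme)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

def tiles_for_bbox(min_lat, min_lon, max_lat, max_lon, zoom):
    """Yield (zoom, x, y) for every tile touching the bounding box."""
    x0, y1 = deg2num(min_lat, min_lon, zoom)  # south-west corner has the larger y
    x1, y0 = deg2num(max_lat, max_lon, zoom)  # north-east corner has the smaller y
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            yield zoom, x, y

# Example: tiles covering a ~100x100 km box at zoom 10
tiles = list(tiles_for_bbox(47.7, 10.9, 48.6, 12.3, 10))
```

With the list in hand, each (z, x, y) becomes one render call and one z/x/y path in the static directory; the same formulas, inverted, give each tile's lat/lon bounds.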
| sleavey wrote:
| > And I'm still unclear. Are you trying to serve these
| tiles to another application? Or are you trying to make a
| digital/print map?
|
| Serve them from local storage to a viewer application.
|
| > And SVG is probably excessive for static tiles. It
| won't have the size reduction of a raster tile nor the
| benefit of a true vector solution.
|
| Ah, is there another vector format for tiles other than
| SVG? Or are you saying that I should just generate a
| bunch of compressed rasters?
| bransonf wrote:
| Ah, I haven't really gotten into that territory with Mapnik.
| But to the second point, yes, you should just generate a
| bunch of raster tiles. And before doing this, ask
| yourself if you really need to.
|
| If this isn't a huge project, Mapbox is an easy packaged
| solution. Otherwise, there are dozens of really good tile
| providers already.
| aldoushuxley001 wrote:
| QGIS does an admirable job competing with ArcGIS.
| flippmoke wrote:
| Author of Mapbox's Vector Tile specification here, and also a
| contributor to some of the code used by PostGIS. I wanted to
| add some clarity on topics associated with Vector Tiles and
| the dynamic serving of them, which seems to be a new trend.
|
| The Vector Tile specification was designed for map visualization
| and has expanded into other uses as well, but in general the
| purpose is to be able to quickly provide a complete subset of
| data for a specific area in a highly cacheable form. Most of
| that speed and cacheability is gained by preprocessing all the
| data you will use in your map into tiles.
|
| The general steps for turning raw data into Vector Tiles are:
|
| 1. Determine a hierarchy for your data. For example, for roads,
| at some zoom levels you will want to see only highways or major
| roads, while at other zoom levels you will want all your data.
|
| 2.
For each tile at each zoom level: select your data following
| your hierarchy rules, simplify it based on the zoom level (for
| example, you might need fewer points to display your road), and
| then clip it to your tile and encode it into your Vector Tile.
|
| The problem is that doing these steps is often very complex and
| requires thought about the cartography of your final resulting
| map, and it can also drastically affect performance. If you are
| dynamically serving tiles from PostGIS, it is in some cases very
| hard to reduce large quantities of data quickly. For example,
| take a very detailed, precise coastline of a large lake that you
| want to serve dynamically. If you serve this data on demand,
| each time you need a tile you have to simplify and clip a
| potentially massive polygon. While this might work for single
| requests, at scale it quickly adds a lot of load to a PostGIS
| server. The only solution is to cache the resulting tiles for a
| longer period to limit load on your database, or to preprocess
| all your data before serving.
|
| Preprocessing all the tiles is already something other tiling
| tools such as tippecanoe are really good at, and it comes with
| the benefit of helping you determine a hierarchy for your data.
| Preprocessing might seem excessive when it means making
| potentially millions of tiles, but in general it makes your
| application faster because it is simply serving an already
| created tile.
|
| Therefore, if your data does not change very quickly, I would
| almost always suggest preprocessing over dynamic rendering of
| tiles. If you start using PostGIS to create tiles on demand
| instead of using existing tiling tools, you might spend more
| effort on maintenance than you expect.
| durkie wrote:
| Very good comment, and thanks for your work on MVT. I use
| PostGIS's MVT tools on a daily basis.
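In PostGIS terms, the per-tile steps above map onto ST_AsMVTGeom (clip and quantize a geometry into tile-local coordinates) and ST_AsMVT (encode the rows as a vector-tile protobuf), which is essentially what the linked post wires up. A minimal sketch of the dynamic approach; the `roads` table, its `geom`/`name` columns, and the bbox-only filter are illustrative assumptions, and note there is no zoom-dependent simplification here, which is exactly the per-request cost described above:

```python
# Compute the Web Mercator envelope of tile z/x/y, then parameterize a
# dynamic-tile query against a hypothetical `roads` table.
HALF_WORLD = 20037508.342789244  # EPSG:3857 half-extent in metres

def tile_envelope(z: int, x: int, y: int):
    """Return (xmin, ymin, xmax, ymax) of slippy tile z/x/y in EPSG:3857."""
    size = 2 * HALF_WORLD / 2 ** z
    xmin = -HALF_WORLD + x * size
    ymax = HALF_WORLD - y * size
    return (xmin, ymax - size, xmin + size, ymax)

TILE_SQL = """
WITH bounds AS (
    SELECT ST_MakeEnvelope(%(xmin)s, %(ymin)s, %(xmax)s, %(ymax)s, 3857) AS geom
), mvtgeom AS (
    SELECT ST_AsMVTGeom(t.geom, bounds.geom) AS geom, t.name
    FROM roads t, bounds
    WHERE t.geom && bounds.geom  -- index-assisted bounding-box filter
)
SELECT ST_AsMVT(mvtgeom.*, 'roads') FROM mvtgeom;
"""

def tile_params(z: int, x: int, y: int) -> dict:
    """Named parameters for TILE_SQL, e.g. cursor.execute(TILE_SQL, tile_params(z, x, y))."""
    xmin, ymin, xmax, ymax = tile_envelope(z, x, y)
    return {"xmin": xmin, "ymin": ymin, "xmax": xmax, "ymax": ymax}
```

The query returns a single bytea value, which a tile endpoint would serve with Content-Type application/vnd.mapbox-vector-tile.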
|
| I take an intermediate approach: my queries are sometimes too
| expensive to run dynamically, and my data change semi-
| frequently (on a daily/weekly basis), but when they do change I
| have a clear idea of which tiles are affected. So any time my
| data needs updating I can mark tiles as stale, and then a
| sidekiq job processes them and uploads them to S3. The tile
| server itself pulls from S3.
|
| This is probably not quite as fast as a dedicated tile server,
| but it's far more reliable/responsive than dynamic rendering
| and reduces load spikes on the database.
| hkchad wrote:
| So I saw this post earlier today and tried it on a dataset we
| have (fixed boundaries with some properties that change 4x/hr).
| We use the values of the properties for styling the vector
| tiles. Currently the tiles are re-rendered every 4 hrs (even
| though the data is updated every 15 min) using tippecanoe,
| served by tileserver-gl and cached in CloudFront. So I wanted a
| way to get new data to users faster. But as you have noted, the
| dynamic process Crunchy posted IS SLOW: it takes about 3
| minutes to paint the world on my brand new MacBook Pro (about 3
| seconds with pre-rendered tiles). Given that the country
| boundaries do not change very often, is there a way to change
| just the properties that actually need updating in the already
| rendered vector tiles? Our pipeline takes about 45 min to run
| completely to regenerate the new tiles with updated properties.
| Or is there a better way to present this data? We started out
| with GeoJSON directly from the DB, but the files were huge; the
| vector tiles are 30% the size of the GeoJSON. We were in the MTS
| private beta, but they didn't have the 'update' process worked
| out yet, so it was a full refresh each time.
| anonymousCmntr wrote:
| I use this project for serving vector tiles with PostGIS and
| Django REST.
|
| https://github.com/corteva/djangorestframework-mvt
| jokoon wrote:
| I wish I could learn to build my own tiles, vector or PNG. I
| don't really understand where the data comes from, or how it
| is gathered and assembled.
|
| I'm also really curious about the choices involving zoom
| levels: how do you decide what to render depending on the zoom
| level, and when is data discarded, to balance good detail
| against better performance and lighter tiles? I would really
| like to try to build lighter maps so I can have my own mapping
| software on a desktop machine.
|
| The data sizes and hardware requirements involved are generally
| pretty big. It could be interesting to see how much detail one
| could achieve in a "portable" map browser when limiting the
| data size to 2GB, 5GB or 10GB.
|
| I would also really like to ask why, on some mapping software,
| you can't see the names of large enough cities/places/regions
| that are obviously there. It often makes it difficult to browse
| maps properly.
| bransonf wrote:
| I'll try my best to explain the process.
|
| The data comes from places like the Census Bureau (roads, place
| names), and then a lot of it has to be collected by the likes
| of OpenStreetMap/Google/other providers. (GIS data is big
| business.)
|
| For vector-based approaches (see Mapbox), these data are stored
| in purpose-built databases, and usually simplified geometries
| are served to the browser. The benefit is continuous zoom, but
| the pitfall is more server-side computation and hence cost.
|
| Because of the cost/compute, raster tiles (PNG, JPG, any pixel
| format) have been much more popular. These start the same: you
| collect all these data and put them in a database. The
| difference is the added step of rendering tiles. This one-off
| computation saves you work from then on. See maps.stamen.com
| for an example of tiles made from OSM data.
|
| And you're right about place names sometimes not being
| apparent.
This is a trade-off when using open data and auto-generated
| tiles. With something like Mapbox's vector tiles, you have
| control of things like labels down to individual decimal zoom
| levels. And the zoom level is another computational trade-off:
| you start at 0 and define an arbitrary end, and the higher the
| number, the computation/data increases fourfold with each
| level: O(4^n).
|
| As for why the size requirements are so big: geospatial data is
| big. For vectors, you have to record information on every
| point, which depending on quality can be a ton. And for
| rasters, we're talking trillions of pixels, really. That's why
| all of this is server-side.
|
| And lastly, to your point about lightweight desktop software:
| tiles don't really have a place in the data process. They're
| only really useful for the visualization aspect. And frankly, I
| think we're reaching the capacity of the technique; we just
| might have some headroom left in server efficiency.
| kylebarron wrote:
| > The benefit is continuous zoom, but the pitfall is more
| server side computation and hence cost.
|
| I haven't tested this with dynamic tiles served from PostGIS,
| but with static tiles served from S3 it's quite the opposite!
| There's an initial cost to generating tiles, but once they're
| generated, you can host them on S3 with zero server cost.
| snodnipper wrote:
| +1
|
| > And lastly to your point about lightweight desktop
| software, tiles don't really have a place in the data
| process. They're only really useful for the visualization
| aspect. And frankly, I think we're reaching the capacity of
| the technique, we just might have some headroom in server
| efficiency.
|
| Not totally sure what you mean by your last point... data can
| be feature-centric (e.g. stored by feature id) or area-
| centric (stored by area location) etc. Storing data by
| location is important far beyond visualisation and is
| abstracted in databases such as PostGIS/Postgres (a branded
| data structure).
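The fourfold growth mentioned up-thread is easy to put numbers on: a full tile pyramid down to zoom z holds (4^(z+1) - 1) / 3 tiles, which is why "potentially millions of tiles" shows up around zoom 10-12. A quick back-of-envelope sketch (the 30 kB average tile size is an illustrative assumption, not a measured figure):

```python
def tiles_at_zoom(z: int) -> int:
    # Each zoom level quadruples the worldwide tile count: 1, 4, 16, ...
    return 4 ** z

def pyramid_tiles(max_zoom: int) -> int:
    # Geometric series: sum of 4^z for z = 0..max_zoom
    return (4 ** (max_zoom + 1) - 1) // 3

def pyramid_gib(max_zoom: int, avg_tile_bytes: int) -> float:
    # Storage estimate for a full worldwide pyramid
    return pyramid_tiles(max_zoom) * avg_tile_bytes / 2 ** 30

for z in (10, 12, 14):
    print(z, pyramid_tiles(z), round(pyramid_gib(z, 30_000), 1))
```

In practice nobody pre-renders the whole world at high zoom with uniform tile sizes; sparse areas produce tiny or empty tiles, which is what keeps real pyramids far below this worst case.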
That said, I acknowledge that ArcGIS Pro,
| QGIS etc. have limited support for tiled data, but of course
| that is changing. Safe funded much (all?) of the OGR MVT
| development afaik.
| bransonf wrote:
| Oh, I meant more about data analysis. Typically you don't
| import raster tiles unless we're talking about imagery.
|
| But for things like roads and boundaries, you should always
| work with the raw vectors.
| kylebarron wrote:
| This isn't _necessarily_ true; doing data analysis on
| vector tiles allows for high parallelization. See
| TileReduce [0]
|
| [0]: https://github.com/mapbox/tile-reduce
| [deleted]
| sp332 wrote:
| This article from 2010 goes into a lot of detail about which
| labels to show and how to place them.
| https://news.ycombinator.com/item?id=1963612
| sleavey wrote:
| I think many systems are backed by a PostGIS database (an
| extension for Postgres) with features and their coordinates.
| Map zoom levels define which features should be visible as
| layers on the map. A rendering frontend then grabs relevant
| data for the layer being viewed and builds the tiles.
| trynewideas wrote:
| > I wish I could learn to build my own tiles, vector or PNG. I
| don't really understand where the data comes from, how is data
| gathered and assembled.
|
| The tools are rapidly evolving. There's no great single entry
| point, and the best advice I can give is pretty generic: find a
| small-scale thing you want to do and do research toward
| accomplishing it.
|
| The post you're commenting on is about how PostGIS databases
| mostly do this work for vector tiles on their own now, so "to
| build your own tiles", you'd set up a PostGIS database and re-
| read this post. A year or two ago the advice would have been
| pretty different. A year or two from now, the advice will be
| _completely_ different.
|
| That said, starting from zero, http://geojson.io is a dead
| straightforward way to do basic operations with GeoJSON data.
| You can paste in JSON and it renders on the map; you can draw
| on the map and it generates GeoJSON. (https://tilejson.io does
| the same for raster tile sets.)
|
| Real-world data is massive and overwhelming to work with --
| just drawing your own fake maps in geojson.io and working with
| those might make some of the concepts easier to digest.
|
| Maperitive[1] is a free and relatively straightforward app
| focused on taking geo data as input and outputting maps. Work
| with its rendering rules and you'll understand some of the
| challenges of rendering at different zoom levels or in
| different contexts.
|
| Then this post from 2018[2] on Tippecanoe (tile and data
| converter), TileServer GL (tile server), and Leaflet
| (JavaScript front end to view served tiles) covers how to
| round-trip a package of vector tiles to GeoJSON data and back.
| It's straightforward, works with a relatively small area of
| data, doesn't require GIS experience, and though outdated it's
| still relevant for understanding, by practice, how a data-to-
| tiles pipeline can work.
|
| Raster tiles are a little difficult to recommend learning, as
| tooling has mostly moved on from them in favor of vector tiles,
| which pack more information and flexibility into less data, and
| I honestly don't know what tools still reliably do that work --
| once upon a time I used TileMill, but it was already abandoned
| by then and has been very lightly maintained since.
|
| Re: optimization, here's another, more advanced post[3] using
| real-world data that illustrates some of the challenges.
|
| The end game is to get to a point where you can open something
| like QGIS[4], a heavyweight tool that can do all of the above
| and way too much more, or Maputnik[5], a vector tile styling
| tool using a CSS-ish language, and not get immediately lost.
|
| > I would really like to ask why, on some mapping software, you
| can't see names of large enough cities/places/regions that are
| obviously there.
|
| You won't get a great answer to that question, I'm afraid.
| It's dependent on, and configured in, whatever the front end
| is; generally done algorithmically, and in some cases manually
| edited. An art as much as a science, and as fallible as both
| combined. (See Justin O'Beirne's incredible reviews of Apple
| Maps updates[6] for an example.)
|
| No single labeling strategy will make anyone (much less
| everyone) happy, and most end-user tools don't expose
| customizability.
|
| 1: http://maperitive.net/docs/TwoMinutesIntro.html
|
| 2: https://medium.com/@kennethchambers/using-tippecanoe-tileser...
|
| 3: https://medium.com/@ibesora/a-data-driven-journey-through-ve...
|
| 4: https://qgis.org/en/site/
|
| 5: https://maputnik.github.io
|
| 6: https://www.justinobeirne.com/new-apple-maps
| grizzles wrote:
| > A year or two from now and the advice will be _completely_
| different.
|
| Could you elaborate on that? Sounds interesting.
| snodnipper wrote:
| Huge topic.
|
| > I wish I could learn to build my own tiles, vector or PNG. I
| don't really understand where the data comes from, how is data
| gathered and assembled.
|
| There are many data providers out there. You might be
| interested in OpenMapTiles, which is a pipeline from
| OpenStreetMap (OSM) data:
| https://github.com/openmaptiles/openmaptiles
|
| Also check out Maputnik: http://maputnik.github.io/editor/
|
| If you want to learn about "tiling schemes", then head over to
| https://www.maptiler.com/google-maps-coordinates-tile-bounds...
|
| > I'm also really curious about the choices involving the zoom
| level, how do you decide to render things depending on the zoom
| level, when is data discarded, to have good detail or better
| performance and lighter tiles. I would really be willing to try
| build lighter maps so I can have my own mapping software on a
| desktop machine.
|
| Lots of different considerations - is a human going to look at
| the map?
If so, then a cartographer will determine what is going
| to be shown at a given scale. There are other constraints too,
| such as limited space to show data, and also hidden constraints
| such as the maximum amount of data per region (e.g. ~500kb
| per tile in the case of Mapbox vector tiles).
|
| > The data sizes and hardware requirements involved are
| generally pretty big. It could be interesting to see how much
| details one could achieve to make a "portable" map browser when
| limiting the data size to 2GB, 5GB or 10GB.
|
| Lots of projects out there are doing impressive things here.
| Quadtree tiles get you so far... k-d trees might yield other
| useful properties. Skobbler has some pretty impressive data
| compression technology (~12GiB for global coverage, routable
| and searchable... with some limitations - skobbler.com/apps). Of
| course the trick is to discard all that you don't need.
|
| > I would really like to ask why, on some mapping software, you
| can't see names of large enough cities/places/regions that are
| obviously there. It often makes it difficult to browse maps
| properly.
|
| If there is a limited budget, then the effort to create
| appropriate labels is limited. Data sources can be limited or
| incomplete... there can be nuances between jurisdictions, etc.,
| and of course label prioritisation has been a longstanding
| problem. What happens when you rotate a map and the text labels
| collide with one another... which ones do you keep, which do you
| discard, etc.? These things are also context-dependent... why
| not include continent names? Or region names? Or province
| names? What about the difference between physical and political
| geography? A cartographer can help ensure that the right
| information is available at the right time... whilst
| acknowledging that they have to tell little white lies in every
| map they make.
___________________________________________________________________
(page generated 2020-01-02 23:00 UTC)