# Server-side setup notes

Note: this was deployed to maps.kde.org in November 2020.

## Overview

The outside interface for this is the standard [Slippy map](https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames) tile naming scheme,
under `/earth/vectorosm/v1/`. Zoom levels 1, 3, 5, 7, 9, 11, 13, 15 and 17 are offered. The individual tiles are served
as [o5m](https://wiki.openstreetmap.org/wiki/O5m) encoded files.

Internally this is split into the following parts:
* Statically generated low-resolution tiles for zoom levels 1, 3, 5, 7 and 9, based on the [Natural Earth](https://www.naturalearthdata.com/)
  data set. Those tiles exist in o5m format on disk in the exact layout in which they are served by the webserver.
* Dynamically generated high-resolution tiles for zoom levels 11, 13, 15 and 17. Creation and expiry of those is managed
  by Tirex, and they are stored in the [metatile](https://wiki.openstreetmap.org/wiki/Tirex/Internals#Metatile_file_structure)
  format (8x8 tiles in a single binary file in a 5 layer hashed folder structure). mod_tile takes care of translating that
  to the outside interface.
* Input data for the dynamic generation: This is provided via an [OSMX](https://github.com/protomaps/OSMExpress)
  database, which allows for fast spatial queries and efficient incremental updates.
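
To illustrate the "8x8 tiles in a 5 layer hashed folder structure" layout, the following is a minimal sketch of how a tile coordinate could be mapped to its metatile file. It assumes the hashing scheme commonly described for mod_tile and Tirex (snap to the 8x8 grid, then combine 4 bits of x and 4 bits of y per directory level); the function name `meta_path` is hypothetical, and the resulting path is relative to the map's cache directory, which is also an assumption.

```shell
#!/bin/sh
# meta_path X Y Z: print the metatile path for tile X/Y at zoom Z.
# Hypothetical helper; assumes 8x8 metatiles and mod_tile-style hashing.
meta_path() (
    # Snap to the top-left tile of the enclosing 8x8 metatile.
    x=$(( $1 & ~7 ))
    y=$(( $2 & ~7 ))
    z=$3
    path=""
    # Five levels, each built from the next 4 bits of x and 4 bits of y.
    # The least significant bits end up in the file name.
    for level in 1 2 3 4 5; do
        h=$(( ((x & 15) << 4) | (y & 15) ))
        if [ -z "$path" ]; then
            path="$h.meta"
        else
            path="$h/$path"
        fi
        x=$(( x >> 4 ))
        y=$(( y >> 4 ))
    done
    echo "$z/$path"
)
```

All 64 tiles of one metatile map to the same file, so `meta_path 1096 672 11` and `meta_path 1103 679 11` print the same path.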

## Dependencies

The following components are assumed to be on the server:
* Apache2
* mod_tile - https://wiki.openstreetmap.org/wiki/Mod_tile
* Tirex - https://wiki.openstreetmap.org/wiki/Tirex
* osmx and osmx-update - https://github.com/protomaps/OSMExpress (a static binary of osmx is available there, osmx-update is a Python script)
* marble-vectorosm-tirex-backend

The following components are needed for the static/low-z tile generation and can be run on a different machine:
* Python 3
* osmctools - https://gitlab.com/osm-c-tools/osmctools
* ogr2ogr from gdal (?)
* ne_tilegenerator.py
* marble-vectorosm-tilecreator
* marble-vectorosm-process-land-polygons

Precompiled packages:
* mod_tile: PPA by OSM admin team: https://launchpad.net/~osmadmins/+archive/ubuntu/ppa
* Tirex: PPA by the author: https://launchpad.net/~framm/+archive/ubuntu/tirex - unfortunately only for Ubuntu 18.04.
  To work around this, Debian packages can be built by running `make` in the `build` sub-directory of this folder as well.

## Setup

See configuration files in the etc/ subdir.

### Static low-z tile generation

Run ne_tilegenerator.py from ../natural-earth-vector-tiling. For this to work, all its dependencies (including marble-vectorosm-tilecreator)
have to be in `PATH`.

```
mkdir -p /k/osm/htdocs/earth/vectorosm/v1/
mkdir -p /k/osm/cache/natural_earth
./ne_tilegenerator.py -z 1,3,5,7,9 -f `pwd`/level_info.txt -o /k/osm/htdocs/earth/vectorosm/v1/ -i /k/osm/cache/natural_earth/ -c /k/osm/cache/natural_earth/ -r 30 -ow
```

TODO: this still generates files in its source dir, so probably this is better run inside the cache directory instead?

The source data updates infrequently, so a low-frequency cron job is an option.
This can also be done locally or otherwise
off the live system; the resulting amount of data is small enough to be rsync'ed.

### Dynamic high-z tile generation

Prepare the land polygon input data by running:
`marble-vectorosm-process-land-polygons -c /k/osm/cache`

This step can also be done offline and the result copied to the production system.

Preparing the OSMX database:

* Download the latest full planet data dump (in PBF format!) from a mirror listed here: https://wiki.openstreetmap.org/wiki/Planet.osm
* Run `osmx expand planet.osm.pbf /k/osm/cache/planet.osmx` to create the OSMX database.
* The downloaded data dump can be discarded afterwards to free some disk space.

This step produces by far the most data, so doing this on a different system is probably not feasible in most cases.

Initial pre-generation of level 11 tiles:

```
# North America
tirex-batch -f not-exists map=vectorosm/v1 x=310-680 y=660-940 z=11
# South America
tirex-batch -f not-exists map=vectorosm/v1 x=560-824 y=1024-1400 z=11
# North Africa, Asia, Europe
tirex-batch -f not-exists map=vectorosm/v1 x=920-2047 y=432-1000 z=11
# South Africa
tirex-batch -f not-exists map=vectorosm/v1 x=1072-1312 y=1000-1232 z=11
# Australia
tirex-batch -f not-exists map=vectorosm/v1 x=1560-2032 y=1000-1320 z=11
```

This enqueues batch jobs for generating all level 11 tiles that don't exist yet. Due to the existence filter this could be re-run,
for example after every server restart, without causing extra generation cost.
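
The `x=` and `y=` ranges passed to `tirex-batch` are slippy-map tile indices. As a sketch of how such ranges could be derived from a geographic bounding box, the helpers below apply the standard Web Mercator tile formula from the Slippy map documentation. The function names `lonlat_to_tile` and `tile_range` are hypothetical, and this is not necessarily how the ranges above were chosen. Latitudes are assumed to stay within roughly ±85°, where the projection is defined.

```shell
#!/bin/sh
# lonlat_to_tile LON LAT ZOOM: print "x y" of the tile containing the coordinate.
# Hypothetical helper using the standard slippy-map tile formula.
lonlat_to_tile() {
    awk -v lon="$1" -v lat="$2" -v z="$3" 'BEGIN {
        pi = atan2(0, -1)
        n = 2 ^ z                 # number of tiles per axis at this zoom level
        phi = lat * pi / 180      # latitude in radians
        x = int((lon + 180) / 360 * n)
        y = int((1 - log(sin(phi) / cos(phi) + 1 / cos(phi)) / pi) / 2 * n)
        print x, y
    }'
}

# tile_range WEST SOUTH EAST NORTH ZOOM: print the box as x/y/z range arguments.
tile_range() (
    nw=$(lonlat_to_tile "$1" "$4" "$5")   # north-west corner: smallest x and y
    se=$(lonlat_to_tile "$3" "$2" "$5")   # south-east corner: largest x and y
    echo "x=${nw% *}-${se% *} y=${nw#* }-${se#* } z=$5"
)
```

For example, `tile_range -10 35 30 60 11` prints a string of the form `x=...-... y=...-... z=11` that can be appended to `tirex-batch -f not-exists map=vectorosm/v1`.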

## Incremental Updates

Run the following command as a daily cron job (for server locations outside of central Europe pick a different mirror):

`osmx-update <path-to>/planet.osmx https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/replication/day/`

## Resource Requirements

For the static low-z tiles:
* 1.2GB disk space, 265k files, 700 directories, 260k inodes for the generated data
* Generation takes about 60-90min (single core), needs about 2GB of temporary disk space, a few 100MB download volume, and ~6GB RAM peak

For the dynamic high-z tiles (estimates and bounds, exact prediction is not possible here):
* Low- to medium-density metatiles (batches of 64 tiles) generate in 100ms or less.
* High-density metatiles take ~15s - this is addressed by pre-generating the level 11 tiles initially.
* The number of parallel processes used for generation can be adjusted in the Tirex config, each process only uses a single core.
* RAM peak should remain well below 1GB per generation process, the exact amount varies with the level of detail of the processed tile.
* Disk space requirement for the generator output varies with access patterns:
    * Access stats from mid 2020 show 44k distinct tiles being used in a two-week period.
    * Metatiles of high-density areas are up to 1.5MB in size, 10x less for lower-density areas.
    * Simply multiplying this results in 66GB and 44k files, however that assumes only distinct high-z tiles are requested.
    * The full world OSM data in o5m format is around 60GB as well, so that is a sensible upper bound for volume.
    * The theoretical upper bound for z17 files is 2^(2*17 - 6) = 268M, however even the
      [OSM access statistics](https://wiki.openstreetmap.org/wiki/Tile_disk_usage) only show about 2.5% of z17 tiles actually being loaded.
      It can further be assumed that tile access is not random but clustered, which further reduces the number of metatiles needed.
    * 10k to 1M files would therefore seem like the best guess for this.

For input data updates:
* Initial download of a full OSM dataset is about 60GB (available on several fast mirrors).
* Initial creation of the OSMX database takes 6h, needs 8GB RAM and generates 700GB on disk in a single file.
* Incremental updates: 100MB download and about 20s CPU time per day, and 6GB RAM peak during that.
* Land polygons:
    * 700MB download
    * 1GB disk space, 16k inodes
    * an additional 1.5GB temporary disk use during generation
    * generation takes 2-3 minutes and 4.5GB RAM

## OSMX Database Rebuilds

Incremental updates of the OSMX database seem to cause it to grow faster than what a clean import produces, and there is
no built-in database compaction command. We therefore have to rebuild it from scratch every couple of months to avoid running
out of disk space.

```
ssh mapsadmin@rhei.kde.org

# Make sure to start the lengthy process in a screen session independent of the SSH session
screen

# Check the scratch space has at least 1TB of free space
cd /mnt/scratch-space/
df -h .

# Download the latest OSM data dump (~15min)
wget https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/pbf/planet-latest.osm.pbf

# Import the OSM dump into the OSMX database (6-9h)
/opt/osm/bin/osmx expand planet-latest.osm.pbf planet.osmx

# Replace the previous OSMX database with the new one (1-2h)
# The server is not able to generate new tiles during this period
cd /var/lib/tirex/cache/
rm planet.osmx planet.osmx-lock
cp /mnt/scratch-space/planet.osmx .

# Check that both the tirex and mapsadmin user can access the OSMX db ("osmx query planet.osmx" must not fail)
chmod 664 planet.osmx-lock
chown mapsadmin:tirex planet.osmx-lock

# Clean up scratch space again
rm /mnt/scratch-space/planet.osmx /mnt/scratch-space/planet.osmx-lock /mnt/scratch-space/planet-latest.osm.pbf
```
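
The rebuild procedure requires at least 1TB of free scratch space before the download starts. If the procedure is ever scripted, that precondition could be checked up front. The following is a minimal sketch assuming a POSIX `df`; the function name `has_free_space` is hypothetical and not part of any tool mentioned here.

```shell
#!/bin/sh
# has_free_space DIR MIN_KIB: succeed if the filesystem holding DIR has at
# least MIN_KIB kibibytes available. Hypothetical helper.
has_free_space() (
    # In "df -P -k" output, line 2, column 4 is the available space in
    # 1024-byte blocks.
    avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge "$2" ]
)
```

A call such as `has_free_space /mnt/scratch-space 1073741824 || echo "less than 1TiB free" >&2` could then guard the download and import steps.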