# Server-side setup notes

Note: this was deployed to maps.kde.org in November 2020.

## Overview

The outside interface for this is the standard [Slippy map](https://wiki.openstreetmap.org/wiki/Slippy_map_tilenames),
under `/earth/vectorosm/v1/`. Zoom levels 1, 3, 5, 7, 9, 11, 13, 15 and 17 are offered. The individual tiles are served
as [o5m](https://wiki.openstreetmap.org/wiki/O5m) encoded files.
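
As an illustration of the slippy map scheme, the following sketch converts a WGS84 coordinate into tile indices and builds a request path below `/earth/vectorosm/v1/`. The function names and the `{z}/{x}/{y}.o5m` path layout, including the file extension, are assumptions made for this example and should be checked against the actual server configuration.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) slippy map tile containing a WGS84 coordinate."""
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon=180 or latitudes beyond the Mercator limit stay valid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url_path(x, y, zoom, prefix="/earth/vectorosm/v1", ext="o5m"):
    """Build a request path; the layout and extension are assumptions."""
    return f"{prefix}/{zoom}/{x}/{y}.{ext}"


# Berlin (13.4 E, 52.52 N) at zoom level 11
x, y = lonlat_to_tile(13.4, 52.52, 11)
print(tile_url_path(x, y, 11))  # /earth/vectorosm/v1/11/1100/671.o5m
```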

Internally this is split into three parts:
* Statically generated low-resolution tiles for zoom levels 1, 3, 5, 7 and 9, based on the [Natural Earth](https://www.naturalearthdata.com/)
data set. Those tiles exist in o5m format on disk in the exact layout they are served by the webserver.
* Dynamically generated high-resolution tiles for zoom levels 11, 13, 15 and 17. Creation and expiry of those is managed
by Tirex, and they are stored in the [metatile](https://wiki.openstreetmap.org/wiki/Tirex/Internals#Metatile_file_structure)
format (8x8 tiles in a single binary file in a 5 layer hashed folder structure). mod_tile takes care of translating that
to the outside interface.
* Input data for the dynamic generation: This is provided via an [OSMX](https://github.com/protomaps/OSMExpress)
database, which allows for fast spatial queries and efficient incremental updates.
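
To make the metatile layout more concrete, here is a minimal sketch of how a tile address might map to its metatile file. The hashing shown (four bits of x and four bits of y per directory level, five levels) reflects my understanding of what mod_tile does and is an assumption here. The function names are made up for this example, and the result is relative to whatever cache directory Tirex is configured with. Verify against the linked metatile documentation before relying on it.

```python
METATILE = 8  # tiles per metatile edge, matching the 8x8 layout described above


def metatile_origin(x, y):
    """Return the top-left tile of the metatile containing tile (x, y)."""
    mask = METATILE - 1
    return x & ~mask, y & ~mask


def metatile_path(x, y, zoom):
    """Return the hashed relative path of the metatile file (assumed scheme)."""
    mx, my = metatile_origin(x, y)
    parts = []
    for _ in range(5):
        # Each level combines the low 4 bits of x and y into one byte.
        parts.append(((mx & 0x0F) << 4) | (my & 0x0F))
        mx >>= 4
        my >>= 4
    # The most significant level comes first in the path.
    return f"{zoom}/" + "/".join(str(p) for p in reversed(parts)) + ".meta"


print(metatile_origin(1100, 671))    # (1096, 664)
print(metatile_path(1100, 671, 11))  # 11/0/0/66/73/136.meta
```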

## Dependencies

The following components are assumed to be on the server:
* Apache2
* mod_tile - https://wiki.openstreetmap.org/wiki/Mod_tile
* Tirex - https://wiki.openstreetmap.org/wiki/Tirex
* osmx and osmx-update - https://github.com/protomaps/OSMExpress (static binary of osmx available there, osmx-update is a Python script)
* marble-vectorosm-tirex-backend

The following components are needed for the static/low-z tile generation and can be run on a different machine:
* Python 3
* osmctools - https://gitlab.com/osm-c-tools/osmctools
* ogr2ogr from gdal (?)
* ne_tilegenerator.py
* marble-vectorosm-tilecreator
* marble-vectorosm-process-land-polygons

Precompiled packages:
* mod_tile: PPA by OSM admin team: https://launchpad.net/~osmadmins/+archive/ubuntu/ppa
* Tirex: PPA by the author: https://launchpad.net/~framm/+archive/ubuntu/tirex - unfortunately only for Ubuntu 18.04.
  To work around this, Debian packages can also be built by running `make` in the `build` sub-directory of this folder.

## Setup

See the configuration files in the `etc/` subdirectory.

### Static low-z tile generation

Run `ne_tilegenerator.py` from `../natural-earth-vector-tiling`. For this to work, all its dependencies (including marble-vectorosm-tilecreator)
have to be in `PATH`.

```
mkdir -p /k/osm/htdocs/earth/vectorosm/v1/
mkdir -p /k/osm/cache/natural_earth
./ne_tilegenerator.py -z 1,3,5,7,9 -f `pwd`/level_info.txt -o /k/osm/htdocs/earth/vectorosm/v1/ -i /k/osm/cache/natural_earth/ -c /k/osm/cache/natural_earth/ -r 30 -ow
```

TODO: this still generates files in its source dir, so probably this is better run inside the cache directory instead?

The source data updates infrequently, so a low-frequency cron job is an option. This can also be done locally or otherwise
off the live system, as the resulting amount of data is small enough to be rsync'ed.

### Dynamic high-z tile generation

Prepare the land polygon input data by running:
`marble-vectorosm-process-land-polygons -c /k/osm/cache`

This step can also be done offline and the result copied to the production system.

Prepare the OSMX database:

* Download the latest full planet data dump (in PBF format!) from a mirror listed here: https://wiki.openstreetmap.org/wiki/Planet.osm
* Run `osmx expand planet.osm.pbf /k/osm/cache/planet.osmx` to create the OSMX database.
* The downloaded data dump can be discarded afterwards to free some disk space.

This step produces by far the most data, so doing it on a different system is probably not feasible in most cases.

Initial pre-generation of level 11 tiles:

```
# North America
tirex-batch -f not-exists map=vectorosm/v1 x=310-680 y=660-940 z=11
# South America
tirex-batch -f not-exists map=vectorosm/v1 x=560-824 y=1024-1400 z=11
# North Africa, Asia, Europe
tirex-batch -f not-exists map=vectorosm/v1 x=920-2047 y=432-1000 z=11
# South Africa
tirex-batch -f not-exists map=vectorosm/v1 x=1072-1312 y=1000-1232 z=11
# Australia
tirex-batch -f not-exists map=vectorosm/v1 x=1560-2032 y=1000-1320 z=11
```

This enqueues batch jobs for generating all level 11 tiles that don't exist yet. Thanks to the existence filter, this can be re-run,
for example after every server restart, without causing extra generation cost.
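
To get a feeling for how much work these batches enqueue, the sketch below counts the 8x8 metatiles touched by each x/y range from the commands above. It assumes that ranges are inclusive and that Tirex rounds them outwards to whole metatiles. The dictionary and function names are made up for this example. Adjacent regions share a row of metatiles around y=1000, so the per-region counts should not simply be added up to get a number of distinct metatiles.

```python
METATILE = 8  # tiles per metatile edge

# (x_min, x_max), (y_min, y_max) at z=11, copied from the tirex-batch calls above
BATCH_RANGES = {
    "North America": ((310, 680), (660, 940)),
    "South America": ((560, 824), (1024, 1400)),
    "North Africa, Asia, Europe": ((920, 2047), (432, 1000)),
    "South Africa": ((1072, 1312), (1000, 1232)),
    "Australia": ((1560, 2032), (1000, 1320)),
}


def count_metatiles(x_range, y_range, size=METATILE):
    """Count metatiles intersecting an inclusive tile range."""
    (x0, x1), (y0, y1) = x_range, y_range
    cols = x1 // size - x0 // size + 1
    rows = y1 // size - y0 // size + 1
    return cols * rows


for region, (xr, yr) in BATCH_RANGES.items():
    print(f"{region}: {count_metatiles(xr, yr)} metatiles")
```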

## Incremental Updates

Run the following command as a daily cron job (for server locations outside of central Europe, pick a different mirror):

`osmx-update <path-to>/planet.osmx https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/replication/day/`

## Resource Requirements

For the static low-z tiles:
* 1.2GB disk space, 265k files, 700 directories, 260k inodes for the generated data
* Generation takes about 60-90min (single core), needs about 2GB of temporary disk space, a few 100MB download volume, and ~6GB RAM peak
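
As a plausibility check for the file and directory counts above, the following sketch computes the theoretical maximum for the static zoom levels. It assumes an on-disk layout of one directory per zoom level, one directory per x column below it, and one file per tile. That layout and the function names are assumptions for this example.

```python
STATIC_ZOOM_LEVELS = (1, 3, 5, 7, 9)


def max_tiles(zoom):
    """A zoom level has 2^zoom columns and 2^zoom rows of tiles."""
    return 4 ** zoom


def max_directories(zoom):
    """One directory for the zoom level itself plus one per x column."""
    return 1 + 2 ** zoom


total_tiles = sum(max_tiles(z) for z in STATIC_ZOOM_LEVELS)
total_dirs = sum(max_directories(z) for z in STATIC_ZOOM_LEVELS)

print(total_tiles)  # 279620, an upper bound for the ~265k files observed
print(total_dirs)   # 687, in line with the ~700 directories observed
```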

For the dynamic high-z tiles (estimates and bounds, exact prediction is not possible here):
* Low- to medium-density metatiles (batches of 64 tiles) generate in 100ms or less.
* High-density metatiles take ~15s - this is addressed by pre-generating the level 11 tiles initially.
* The number of parallel processes used for generation can be adjusted in the Tirex config; each process only uses a single core.
* RAM peak should remain well below 1GB per generation process; the exact amount varies with the level of detail of the processed tile.
* Disk space requirement for the generator output varies with access patterns:
    * Access stats from mid 2020 show 44k distinct tiles being used in a 2w period.
    * Metatiles of high-density areas are up to 1.5MB in size, and about a tenth of that for lower-density areas.
    * Simply multiplying these results in 66GB and 44k files; however, that assumes only distinct high-z tiles are requested.
    * The full world OSM data in o5m format is around 60GB as well, so that is a sensible upper bound for volume.
    * The theoretical upper bound for z17 files is 2^(2*17 - 6) = 268M; however, even the
      [OSM access statistics](https://wiki.openstreetmap.org/wiki/Tile_disk_usage) only show about 2.5% of z17 tiles actually being loaded.
      It can further be assumed that tile access is not random but clustered, which further reduces the number of metatiles needed.
    * 10k to 1M files would therefore seem like the best guess for this.
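
The arithmetic behind these bounds can be reproduced with a few lines. All inputs are the figures quoted in the list above, with decimal units (1GB = 1000MB) assumed. The variable and function names are made up for this example.

```python
DISTINCT_TILES = 44_000      # distinct tiles seen in a two-week period
MAX_METATILE_SIZE_MB = 1.5   # size of a high-density metatile
TILES_PER_METATILE = 64      # 8x8 tiles


def max_metatiles(zoom):
    """Upper bound of metatile files at a zoom level: 4^zoom / 64."""
    return 4 ** zoom // TILES_PER_METATILE


# Worst case: every requested tile lives in its own high-density metatile.
worst_case_gb = DISTINCT_TILES * MAX_METATILE_SIZE_MB / 1000

print(worst_case_gb)      # 66.0
print(max_metatiles(17))  # 268435456, i.e. 2^(2*17 - 6)
```

Lower zoom levels add comparatively little to this bound, as each step from z17 down to z15, z13 and z11 reduces the maximum number of metatiles by a factor of 16.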

For input data updates:
* Initial download of a full OSM dataset is about 60GB (available on several fast mirrors).
* Initial creation of the OSMX database takes 6h, needs 8GB RAM and generates 700GB on disk in a single file.
* Incremental updates: 100MB download and about 20s CPU time per day, and 6GB RAM peak during that.
* Land polygons:
    * 700MB download
    * 1GB disk space, 16k inodes
    * and an additional 1.5GB temporary disk use during generation
    * generation takes 2-3 minutes and 4.5GB RAM

## OSMX Database Rebuilds

Incremental updates of the OSMX database seem to make it grow faster than a clean import would, and there is
no built-in database compaction command. We therefore have to rebuild it from scratch every couple of months to avoid running
out of disk space.

```
ssh mapsadmin@rhei.kde.org

# Make sure to start the lengthy process in a screen session independent of the SSH session
screen

# Check the scratch space has at least 1TB of free space
cd /srv/scratch/maps/

# Download the latest OSM data dump (~15min)
wget https://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/pbf/planet-latest.osm.pbf

# Import the OSM dump into the OSMX database (6-9h)
/opt/osm/bin/osmx expand planet-latest.osm.pbf planet.osmx

# Replace the previous OSMX database with the new one (1-2h)
# The server is not able to generate new tiles during this period
cd /var/cache/tirex/cache/
rm planet.osmx planet.osmx-lock
cp /srv/scratch/maps/planet.osmx .

# Check that both the tirex and mapsadmin user can access the OSMX db ("osmx query planet.osmx" must not fail)
chmod 664 planet.osmx-lock
chown mapsadmin:_tirex planet.osmx-lock

# Clean up scratch space again
rm /srv/scratch/maps/planet.osmx /srv/scratch/maps/planet.osmx-lock /srv/scratch/maps/planet-latest.osm.pbf
```
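
The free-space check at the start of the procedure above could be scripted rather than done by eye. This is a minimal sketch using Python's standard library. The function name is made up for this example, and the 1TB threshold is taken from the comment in the procedure and interpreted as 10^12 bytes.

```python
import shutil

REQUIRED_BYTES = 10 ** 12  # 1TB, as required for the scratch space above


def has_free_space(path, required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem containing path has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes
```

Calling `has_free_space("/srv/scratch/maps/")` before starting the download would then catch a full scratch volume early, rather than several hours into the import.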