Description

This article explains how to configure a GIS service that you can use with your applications to provide a geographical representation of information.

Figure: GIS service

This article assumes that you already have a Kubernetes cluster configured. If you don’t have one, you can refer to the article on configuring Kubernetes or use a preconfigured cluster.

In the above diagram we can see several components:

  • a Postgres database with the PostGIS extension installed, which holds the geographic data
    • we will configure a persistent volume on “worker-02” so that the database always runs on that node
  • QGIS Server – a WMS service that uses the geographic data to render tiles (images) on request
  • MapProxy – a caching server that stores tiles, either on the fly as they are requested or seeded in advance

All code can be found here.

Configuration

Step 1

In this step we configure a namespace for the project and create a folder on one of the worker nodes. The folder will back the persistent volume that ties the Postgres container to that node.

Bash
# create a namespace for the project
sabin@sabin-pc:~$ kubectl create namespace gis
namespace/gis created

# create the folder that will back the persistent volume for postgres
sabin@sabin-pc:~$ ssh k8suser@10.17.2.2

k8suser@worker-02:~$ cd /static-storage/

k8suser@worker-02:~$ mkdir postgres-qgis-data

k8suser@worker-02:~$ exit

Step 2

Create a storage class for local volumes. The “no-provisioner” setting means that volumes are not provisioned dynamically, so we have to create the persistent volume ourselves.

YAML
apiVersion: storage.k8s.io/v1
kind: StorageClass
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Delete
metadata:
    labels:
        app.kubernetes.io/instance: local-storage
        app.kubernetes.io/name: qgis-static-storage
    name: qgis-static-storage
Bash
# create the storage class resource on the k8s cluster
# note that a storage class is cluster-scoped (it has no namespace)
sabin@sabin-pc:~/map-service/k8s/storage$ kubectl apply -f qgis-static-storage.yaml

Step 3

Create a persistent volume backed by local storage on “worker-02” to store the Postgres data.

YAML
# postgres-qgis-data.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  labels:
    app.kubernetes.io/storage-class: qgis-static-storage
  name: postgres-qgis-data
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 45Gi
  local:
    path: /static-storage/postgres-qgis-data
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-02
  persistentVolumeReclaimPolicy: Retain
  storageClassName: qgis-static-storage
Bash
sabin@sabin-pc:~/map-service/k8s/storage$ kubectl apply -f postgres-qgis-data.yaml

sabin@sabin-pc:~/map-service/k8s/storage$ kubectl describe pv postgres-qgis-data
Name:              postgres-qgis-data
Labels:            app.kubernetes.io/storage-class=qgis-static-storage
Annotations:       <none>
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      qgis-static-storage
Status:            Available
Claim:
Reclaim Policy:    Retain
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          45Gi
Node Affinity:
  Required Terms:
    Term 0:        kubernetes.io/hostname in [worker-02]
Message:
Source:
    Type:  LocalVolume (a persistent volume backed by local storage on a node)
    Path:  /static-storage/postgres-qgis-data
Events:    <none>
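
The same check can be scripted. The sketch below is our own assumption and not part of the original setup: it takes the JSON printed by "kubectl get pv postgres-qgis-data -o json" and pulls out the phase, the capacity and the node the volume is pinned to. The function name pv_summary is hypothetical.

```python
import json


def pv_summary(pv_json: str) -> dict:
    """Summarise a PersistentVolume from its `kubectl get pv -o json` output."""
    pv = json.loads(pv_json)
    spec = pv["spec"]
    terms = (
        spec.get("nodeAffinity", {})
        .get("required", {})
        .get("nodeSelectorTerms", [])
    )
    # Collect every hostname the volume is allowed to be used on.
    nodes = [
        value
        for term in terms
        for expr in term.get("matchExpressions", [])
        if expr.get("key") == "kubernetes.io/hostname"
        for value in expr.get("values", [])
    ]
    return {
        "name": pv["metadata"]["name"],
        "phase": pv.get("status", {}).get("phase"),
        "capacity": spec["capacity"]["storage"],
        "storage_class": spec.get("storageClassName"),
        "nodes": nodes,
    }
```

Before the claim from the next step exists, the phase should read Available. Once the Postgres pod has claimed the volume, it should read Bound.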

Step 4

In this step we create an “Opaque” secret that holds the configuration for the Postgres server and its clients. After that we create a StatefulSet for Postgres. Its pod will be scheduled on “worker-02” because its volume claim references the storage class defined earlier, and the only persistent volume in that class is pinned to that node. We also need a service to access the server.

YAML
# secret-config.yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
type: Opaque
# change the values below for your own setup
stringData:
  POSTGRES_DB: "db-data"
  POSTGRES_USER: "db-user"
  POSTGRES_PASSWORD: "db-password"
  PGDATA: "/dbdata"
  pg_service.conf: |
    [postgres_svc]
    host=postgres-primary
    port=5432
    dbname=db-data
    user=db-user
    password=db-password
    sslmode=disable
Bash
# create the k8s cluster secret resource
# observe the namespace "gis"
sabin@sabin-pc:~/map-service/k8s/database$ kubectl -n gis apply -f secret-config.yaml

# you could describe the resource
sabin@sabin-pc:~/map-service/k8s/database$ kubectl -n gis describe secret postgres-secret
YAML
# postgres.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-primary
spec:
  replicas: 1
  serviceName: postgres-primary
  selector:
    matchLabels:
      myLabel: postgres-primary
  template:
    metadata:
      labels:
        myLabel: postgres-primary
    spec:
      containers:
        - name: postgres-server
          image: postgis/postgis
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: POSTGRES_DB
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: POSTGRES_USER
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: POSTGRES_PASSWORD
            - name: PGDATA
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: PGDATA
          volumeMounts:
            - name: data
              mountPath: /dbdata
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 25Gi
        storageClassName: qgis-static-storage
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-primary
spec:
  type: NodePort
  selector:
    myLabel: postgres-primary
  ports:
    - port: 5432
      targetPort: 5432
      nodePort: 31432
Bash
# create the StatefulSet and Service resources
sabin@sabin-pc:~/map-service/k8s/database$ kubectl -n gis apply -f postgres.yaml

# you should see the pod running
sabin@sabin-pc:~/map-service/k8s/database$ kubectl -n gis get pods

# test service connection
sabin@sabin-pc:~/map-service/k8s/database$ telnet 10.17.2.1 31432
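
If telnet is not installed, the same reachability test can be sketched in a few lines of Python. This is only an illustration of ours: the helper name port_is_open is hypothetical, and it checks nothing more than that a TCP connection to the NodePort can be opened.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


# Example call with the node IP and NodePort used in this article:
#   port_is_open("10.17.2.1", 31432)
```

A True result only tells us that something is listening on the port. It does not prove that Postgres accepts our credentials.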

At this point we can load some geographical data into the new database. We chose OSM (OpenStreetMap) as our source. We will take the map of Romania, convert it with the help of osm2pgsql and upload it to our Postgres database. This is a very detailed map with 4 layers that translate to 4 tables. You will not always need such a detailed map, and you can also define your own. Other regions can be found here.

Bash
# download your region of interest
sabin@sabin-pc:~/map-service/k8s/database$ wget https://download.geofabrik.de/europe/romania-latest.osm.bz2

sabin@sabin-pc:~/map-service/k8s/database$ bzip2 -d romania-latest.osm.bz2

# upload it to our configured database in the cluster
# through the configured NodePort of the service
# use the database credentials configured in the previous step
sabin@sabin-pc:~/map-service/k8s/database$ osm2pgsql -c -d "db-data" -U "db-user" -H 10.17.2.1 -P 31432 -W romania-latest.osm
2023-11-08 09:26:37  osm2pgsql version 1.6.0
Password:
2023-11-08 09:26:43  Database version: 16.0 (Debian 16.0-1.pgdg110+1)
2023-11-08 09:26:43  PostGIS version: 3.4
2023-11-08 09:26:43  Setting up table 'planet_osm_point'
2023-11-08 09:26:44  Setting up table 'planet_osm_line'
2023-11-08 09:26:44  Setting up table 'planet_osm_polygon'
2023-11-08 09:26:45  Setting up table 'planet_osm_roads'
2023-11-08 09:33:28  Reading input files done in 403s (6m 43s).
2023-11-08 09:33:28    Processed 32642946 nodes in 42s - 777k/s
2023-11-08 09:33:28    Processed 3079863 ways in 286s (4m 46s) - 11k/s
2023-11-08 09:33:28    Processed 59876 relations in 75s (1m 15s) - 798/s
2023-11-08 09:33:46  Clustering table 'planet_osm_point' by geometry...
2023-11-08 09:33:46  Clustering table 'planet_osm_line' by geometry...
2023-11-08 09:33:46  Clustering table 'planet_osm_roads' by geometry...
2023-11-08 09:33:46  Clustering table 'planet_osm_polygon' by geometry...
2023-11-08 09:34:43  Creating geometry index on table 'planet_osm_roads'...
2023-11-08 09:34:45  Analyzing table 'planet_osm_roads'...
2023-11-08 09:34:51  Creating geometry index on table 'planet_osm_point'...
2023-11-08 09:35:09  Analyzing table 'planet_osm_point'...
2023-11-08 09:35:09  All postprocessing on table 'planet_osm_point' done in 83s (1m 23s).
2023-11-08 09:36:32  Creating geometry index on table 'planet_osm_line'...
2023-11-08 09:36:38  Analyzing table 'planet_osm_line'...
2023-11-08 09:36:38  All postprocessing on table 'planet_osm_line' done in 172s (2m 52s).
2023-11-08 09:36:51  Creating geometry index on table 'planet_osm_polygon'...
2023-11-08 09:36:57  Analyzing table 'planet_osm_polygon'...
2023-11-08 09:36:57  All postprocessing on table 'planet_osm_polygon' done in 191s (3m 11s).
2023-11-08 09:36:57  All postprocessing on table 'planet_osm_roads' done in 60s (1m 0s).
2023-11-08 09:36:57  osm2pgsql took 615s (10m 15s) overall.


# osm2pgsql doesn't create a primary key, so we add one to each table
sabin@sabin-pc:~/map-service/k8s/database$ kubectl -n gis exec -it postgres-primary-0 -- /bin/bash

root@postgres-primary-0:/# psql -U db-user db-data
psql (16.0 (Debian 16.0-1.pgdg110+1))
Type "help" for help.

db-data=# ALTER TABLE planet_osm_point ADD gid serial PRIMARY KEY;
db-data=# ALTER TABLE planet_osm_line ADD gid serial PRIMARY KEY;
db-data=# ALTER TABLE planet_osm_polygon ADD gid serial PRIMARY KEY;
db-data=# ALTER TABLE planet_osm_roads ADD gid serial PRIMARY KEY;

To test that everything is OK you can use QGIS Desktop. To manage several Postgres connections you can use connection service configurations, which work on both Windows and Linux.

Plaintext
[postgres_svc]
    host=10.17.2.1
    port=31432
    dbname=db-data
    user=db-user
    password=db-password
    sslmode=disable
Figure: QGIS Postgres connection and Postgres service configuration
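
A service file is a plain INI-style file, so it is easy to sanity-check before pointing QGIS at it. The sketch below is an assumption of ours and not part of the original project: it parses the file content with Python's configparser and verifies that a service defines the keys we rely on. The name read_service and the list of required keys are our own choices.

```python
import configparser

# Keys we expect a usable service entry to define (our own choice for this setup).
REQUIRED_KEYS = ("host", "port", "dbname", "user", "password")


def read_service(text: str, service: str) -> dict:
    """Parse pg_service.conf content and return the parameters of one service."""
    # Interpolation is disabled so that a '%' in a password is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(service):
        raise KeyError(f"service [{service}] not found")
    params = dict(parser[service])
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise ValueError(f"service [{service}] is missing: {', '.join(missing)}")
    return params
```

Calling read_service with the content of the file above and the service name postgres_svc should return the host, port and credentials as a dictionary of strings.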

We can now see that QGIS Desktop can query the Postgres database and render the layers. But something is not right: it doesn’t look like the maps we are used to. This is because no styles have been applied yet. We can apply styles to the layers with the help of the MultiQML plugin.

Figure: Romania rendered map

After you install MultiQML in QGIS Desktop, through the “Plugins” menu entry, you can apply some styles. I chose the styles from this repo, which match the layers in our database.

The result should be a nice-looking map similar to OSM or Google Maps.

So far, we have a single instance of QGIS Desktop that connects to the Postgres database and uses a configuration file with styles to render the map. The next step is to have a WMS service that can split requests across multiple instances and render maps. Do not worry about rendering latency for now; we will address it when we talk about caching. To move forward, we need to save the QGIS Desktop project to a “romania.qgs” file that will be used by the WMS service implemented by QGIS Server.

Step 5

We will now configure the QGIS Server deployment that handles WMS requests and returns rendered tiles. This offloads the rendering to the server and leaves stitching the images together to the client, which nowadays is often a browser.

YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qgis
spec:
  replicas: 3
  selector:
    matchLabels:
      myLabel: qgis
  template:
    metadata:
      labels:
        myLabel: qgis
    spec:
      containers:
        - name: qgis
          image: flobinsa/qgis-server:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5555
            - containerPort: 8080
          env:
            - name: NGINX_ENABLE
              value: 'true'
            - name: QGIS_SERVER_PARALLEL_RENDERING
              value: 'true'
            - name: QGIS_SERVER_LOG_LEVEL
              value: '0'
          volumeMounts:
            - name: postgres-config-file
              mountPath: /etc/postgresql-common/
              readOnly: true
            - name: qgs-resources
              mountPath: /data
      volumes:
        - name: postgres-config-file
          secret:
            secretName: postgres-secret
            items:
              - key: pg_service.conf
                path: pg_service.conf
        - name: qgs-resources
          nfs:
            server: 10.18.0.2
            path: /nfs_data/qgs/qgs-resources
---
apiVersion: v1
kind: Service
metadata:
  name: qgis-service
spec:
  type: NodePort
  selector:
    myLabel: qgis
  ports:
    - name: "qgis-fcgi"
      port: 5555
      targetPort: 5555
      nodePort: 31080
    - name: "nginx-alt-http"
      port: 80
      targetPort: 8080
      nodePort: 32080

Bash
# We need to copy the generated project file from QGIS Desktop
# to the NFS server
sabin@sabin-pc:~/map-service/k8s/qgis$ scp romania.qgs nfsuser@10.18.0.2:/nfs_data/qgs/qgs-resources

sabin@sabin-pc:~/map-service/k8s/qgis$ kubectl -n gis apply -f qgis.yaml

sabin@sabin-pc:~/map-service/k8s/qgis$ kubectl -n gis get pods
NAME                        READY   STATUS    RESTARTS   AGE
postgres-primary-0          1/1     Running   0          4d10h
qgis-5df45ddbdb-5wkck       1/1     Running   0          1m
qgis-5df45ddbdb-vgmzq       1/1     Running   0          1m
qgis-5df45ddbdb-wzflq       1/1     Running   0          1m

If you go back to QGIS Desktop and create a new WMS connection with the URL http://worker-01:32080/ows/?MAP=/data/romania.qgs, you should see the same map as when querying the Postgres database directly.
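
The service can also be tested without QGIS Desktop by requesting a single image over WMS. Below is a minimal sketch, under our own assumptions, that builds a WMS 1.1.1 GetMap URL. The helper name getmap_url is hypothetical, and the layer names are assumed to match the table names created by osm2pgsql; they depend on how the layers are named in the saved project.

```python
from urllib.parse import urlencode


def getmap_url(base_url, project, layers, bbox, width=512, height=512):
    """Build a WMS 1.1.1 GetMap URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {
        "MAP": project,
        "SERVICE": "WMS",
        "VERSION": "1.1.1",  # 1.1.1 keeps lon/lat axis order for EPSG:4326
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Example: a small extent around Bucharest (coordinates are approximate).
url = getmap_url(
    "http://worker-01:32080/ows/",
    "/data/romania.qgs",
    ["planet_osm_polygon", "planet_osm_roads"],  # assumed layer names
    (25.9, 44.3, 26.3, 44.6),
)
print(url)
```

Opening the printed URL in a browser, or fetching it with curl, should return a PNG image if the project and the layer names are correct.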

Step 6

Now that the QGIS deployment is in place, we can browse the map. Each time we change the view, a query is sent to the cluster and received by one of the QGIS instances. That instance connects to Postgres, queries the database for the requested extent and then applies the styles from the “*.qgs” file. This is very slow. To improve it, we add a caching layer with the help of MapProxy. MapProxy stores the tiles resulting from the WMS queries and serves them to subsequent requests.
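
The idea behind such a cache can be shown with a toy example. This is a conceptual sketch only and not how MapProxy is implemented: tiles are kept in a dictionary keyed by (zoom, column, row), and the slow render function is called only on a miss. All names are hypothetical.

```python
from typing import Callable, Dict, Tuple

TileKey = Tuple[int, int, int]  # (zoom, column, row)


class TileCache:
    """Serve tiles from memory and call the slow renderer only on a miss."""

    def __init__(self, render: Callable[[TileKey], bytes]) -> None:
        self._render = render
        self._store: Dict[TileKey, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: TileKey) -> bytes:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Miss: render once, remember the result for later requests.
        self.misses += 1
        tile = self._render(key)
        self._store[key] = tile
        return tile
```

In our setup the renderer role is played by QGIS Server and the store is the cache directory that the MapProxy pods share.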

To cache the map we have 2 options:

  • use a client and browse the map at various levels in the regions of interest
  • seed the cache with the help of the mapproxy-seed utility
YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mapproxy
spec:
  replicas: 3
  selector:
    matchLabels:
      myLabel: mapproxy
  template:
    metadata:
      labels:
        myLabel: mapproxy
    spec:
      containers:
        - name: mapproxy
          image: flobinsa/mapproxy:latest
          imagePullPolicy: Always
          env:
            - name: QGIS_URL
              value: 'http://qgis-service/ows/?MAP=/data/romania.qgs'
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: cache-data
              mountPath: /opt/mapproxy/cache_dir
      volumes:
        - name: cache-data
          nfs:
            server: 10.18.0.2
            path: /nfs_data/mapproxy
---
apiVersion: v1
kind: Service
metadata:
  name: mapproxy-service
spec:
  type: NodePort
  selector:
    myLabel: mapproxy
  ports:
    - port: 80
      targetPort: 9090
      nodePort: 30090
Bash
sabin@sabin-pc:~/map-service/k8s/mapproxy$ kubectl -n gis apply -f mapproxy.yaml

sabin@sabin-pc:~/map-service/k8s/qgis$ kubectl -n gis get pods
NAME                        READY   STATUS    RESTARTS   AGE
mapproxy-7cd8bbdd9b-7qggl   1/1     Running   0          2m
mapproxy-7cd8bbdd9b-gd7qn   1/1     Running   0          2m
mapproxy-7cd8bbdd9b-vrqmc   1/1     Running   0          2m
postgres-primary-0          1/1     Running   0          4d12h
qgis-5df45ddbdb-5wkck       1/1     Running   0          27h
qgis-5df45ddbdb-vgmzq       1/1     Running   0          27h
qgis-5df45ddbdb-wzflq       1/1     Running   0          27h

# Connect to one of the instances to seed the map.
# You can also seed the cache from a local computer using the NodePort.
# The latter is preferred because you can use a higher concurrency.

mapproxy@mapproxy-7cd8bbdd9b-7qggl:/opt/mapproxy$ mapproxy-seed -f mapproxy.yaml --continue seed.yaml
YAML
# seed example
# beware of inode exhaustion: start small, with a city or a smaller region of interest (ROI)
cleanups: {}
seeds:
  all_seed:
    caches: [all_cache]
    coverages: [romania]
    levels:
      from: 0
      to: 19
coverages:
  romania:
    bbox: [20.435274481, 42.337529387, 29.899295252, 49.358316310]
    srs:  'EPSG:4326'
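
To get a feeling for the inode warning, we can estimate how many tiles a seed run would create. The sketch below assumes a standard Web Mercator grid with 2^zoom by 2^zoom tiles per level. The grid of all_cache is not shown in this article, so treat the numbers as an order of magnitude only. The function names are hypothetical.

```python
import math

MAX_LAT = 85.0511  # latitude limit of the Web Mercator projection


def tile_index(lon: float, lat: float, zoom: int) -> tuple:
    """Return the (column, row) of the tile containing a lon/lat point."""
    n = 2 ** zoom
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp to the valid range (lon = 180 would otherwise give column n).
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(min_lon, min_lat, max_lon, max_lat, zoom):
    """Count the tiles that intersect a lon/lat bounding box at one zoom level."""
    x0, y1 = tile_index(min_lon, min_lat, zoom)  # south-west corner (largest row)
    x1, y0 = tile_index(max_lon, max_lat, zoom)  # north-east corner (smallest row)
    return (x1 - x0 + 1) * (y1 - y0 + 1)


def total_tiles(bbox, first_level, last_level):
    """Sum the tile counts over an inclusive range of zoom levels."""
    return sum(
        tiles_in_bbox(*bbox, zoom) for zoom in range(first_level, last_level + 1)
    )


# Romania bounding box from the seed file, as (min_lon, min_lat, max_lon, max_lat).
romania = (20.435274481, 42.337529387, 29.899295252, 49.358316310)
for level in (10, 14, 19):
    print(level, tiles_in_bbox(*romania, level))
```

Under these assumptions, level 19 alone comes to roughly 200 million tiles for the whole country, and each level has about four times as many tiles as the one before it. Seeding a single city, or stopping a few levels earlier, is a much safer start.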

If we go back to QGIS Desktop and add a new WMS connection to http://worker-01:30090/wms, we can see the map loading faster than before.

Conclusions

This article presented the components of a mapping service that you can use for your applications. It started from a geographical database populated from an open data source (OSM), then added a distributed rendering engine (QGIS Server) and a caching component (MapProxy).

Things to analyze and improve:

  • security
    • authentication
    • restrict requests – big tiles can overload the system
    • serve cache for background, use database for dynamic information
    • request limiting per user
    • benchmark your solution by recording and replaying a normal usage
  • switch NFS to local storage where possible
  • consider an ingress controller or a reverse proxy
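
As a starting point for the “restrict requests” item, a reverse proxy or a small gateway could reject oversized GetMap requests before they reach the renderer. The sketch below is only an illustration of that idea: the 2048 pixel limit and the function name getmap_size_ok are our own assumptions.

```python
from urllib.parse import parse_qs, urlparse

MAX_SIDE = 2048  # hypothetical upper limit for WIDTH and HEIGHT, in pixels


def getmap_size_ok(url: str, max_side: int = MAX_SIDE) -> bool:
    """Return False for GetMap requests whose image size is missing or too large."""
    raw = parse_qs(urlparse(url).query)
    # WMS parameter names are case-insensitive, so normalise them.
    query = {name.upper(): values[-1] for name, values in raw.items()}
    if query.get("REQUEST", "").lower() != "getmap":
        return True  # only GetMap requests are size-limited here
    try:
        width = int(query["WIDTH"])
        height = int(query["HEIGHT"])
    except (KeyError, ValueError):
        return False
    return 0 < width <= max_side and 0 < height <= max_side
```

A gateway would answer with an error for every URL where this check returns False and forward the rest to MapProxy or QGIS Server.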
