Relational Database Management System
Seeing as how the data is fairly regular in columns, and there are obvious relations between the Voyage Reports, Position Reports and the locations that they’re in, it makes sense to use a relational database as a datastore. PostgreSQL has access to spatial and time-series add-ons; PostGIS and TimeScaleDB
There are some great Postgres+PostGIS+Timescale docker images out there but many of these do not include some PostGIS helper functions for loading geometry into a database. To get these functions use the following dockerfile:
FROM timescale/timescaledb-ha:pg13.3-ts2.3.1-latest
USER root
RUN apt-get update \
&& apt-get install -y wget postgis unzip \
&& rm -rf /var/lib/apt/lists/*
USER postgres
Which uses a PostGIS+Timescale image and adds the “postgis” and “unzip” libraries.
The database is created, on first run, using several initialisation scripts and data files. If your docker volumes are configured correctly this shouldn’t happen on the second+ runs. These init files are located here.
The database hierarchical structure is shown in the below figure:
The data types are broadly separated into different schemas. Each schema still benefits from plugins installed on the RDMS, and activated in the Vessels database. The schema’s are:
- AIS: Contains AIS position and voyage reports as well as AIS derived data
- Geo: Contains geographical tables representing maritime objects like ocean boundaries, port locations, grids to use in areas of interest
- TimescaleDB: There are multiple internal schemas generated by the TimescaleDB plugin. It’s best to leave these alone…
- PostgisFTW: This is the default schema that exposes functions and collections to the web, via a rest API managed by Pg_Featureserv.
- Data X/Y: These are examples that could contain other datasets like, Synthetic Aperture Radar (SAR) detections, port radar detections etc.
The Geo schema is created on init when multiple shapefiles are pulled from MarineRegions, loaded into the DB and spatially indexed. The goal of this table is to provide human readable context to vessel positions. The tables within are:
- admin_0_countries: The world country boundaries. Anything inside here is NOT in the ocean.
- eez_12nm: The 12 nautical mile boundary for coastal countries.
- eez_24nm: The 24 nm boundary
- world_eez: The 200 nm boundary
- eez_archipelagic_waters: Archepelagic waters. These are not oceanic but also not rivers.
- eez_internal_waters: Internal waters are lagoons/bays/river mouths that are not open oceanic but also not rivers.
- oceans_world: These are the ocean boundaries. Ever wonder where the Indian and the Atlantic split?
- sampaz: South African Marine Protected Areas
- world_port_index: List of World Ports, their locations and other information. From here.
There are also several materialized views contained in here that are either grids generated by a function call, or are aggregations of the tables in order to provide a single source of hierarchical geographic data. See image below:
The AIS schema is split into 3 catagories: AIS position reports, AIS voyage reports and protocal translation tables:
- ais_num_to_type: Look-up table to convert an AIS Class number (eg: 74) into a human readable format (Cargo Vessel; Hazardous Type D).
- mid_to_country: Look-up table to convert the MID (first 3 characters of MMSI) into a Country ID. EG: MMSI 237xxxxxx would be a vessel from Greece.
- nav_status: Look-up table to convert AIS Navigation Status (eg: 7) to human readable text (Engaged in Fishing)
- pos_reports: Data table. This is a TimescaleDB hypertable used to contain AIS position reports. This table is automatically partitioned and used for continuous aggregates. The position reports typically contain information on the position and vector of vessels.
- voy_reports: Data table. This is a TimescaleDB hypertable used to contain AIS voyage reports. This table is automatically partitioned and used for continuous aggregates. Voyage reports typically contain information on the name/callsign of vessels and their voyage status.
- latest_voy_reports: This is an table that always has the latest information on voyage reports for each vessel. Any gaps in the data are filled using a “last observation carried forward” method.
PostGIS For The Web! This holds data collections and functions, and their permissions, in order to expose them via a REST API. Wonderful!