Aggregate Processing Time Series Data
AIS data is inherently a time series. Displaying all the raw data to users rarely makes sense: it can mean thousands of AIS points, each carrying little information, where a few regularly spaced messages would serve better.
TimescaleDB is used to provide several continuous aggregates and scheduled aggregation functions that reduce the amount of data given to users while preserving information.
TimescaleDB has several tools to create and manage continuous aggregates. The database start-up scripts include one that creates several continuous aggregates from the raw AIS tables.
The Vessel Details Continuous Aggregate provides a single row per 12-hour period for each vessel that transmits a voyage report. Each row contains the most recent information on things like draught, radio callsign, ETA and destination as reported by AIS Voyage Reports. The purpose of this is to provide semi-static information on vessels and to highlight changes over time.
NOTE: The Bucket field in the CAGG is indexed, NOT the event_time column. Time-specific queries should therefore filter on Bucket.
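As a minimal sketch of the note above, the helper below builds a parameterised query that filters on the bucket column rather than on the raw message timestamp. The view name `ais.vessel_details_cagg` and the helper itself are hypothetical, chosen only for illustration; substitute the names created by the start-up script.

```python
from datetime import datetime, timezone

# Hypothetical view name: replace with the continuous aggregate in your database.
DEFAULT_VIEW = "ais.vessel_details_cagg"


def vessel_details_query(start, end, view=DEFAULT_VIEW):
    """Return (sql, params) selecting rows whose bucket lies in [start, end)."""
    # Filtering on the indexed bucket column lets the database use the index.
    # The view name is interpolated directly, so only pass trusted values.
    sql = (
        f"SELECT * FROM {view} "
        "WHERE bucket >= %(start)s AND bucket < %(end)s "
        "ORDER BY bucket"
    )
    return sql, {"start": start, "end": end}


sql, params = vessel_details_query(
    datetime(2023, 1, 1, tzinfo=timezone.utc),
    datetime(2023, 1, 2, tzinfo=timezone.utc),
)
print(sql)
```

The `%(name)s` placeholders follow the style used by common PostgreSQL drivers for Python, so the pair can be passed to a cursor's `execute` call.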
The Hourly Position Report Continuous Aggregate provides the latest position report, per vessel, per half hour. This is used to provide medium-term trajectories of vessels. Even a vessel that reports reliably has at most 48 rows per day, which is much easier to display on a web front end than several hundred or thousand points.
NOTE: The Bucket field in the CAGG is indexed, NOT the event_time column. Time-specific queries should therefore filter on Bucket.
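The downsampling this aggregate performs can be illustrated in plain Python. The sketch below is not the database implementation; it only mimics the idea of keeping the latest report per vessel per 30-minute bucket. The tuple layout `(mmsi, event_time, lon, lat)` and the function names are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(minutes=30)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, width=BUCKET):
    """Floor a timezone-aware timestamp to the start of its bucket."""
    return ts - (ts - EPOCH) % width


def latest_per_bucket(reports):
    """Keep the newest report for each (mmsi, bucket) pair.

    reports: iterable of (mmsi, event_time, lon, lat) tuples.
    """
    latest = {}
    for report in reports:
        mmsi, event_time = report[0], report[1]
        key = (mmsi, bucket_start(event_time))
        if key not in latest or event_time > latest[key][1]:
            latest[key] = report
    return latest


def t(hour, minute):
    """Build a UTC timestamp on an arbitrary example day."""
    return datetime(2023, 1, 1, hour, minute, tzinfo=timezone.utc)


reports = [
    (123456789, t(0, 5), 1.00, 51.00),
    (123456789, t(0, 25), 1.01, 51.01),  # latest in the 00:00 bucket
    (123456789, t(0, 40), 1.02, 51.02),  # only report in the 00:30 bucket
]
sampled = latest_per_bucket(reports)
for (mmsi, bucket), report in sorted(sampled.items()):
    print(mmsi, bucket.isoformat(), report[1].isoformat())
```

Three raw messages collapse to two rows here, one per half-hour bucket, which is the same reduction that caps a vessel at 48 rows per day.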
The Daily Position Report Continuous Aggregate is similar to the Hourly one, but only stores 2 points per day. This is useful for displaying long-term vessel locations over several weeks or months.
NOTE: The Bucket field in the CAGG is indexed, NOT the event_time column. Time-specific queries should therefore filter on Bucket.
The Daily Trajectory Aggregate contains one row per vessel, per day. This row contains a line built from the Hourly Position Aggregate. No GPS lock errors or MMSI duplicates are filtered out, so using this line to represent vessel trajectories will include errors. For an error-free trajectory, users should use the daily trajectory table built from the User Defined Action below.
NOTE: The traj_start_time field in the CAGG is indexed, NOT the event_time or traj_end_time columns. Time-specific queries should therefore filter on traj_start_time.
TimescaleDB allows users to create and manage user-defined actions. These are functions that are called on a schedule to perform a database task such as aggregating, deleting or moving data. Several custom functions are created by a script on DB initialisation and then scheduled.
This script takes AIS data from “yesterday” and builds a trajectory from it that is broken on gaps in time, speed or location. Read more here.
This data is inserted into a table and made available via an API end point.
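The gap-breaking idea described above can be sketched in plain Python as follows. This is an illustration only, not the SQL used by the scheduled action: the thresholds (`max_gap`, `max_jump_km`, `max_speed_kmh`), the tuple layout `(event_time, lon, lat)` and the function names are all assumptions made for the example.

```python
import math
from datetime import datetime, timedelta, timezone


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def split_trajectory(points, max_gap=timedelta(hours=1),
                     max_jump_km=50.0, max_speed_kmh=100.0):
    """Split time-sorted (event_time, lon, lat) points into segments.

    A new segment starts whenever the step from the previous point exceeds
    the time gap, distance, or implied speed threshold.
    """
    segments, current = [], []
    for point in points:
        if current:
            prev = current[-1]
            dt = point[0] - prev[0]
            dist = haversine_km(prev[1], prev[2], point[1], point[2])
            hours = dt.total_seconds() / 3600.0
            if hours > 0:
                speed = dist / hours
            else:
                # Same timestamp: any movement implies an impossible speed.
                speed = math.inf if dist > 0 else 0.0
            if dt > max_gap or dist > max_jump_km or speed > max_speed_kmh:
                segments.append(current)
                current = []
        current.append(point)
    if current:
        segments.append(current)
    return segments


t0 = datetime(2023, 1, 1, tzinfo=timezone.utc)
points = [
    (t0, 0.00, 0.0),
    (t0 + timedelta(minutes=10), 0.01, 0.0),
    (t0 + timedelta(minutes=20), 0.02, 0.0),
    (t0 + timedelta(hours=5), 0.03, 0.0),               # long silence: time gap
    (t0 + timedelta(hours=5, minutes=10), 5.00, 0.0),   # implausible jump
]
segments = split_trajectory(points)
print([len(segment) for segment in segments])
```

Breaking the line at implausible steps is what keeps a bad GPS fix or a duplicated MMSI from drawing a straight line across the map, at the cost of producing several shorter segments per vessel per day.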
This script takes AIS data and overlays this onto a grid using a method described here. In the quick start deployment project the grid used is a very coarse global hexagon grid. This grid is intended as a demonstration rather than a useful product. Users should probably edit the SQL and User Defined Actions to make use of a grid that is useful for their input AIS data.
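To illustrate what overlaying points onto a grid means, the sketch below counts positions per cell. It deliberately swaps the hexagon grid for a simple square longitude/latitude grid, and it only counts messages; it is not the method referenced above. The cell size and function names are assumptions for the example.

```python
import math
from collections import Counter


def cell_id(lon, lat, cell_deg=10.0):
    """Return the (column, row) index of the square cell containing a point."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def density(points, cell_deg=10.0):
    """Count (lon, lat) points per grid cell."""
    return Counter(cell_id(lon, lat, cell_deg) for lon, lat in points)


points = [(1.0, 51.0), (2.5, 52.0), (-3.0, 51.0)]
counts = density(points)
for cell, count in sorted(counts.items()):
    print(cell, count)
```

A finer `cell_deg` gives a more detailed heatmap but more rows to store and serve, which is the same trade-off to weigh when replacing the demonstration grid.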
PG FeatureServ is used as an OGC-compliant API endpoint for accessing AIS data. There are several endpoints defined in the start-up script, but the two focused on here are used for accessing the heatmap and trajectory data above.
The heatmap data is stored in the ais.vessel_density_agg table. This table has a single row per grid cell, per day, per vessel class, per speed bin etc. There are multiple rows per day that would require additional aggregation to be understood. There is a simple view created during init that shows a simple aggregation over the dummy hex grid. A user could use this as a template to create a more complex aggregation over a more desirable grid.
Two trajectory API endpoints exist:
- postgisftw.vessel_trajectory: trajectory built from AIS history with gaps accounted for.
- postgisftw.traj: trajectory built from 30 min sampled AIS history without gaps accounted for.
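As a rough sketch of how a client might call one of these endpoints, the helper below builds a request URL in the `/functions/{id}/items.json` form that PG FeatureServ uses for published functions. The base URL, the `mmsi` argument and the helper name are hypothetical, and whether the function id carries the `postgisftw.` prefix can depend on the PG FeatureServ version, so check the service's own `/functions` listing before relying on it.

```python
from urllib.parse import urlencode


def function_items_url(base_url, function_id, **params):
    """Build a PG FeatureServ style URL for a function's items endpoint."""
    url = f"{base_url.rstrip('/')}/functions/{function_id}/items.json"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


# Hypothetical host and function argument, for illustration only.
url = function_items_url(
    "http://localhost:9000",
    "postgisftw.vessel_trajectory",
    mmsi=123456789,
    limit=100,
)
print(url)
```

Function arguments are passed as query parameters alongside standard ones such as `limit`, so the same helper works for either trajectory endpoint once the real argument names are known.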