API
The API used in this project is pg_featureserv: a PostGIS-based feature server written in Go. It is a lightweight, low-configuration RESTful web service that provides access to spatial data stored in PostGIS tables, as well as spatial processing capability based on PostGIS spatial functions.
This tool complies with OGC and OpenAPI standards and auto-creates some documentation for the API endpoint.
The available data is available from two broad categories: OGC feature collections and database functions.
These are collections of geospatial data that is configured as postgres views in the postgisftw schema. Collection can be filtered and queried using CQL filters appended to the URL. This allows users to search for specific vessels, in a specific bounding box, during a specific time window. Very handy!
While a user can use CQL to filter on any feature in the schema performance will be much better when filtering over indexed items.
What follows is a breakdown of the current collection endpoint and their SQL definitions:
The latest_position end point fetches the most recent position, per MMSI, for all vessels in the time and space window. If no filters are used then it will return thousands of vessels, which might not be particularly useful. Historical data, from before Feb 2021 for example, can be selected by using a “bucket < 2021-02-02” CQL filter.
The columns returned are:
- MMSI: Indexed Mobile Maritime Service ID associated with vessel’s AIS transceiver.
- Time_bucket: Indexed time bucket associated with the 30 minute AIS position report continuous aggregate.
- IMO: Latest reported IMO registration number
- Callsign: Latest reported radio callsign
- Flag: Flag of country associated with first 3 digits of MMSI
- Name: Latest reported vessel name
- Type and Cargo: Latest reported 2 digit AIS code representing vessel’s type and cargo.
- Type: Broad catagory of vessel type.
- Sub Type: Cargo sub type for vessel.
- Draught: Latest reported draught.
- Position: Indexed vessel position. Associated with AIS position reports.
- COG: Course over Ground associated with position report in bucket
- SOG: Speed over Ground associated with position report in bucket
- NavStatus: Navigation Status code associated with position report in bucket
- Description: Human readable navigation status.
CREATE OR REPLACE VIEW postgisftw.latest_positions
AS
SELECT DISTINCT ON (aa.mmsi) aa.mmsi,
aa.bucket AS time_bucket,
bb.imo,
bb.callsign,
dd.country AS flag,
bb.name,
bb.type_and_cargo,
ee.type,
ee.sub_type,
bb.draught,
aa."position"::geometry(Point,4326) AS geom,
aa.cog,
aa.sog,
aa.nav_status,
cc.description AS nav_description
FROM ais.hourly_pos_cagg aa
LEFT JOIN ais.vessel_details_cagg bb ON aa.mmsi = bb.mmsi
LEFT JOIN ais.nav_status cc ON aa.nav_status = cc.nav_status
LEFT JOIN ais.mid_to_country dd ON "left"(aa.mmsi, 3) = dd.mid::text
LEFT JOIN ais.ais_num_to_type ee ON bb.type_and_cargo = ee.ais_num::text
ORDER BY aa.mmsi, aa.bucket DESC;
This contains the latest voyage report information on vessels. CQL can be used to select a specific vessel’s MMSI, Name or to do a search using “like”. Time windows can be used to retrieve voyage report information for a vessel at a specific point in time.
While this API enpoint is exactly the same as the latest position endpoint, in the future these two will diverge. This endpoint is intended for retrieving ship specific data and will eventually include aggregates like port most often visited, days at sea per year, average speed when travelling etc.
The columns returned are:
- MMSI: Indexed Mobile Maritime Service ID associated with vessel’s AIS transceiver.
- Time_bucket: Indexed time bucket associated with the 30 minute AIS position report continuous aggregate.
- IMO: Latest reported IMO registration number
- Callsign: Latest reported radio callsign
- Flag: Flag of country associated with first 3 digits of MMSI
- Name: Latest reported vessel name
- Type and Cargo: Latest reported 2 digit AIS code representing vessel’s type and cargo.
- Type: Broad catagory of vessel type.
- Sub Type: Cargo sub type for vessel.
- Draught: Latest reported draught.
- Position: Indexed vessel position. Associated with AIS position reports and projected in 4326.
- COG: Course over Ground associated with position report in bucket
- SOG: Speed over Ground associated with position report in bucket
- NavStatus: Navigation Status code associated with position report in bucket
- Description: Human readable navigation status.
CREATE OR REPLACE VIEW postgisftw.ship
AS
SELECT DISTINCT ON (aa.mmsi) aa.mmsi,
bb.bucket AS time_bucket,
aa.imo,
aa.callsign,
dd.country AS flag,
aa.name,
aa.type_and_cargo,
ee.type,
ee.sub_type,
aa.draught,
bb."position"::geometry(Point,4326) AS geom,
bb.cog,
bb.sog,
bb.nav_status,
cc.description AS nav_description
FROM ais.vessel_details_cagg aa
LEFT JOIN ais.hourly_pos_cagg bb ON aa.mmsi = bb.mmsi
LEFT JOIN ais.nav_status cc ON bb.nav_status = cc.nav_status
LEFT JOIN ais.mid_to_country dd ON "left"(aa.mmsi, 3) = dd.mid::text
LEFT JOIN ais.ais_num_to_type ee ON aa.type_and_cargo = ee.ais_num::text
ORDER BY aa.mmsi, aa.bucket DESC;
This end point fetches a linestring that represents all the AIS position reports, sampled in 30 minute buckets, grouped by vessel ID and ordered in time for a specific day. These positions are gathered together each day so, ideally, each linestring should have 48 points in it.
This is a quick and simple representation of the historical path a vessel has followed.
The columns returned are:
- MMSI: Indexed Mobile Maritime Service ID associated with vessel’s AIS transceiver.
- Date: Indexed time bucket associated with the daily trajectory.
- Traj Start Time: Time of first point in linestring
- Traj End Time: Time of last point in linestring
- Geom Length: Length, in degrees, of the total linestring.
- Geom Sinuosity: Length of line over the distance between start and end points. Typically a measure of line “curviness”.
- Bucket Count: Number of points in the linestring.
- GEOM: PostGIS geometry in 4326 projection.
CREATE OR REPLACE VIEW postgisftw.traj
AS
SELECT mmsi,
bucket AS date,
traj_start_time,
traj_end_time,
geom_length,
geom_sinuosity,
bucket_count,
geom
FROM ais.daily_30min_trajectories_cagg aa
ORDER BY bucket DESC;
The position end point fetches all vessel positions in the time and space window. If no filters are used then it will return thousands of vessels with possible millions of rows per vessel, which might not be particularly useful. Historical data, from before Feb 2021 for example, can be selected by using a “bucket < 2021-02-02” CQL filter.
The columns returned are:
- MMSI: Indexed Mobile Maritime Service ID associated with vessel’s AIS transceiver.
- Time_bucket: Indexed time bucket associated with the 30 minute AIS position report continuous aggregate.
- IMO: Latest reported IMO registration number
- Callsign: Latest reported radio callsign
- Flag: Flag of country associated with first 3 digits of MMSI
- Name: Latest reported vessel name
- Type and Cargo: Latest reported 2 digit AIS code representing vessel’s type and cargo.
- Type: Broad catagory of vessel type.
- Sub Type: Cargo sub type for vessel.
- Draught: Latest reported draught.
- Position: Indexed vessel position. Associated with AIS position reports.
- COG: Course over Ground associated with position report in bucket
- SOG: Speed over Ground associated with position report in bucket
- NavStatus: Navigation Status code associated with position report in bucket
- Description: Human readable navigation status.
CREATE OR REPLACE VIEW postgisftw.positions
AS
SELECT aa.mmsi,
aa.bucket AS time_bucket,
bb.imo,
bb.callsign,
dd.country AS flag,
bb.name,
bb.type_and_cargo,
ee.type,
ee.sub_type,
bb.draught,
aa."position"::geometry(Point,4326) AS geom,
aa.cog,
aa.sog,
aa.nav_status,
cc.description AS nav_description
FROM ais.hourly_pos_cagg aa
LEFT JOIN ais.vessel_details_cagg bb ON aa.mmsi = bb.mmsi
LEFT JOIN ais.nav_status cc ON aa.nav_status = cc.nav_status
LEFT JOIN ais.mid_to_country dd ON "left"(aa.mmsi, 3) = dd.mid::text
LEFT JOIN ais.ais_num_to_type ee ON bb.type_and_cargo = ee.ais_num::text
ORDER BY aa.mmsi, aa.bucket DESC;
During database creation, if the “fetch geom” variable is not set to “False”, various geometry definitions are pulled from MarineRegions. A single materialised view of these is created in the database in order to assist with providing context to ocean regions. They are organised similar to a hierarchical municipal boundaries layer definition.
- GID: Geo ID number
- GeoName: Name of polygon
- Polygon Type: Type of polygon; ocean, country, eez etc
- Territory: State responsible for area
- ISO Ter: ISO three letter acronym for territory
- Level: Level indicating hierarchical level: 0 = Global/Country level, 1 = Sub region (EEZ, overlapping claim etc), 2 = sub-sub-region (12/24 nm area)
CREATE OR REPLACE VIEW postgisftw.maritime_boundaries
AS
SELECT gid,
geom,
geoname,
pol_type,
territory,
iso_ter,
level
FROM geo.maritime_boundaries;
This is an example API endpoint on how to create heatmap aggregates from AIS data via daily user defined functions and custom geometry. During database creation a global hex grid, with 1 degree hexagon edge size, is created. A user defined function is created where daily vessel trajectories are overlaid onto this grid to create a global 1 degree heatmap. This heatmap is a too low resolution to provide meaningful local information but is a good proof of concept on how to generate a heatmap and can be replicated with a low resolution, local heatmap.
The columns returned are:
- geom: Indexed Geometery of hexagon
- gid: Indexed GID of hexagon
- event_date: Indexed date of aggregation
- track_count_all: Count of ALL vessel tracks that intersect with hexagon.
- track_count_< class >: Count of vessel tracks, from a specific vessel class group, that intersect with hexagon.
CREATE OR REPLACE VIEW postgisftw.heatmap_track_count
AS
SELECT st_setsrid(grid.geom, 4326)::geometry(Polygon,4326) AS geom,
heatmap.gid,
heatmap.event_date,
heatmap.track_count_all,
heatmap.track_count_passenger,
heatmap.track_count_cargo,
heatmap.track_count_tanker,
heatmap.track_count_fishing,
heatmap.track_count_port
FROM ( SELECT agg.gid,
agg.event_date,
sum(agg.track_count) AS track_count_all,
sum(agg.track_count) FILTER (WHERE starts_with(agg.type_and_cargo, '6'::text)) AS track_count_passenger,
sum(agg.track_count) FILTER (WHERE starts_with(agg.type_and_cargo, '7'::text)) AS track_count_cargo,
sum(agg.track_count) FILTER (WHERE starts_with(agg.type_and_cargo, '8'::text)) AS track_count_tanker,
sum(agg.track_count) FILTER (WHERE agg.type_and_cargo = '30'::text) AS track_count_fishing,
sum(agg.track_count) FILTER (WHERE agg.type_and_cargo = ANY (ARRAY['31'::text, '33'::text, '54'::text, '32'::text, '53'::text, '52'::text, '50'::text])) AS track_count_port
FROM ais.vessel_density_agg agg
GROUP BY agg.gid, agg.event_date) heatmap
JOIN geo.world_hex_grid grid ON heatmap.gid = grid.gid::double precision;
Functions are a little different from Feature Collections; they do not have to have geometry columns and can return anything (as long as it can be represented by a table). This does make functions a little more risky to use, as users could end up doing requests that have a sever impact on the database’s performance.
This returns rows showing several views, the time window where it is expected that new data would be inserted, and the timestamp of the last data inserted. In a healthy database, with realtime data being continuous inserted, it would be expected that all these tables/views are up to date. Should this not be the case then further investigation must be done to determine whether this is a failure in TimescaleDB’s user defined functions or if the AIS data inserter pipeline has failed.
Columns:
- Data Source: Table or view name
- Good Interval: the maximum amount of time that should elapse between insertions
- Last Event Time: the last timestamp for data in table
- Last Bucket Time: the last bucket timestamp for data in continuous aggregates.
- Routing Key: The routing key for datasources. Should multiple data sources be used for data in the DB then each is checked here.
- Source_is_Okay: Boolean variable that checks whether the last timestamp is within the “good interval”.
CREATE OR REPLACE FUNCTION postgisftw.health_check(
)
RETURNS TABLE(data_source text, good_interval interval, last_event_time timestamp with time zone, last_insert_or_bucket_time timestamp with time zone, _routing_key text, source_is_okay boolean)
LANGUAGE 'plpgsql'
COST 100
STABLE PARALLEL SAFE
ROWS 1000
AS $BODY$
BEGIN
RETURN QUERY
SELECT
'ais.pos_reports' as data_source,
interval '15 minutes' as good_interval,
last(event_time, event_time) as Last_Event_Time,
last(server_time, event_time) as Last_Insert_or_Bucket_Time,
pos_reports.routing_key as _routing_key,
last(event_time, event_time) > now() - interval '15 minutes' as Source_is_Okay
FROM ais.pos_reports
GROUP BY routing_key
UNION all
SELECT
'ais.daily_30min_trajectories_cagg' as data_source,
interval '24 hours' as good_interval,
last(traj_end_time, bucket) as Last_Event_Time,
last(bucket, bucket) as Last_Insert_or_Bucket_Time,
Null as _routing_key,
last(traj_end_time, bucket) > now() - interval '24 hours' as Source_is_Okay
FROM ais.daily_30min_trajectories_cagg
UNION all
SELECT
'ais.daily_pos_cagg' as data_source,
interval '12 hours' as good_interval,
last(event_time, bucket) as Last_Event_Time,
last(bucket, bucket) as Last_Insert_or_Bucket_Time,
Null as _routing_key,
last(event_time, bucket) > now() - interval '12 hours' as Source_is_Okay
FROM ais.daily_pos_cagg
UNION all
SELECT
'ais.hourly_pos_cagg' as data_source,
interval '1 hours' as good_interval,
last(event_time, bucket) as Last_Event_Time,
last(bucket, bucket) as Last_Insert_or_Bucket_Time,
Null as _routing_key,
last(event_time, bucket) > now() - interval '1 hours' as Source_is_Okay
FROM ais.hourly_pos_cagg
UNION all
SELECT
'ais.vessel_details_cagg' as data_source,
interval '1 hours' as good_interval,
last(event_time, bucket) as Last_Event_Time,
last(bucket, bucket) as Last_Insert_or_Bucket_Time,
Null as _routing_key,
last(event_time, bucket) > now() - interval '1 hours' as Source_is_Okay
FROM ais.vessel_details_cagg
UNION all
(SELECT DISTINCT ON (last_time)
'ais.trajectories' as data_source,
interval '24 hours' as good_interval,
last_time as Last_Event_Time,
NULL as Last_Insert_or_Bucket_Time,
Null as _routing_key,
last_time > now() - interval '24 hours' as Source_is_Okay
FROM ais.trajectories
ORDER BY trajectories.last_time DESC
LIMIT 1)
UNION all
(SELECT DISTINCT ON (event_date)
'ais.vessel_density_agg' as data_source,
interval '48 hours' as good_interval,
event_date as Last_Event_Time,
event_date as Last_Insert_or_Bucket_Time,
Null as routing_key,
event_date > now() - interval '48 hours' as Source_is_Okay
FROM ais.vessel_density_agg
ORDER BY vessel_density_agg.event_date DESC
LIMIT 1);
END;
$BODY$;