Designing Spatio-temporal APIs
APIs have become the de facto method of sharing data between services. They’re clean, rigid and work again and again once set up correctly. The design of an API (the naming convention, the input and output structure, the variable names) often speaks more to developers than to machines. Being able to intuitively navigate API-space and find what you want, where you expected it to be, is the sign of a well-designed API.
There are several existing APIs for AIS data that can be eyeballed:
- Marine Traffic
- Vessel Finder
- AIS Hub
- VT Explorer
- Spire - this one is pretty good.
Spatio-temporal data sets come with some specific challenges because of their complex data structures. It’s often not enough to know what something is; you also need to know where and when it was what it was… Selecting items in a region, a time window, or both is required.
Basic queries about AIS and AIS-derived products come in a few different flavours:
- Show me all ships in an area at a specific time. Good for a basic map.
- Show me the location of a specific ship at a specific time. Good for tracking a vessel.
- Show me events relating to this vessel. When did it enter a location, change its activity, leave a port? Good for more detailed examination of a vessel.
- Show me vessel spatial activity; where does fishing occur, anchorages, ports etc. Good for spotting long term trends in spatial usage.
- Show me temporal activity; when do vessels enter port, fishing effort. Good for port analytics, trend analysis etc.
- Show me information on a specific location; what kinds of vessels go there, is it overfished, is there maybe long term damage from anchor scars?
- Show me interactions between locations; where do ships from a specific port go to fish, how are ports connected by trade? Good for creating pretty images for funding applications.
The points above suggest a few terms:
- Vessel static information: details on vessel names, classes (reported or derived), length, width, engine capacity and build year.
- Vessel location geometry: either a point or trajectory.
- Physical locations: ports, wind farms, rivers and coastal infrastructure. These generally do not change.
- Non-physical locations: EEZ, marine protected areas, aids-to-navigation. These are mostly static but may occasionally change, and generally have well-defined shapes.
- Derived locations: Anchorages, fishing zones, shipping lanes. These might change over time or with the seasons.
- Location based events: these are interactions between vessels and locations.
- Behaviour based events: changes in vessel behaviour, either reported or derived.
Okay, so that’s a lot to deal with when it comes to a bunch of lat/lons from RF messages. Types of requests:
- vessels from ID - specific info on a specific vessel you have the ID for, its history, and class
- vessels in a location - return info on all vessels that are within a spatio-temporal window
- vessel_events - return info on a vessel’s history, broken up by events
- locations - return physical and derived locations near/at a location
Keeping all of this in a coherent, standardised format would also help. Standards exist for a reason, so it shouldn’t be a surprise that there are Open Geospatial Consortium (OGC) standards that could be used for sharing this type of data. GeoJSON is flexible enough to contain almost anything, but the responses should still contain the information required to make the data useful to other applications.
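As a rough illustration of what a GeoJSON response could look like, here is a minimal Python sketch of one vessel position wrapped in a FeatureCollection. The property names (mmsi, timestamp, name, vessel_class) and the values are made up for illustration; they are not taken from any existing API.

```python
import json

def vessel_feature(mmsi, lon, lat, timestamp, **static_info):
    """Build a GeoJSON Feature for one vessel position (hypothetical schema)."""
    return {
        "type": "Feature",
        # GeoJSON positions are ordered longitude, then latitude.
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"mmsi": mmsi, "timestamp": timestamp, **static_info},
    }

def feature_collection(features):
    """Wrap a sequence of Features in a FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}

example = feature_collection([
    vessel_feature(
        123456789, 18.42, -33.91, "2022-06-01T12:00:00+00:00",
        name="EXAMPLE VESSEL", vessel_class="fishing",
    ),
])
print(json.dumps(example, indent=2))
```

Putting everything that isn’t geometry into "properties" keeps the response readable by any generic GeoJSON client, which is most of the point of using a standard.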
Let’s look at the data requests in a little more detail. Also I’m using DALL-E Mini images for kicks.
You want information on a specific vessel (or a list of vessels). The vessels are selected by knowing the vessel MMSI and using that in the request. The information that is returned is the status information on the vessel (class, name, callsign, size) as well as the describing
I’m tempted to just call this “vessel”, but that implies that only one vessel can be selected when it would be just as easy to allow multiple vessels to be selected. It could just as easily be called vessels_from_mmsi to show that you require an MMSI to retrieve vessel info, but that becomes a little clumsy.
Input to this query would be:
- List of MMSI IDs
- Last time of interest (single datetime ISO timestamp)
- First time of interest (single datetime ISO timestamp or interval)
Output data would be:
- List of static vessel information (name, callsign, class etc)
- Vessel history line (trajectory generated from the AIS points within the time window; it can be simplified to reduce the size of the response, but then it won’t carry timestamps)
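The inputs and outputs listed above could be sketched roughly as follows. This is an in-memory stand-in rather than a real service: STATIC_INFO, AIS_POINTS, the function name vessels_from_mmsi and the "times" property are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical in-memory stand-ins for the static and position tables.
STATIC_INFO = {
    111111111: {"name": "VESSEL A", "callsign": "AAA1", "vessel_class": "cargo"},
    222222222: {"name": "VESSEL B", "callsign": "BBB2", "vessel_class": "fishing"},
}
# Each row is (mmsi, ISO timestamp, lon, lat).
AIS_POINTS = [
    (111111111, "2022-06-01T10:00:00+00:00", 18.40, -33.90),
    (111111111, "2022-06-01T11:00:00+00:00", 18.45, -33.92),
    (111111111, "2022-06-02T09:00:00+00:00", 18.60, -34.00),
    (222222222, "2022-06-01T10:30:00+00:00", 18.10, -33.50),
]

def vessels_from_mmsi(mmsis, first_time, last_time):
    """Return a GeoJSON FeatureCollection with one trajectory per MMSI."""
    start = datetime.fromisoformat(first_time)
    end = datetime.fromisoformat(last_time)
    features = []
    for mmsi in mmsis:
        # Keep this vessel's points inside the time window, ordered by time.
        track = sorted(
            (datetime.fromisoformat(ts), lon, lat)
            for m, ts, lon, lat in AIS_POINTS
            if m == mmsi and start <= datetime.fromisoformat(ts) <= end
        )
        coords = [[lon, lat] for _, lon, lat in track]
        if len(coords) >= 2:
            geometry = {"type": "LineString", "coordinates": coords}
        elif len(coords) == 1:
            geometry = {"type": "Point", "coordinates": coords[0]}
        else:
            geometry = None  # GeoJSON allows a null geometry
        properties = {"mmsi": mmsi, **STATIC_INFO.get(mmsi, {})}
        # One timestamp per vertex, so the line does not lose its time axis.
        properties["times"] = [t.isoformat() for t, _, _ in track]
        features.append(
            {"type": "Feature", "geometry": geometry, "properties": properties}
        )
    return {"type": "FeatureCollection", "features": features}

result = vessels_from_mmsi(
    [111111111], "2022-06-01T00:00:00+00:00", "2022-06-01T23:59:59+00:00"
)
```

Carrying the timestamps as a parallel list only works while the line is unsimplified; once vertices are dropped the list no longer lines up, which is the trade-off mentioned above.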
You want information on many vessels; the most common case would be vessels in a specified area (port, map window, EEZ). A quick way to specify a region would be to add “WHERE ship_position WITHIN area_of_interest” to a SQL query. Luckily pg_featureserv handles that for us.
Input data:
- BBox describing region of interest
- Last time of interest
- First time of interest
Output data:
- List of static vessel information
- List of points for last known position in time window
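To make the bounding box plus time window selection concrete, here is a small in-memory Python sketch. It assumes “last known position” means the most recent point that falls inside both the box and the time window; the function names and the sample points are hypothetical, and a real deployment would push this filtering into the database.

```python
from datetime import datetime

def in_bbox(lon, lat, bbox):
    """bbox is (min_lon, min_lat, max_lon, max_lat); no antimeridian handling."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

def last_positions(points, bbox, first_time, last_time):
    """Return the latest in-window, in-box position for each MMSI."""
    start = datetime.fromisoformat(first_time)
    end = datetime.fromisoformat(last_time)
    latest = {}
    for mmsi, ts, lon, lat in points:
        t = datetime.fromisoformat(ts)
        if not (start <= t <= end) or not in_bbox(lon, lat, bbox):
            continue
        # Keep only the most recent qualifying point per vessel.
        if mmsi not in latest or t > latest[mmsi][0]:
            latest[mmsi] = (t, lon, lat)
    return {
        mmsi: {"timestamp": t.isoformat(), "coordinates": [lon, lat]}
        for mmsi, (t, lon, lat) in latest.items()
    }

# Made-up sample points: (mmsi, ISO timestamp, lon, lat).
points = [
    (111111111, "2022-06-01T10:00:00+00:00", 18.40, -33.90),
    (111111111, "2022-06-01T11:00:00+00:00", 18.45, -33.92),
    (222222222, "2022-06-01T10:30:00+00:00", 25.00, -34.50),  # outside the box
]
found = last_positions(
    points, (18.0, -34.5, 19.0, -33.5),
    "2022-06-01T00:00:00+00:00", "2022-06-01T23:59:59+00:00",
)
```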
You want to get information describing the locations within a specific region or with a specific ID. This is just a collection of geometries and can be published using the new CQL support in pg_featureserv.
As shown above there are multiple types of locations: well-specified ones and dynamically generated clusters. It would be simple to have all the different types published in one table with a type and publication date attached to each one. A method of reducing the number of results for dynamic items needs to be considered. It probably isn’t a good idea to return fishing clusters for each week of the last three years when a query includes a popular fishing area, but it might also not be ideal to return only the latest fishing zone cluster.
Standard GeoJSON OGC collection style. Less work for me…
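One possible middle ground for the dynamic locations, sketched in Python below, is to cap the number of published versions returned per location and let the caller choose the cap. This is only one option among several; the row layout, the LOCATIONS sample and the max_versions parameter are all hypothetical.

```python
from collections import defaultdict

# Made-up rows: (location_id, location_type, publication date as ISO string).
LOCATIONS = [
    ("port_1", "physical", "2019-01-01"),
    ("fishing_zone_7", "derived", "2022-05-16"),
    ("fishing_zone_7", "derived", "2022-05-23"),
    ("fishing_zone_7", "derived", "2022-05-30"),
]

def limit_versions(rows, max_versions=2):
    """Keep at most max_versions of the newest publications per location."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row)
    kept = []
    for versions in grouped.values():
        # ISO dates sort chronologically as plain strings.
        versions.sort(key=lambda r: r[2], reverse=True)
        kept.extend(versions[:max_versions])
    return kept

kept = limit_versions(LOCATIONS, max_versions=2)
```

Static locations only ever have one version, so they pass through untouched, while the weekly clusters are trimmed to the newest few.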
Adding another layer of human-readable relations to data products often adds value; it’s easier to give the time that a vessel entered a port than a list of lat/lon/times that contains the same information. Events are usually related to vessels (a specific vessel entering/exiting an area, changing navigation status etc.) or to locations (number of vessels entering port per day, list of vessels crossing an MPA, etc.). This seems like another good place to publish a collection and let the new CQL support do the work, but that would require a regularly run function that populates the collection.
The API endpoint should be called “events” and contain a vessel ID, the type of event that occurred and the time that it occurred. pg_featureserv does expect a geometry to be associated with each row, and this could be either the AIS point at the instant of the event or the larger marine geometry that the vessel is interacting with.
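A location-based event of this kind could be derived from a track roughly as in the Python sketch below. It uses a plain bounding box as the area and attaches the AIS point at the moment of change as the geometry; the function name, the event_type values ("enter", "exit") and the area_id property are assumptions, not an agreed schema.

```python
def area_events(mmsi, track, bbox, area_id):
    """Emit enter/exit events for one vessel against one bounding box.

    track is a list of (iso_timestamp, lon, lat) already sorted by time.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    events = []
    was_inside = None  # unknown before the first point, so no event for it
    for ts, lon, lat in track:
        inside = min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
        if was_inside is not None and inside != was_inside:
            events.append({
                "type": "Feature",
                # Geometry is the AIS point where the change was first seen.
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
                "properties": {
                    "mmsi": mmsi,
                    "event_type": "enter" if inside else "exit",
                    "area_id": area_id,
                    "timestamp": ts,
                },
            })
        was_inside = inside
    return events

# Made-up track that starts outside the box, goes in, and comes back out.
track = [
    ("2022-06-01T08:00:00+00:00", 17.50, -33.90),
    ("2022-06-01T09:00:00+00:00", 18.40, -33.90),
    ("2022-06-01T10:00:00+00:00", 18.45, -33.92),
    ("2022-06-01T11:00:00+00:00", 19.50, -33.95),
]
events = area_events(111111111, track, (18.0, -34.5, 19.0, -33.5), "port_1")
```

Here the event time is the first point seen on the other side of the boundary, so its accuracy depends on how often the vessel reports.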
This needs some more work. Events could be generated dynamically or regularly and stored. Events could be contextual or rigidly defined. I’ll have to address all this later.