Modeling Historical AIS Data For Vessel Path Prediction: A Comprehensive Treatment

2020·Arxiv

Abstract

Abstract

The prosperity of artificial intelligence has aroused intensive interests in intelligent/autonomous navigation, in which path prediction is a key functionality for decision supports, e.g. route planning, collision warning, and traffic regulation. For maritime intelligence, Automatic Identification System (AIS) plays an important role because it recently has been made compulsory for large international commercial vessels and is able to provide nearly real-time information of the vessel. Therefore AIS data based vessel path prediction is a promising way in future maritime intelligence. However, real-world AIS data collected online are just highly irregular trajectory segments (AIS message sequences) from different types of vessels and geographical regions, with possibly very low data quality. So even there are some works studying how to build a path prediction model using historical AIS data, but still, it is a very challenging problem. In this paper, we propose a comprehensive framework to model massive historical AIS trajectory segments for accurate vessel path prediction. Experimental comparisons with existing popular methods are made to validate the proposed approach and results show that our approach could outperform the baseline methods by a wide margin.

Index Terms—AIS Data, Path Prediction, Ensemble Learning, Intelligent Navigation

I. INTRODUCTION

WITH the rapid globalization and increasing demand formaritime transportation over the world, maritime safety has attracted huge attention in both marine and trading sectors during the past decade. Statistics show that more than 90% of the world’s trade is transported by sea nowadays1, but meanwhile, a significant part of maritime accidents (about 80%) have been caused by human errors [1, 2]. These accidents have caused severe people casualties, economic losses, and environmental crisis. Therefore, the importance of maritime navigation safety has been raised to an unprecedented level in the marine industry [3]. Maritime intelligent/autonomous systems can provide potential solutions in the future to improve safety during navigation and management [4, 5]. For this purpose, Automatic Iden-tification System (AIS) recently has been made compulsory for international/regional commercial vessels above a certain tonnage, including cargo, passage, tankers, etc, as well as parts of civil vessels, including fishing ships, lifeboats, and leisure

Enmei Tu is with School of Electronics, Information and Electrical Engineering, Shanghai Jiao Tong University, China

Guanghao Zhang, Shangbo Mao and Guang-Bin Huang are with School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore

Lily Rachmawati is with Computational Engineering Team, Advanced Technology Centre, Rolls-Royce Singapore Pte Ltd

vessels [6]. Comparing with traditional maritime equipment such as radar, sonar or closed-circuit television (CCTV), AIS has many advantages. AIS messages provide rich information of its host ship in a nearly real-time way. Furthermore, AIS messages can be transmitted to and received from a very long distance (20 nautical miles for onboard transceivers and hundreds of nautical miles for satellite receivers [7]). Moreover, AIS is less likely to be affected by external factors such as sea conditions and weather conditions. A large amount of AIS data are collected every day from different vessels at different locations and the gathered data contain a wealth of information useful for maritime safety, security, and efficiency promotion. However, as will be elaborated in the following section, AIS data usually exhibits many defects, such as low data quality, highly irregular between-message time intervals and poor data integrity [8, 9]. Plus trajectory diversity (different types of vessel, different geographical contexts and different maneuvering statues), therefore the problem of utilizing massive historical AIS data to build an efficient model for better vessel path prediction is a challenging task and needs more effort.

Recently, there have been several attempts trying to utilize big historical AIS data to improve vessel path prediction, but these works have some restrictions or limitations, e.g. requiring prior knowledge or good data presence. Here we mention a few representative works. For more details, we refer readers to [10]. Pallotta et al. [11] propose an interesting Ornstein-Uhlenbeck Processes based vessel position prediction algorithm, but it relies on externally extracted contextual information, i.e. which assumes that a vessel must follow the route provided by the Traffic Route Extraction for Anomaly Detection (TREAD) tool. Mazzarella et al. [12] develop an effective Bayesian vessel prediction approach based on a Particle Filter, but it requires prior knowledge from marine traffic analysis. Hexeberg et al. [13] propose a novel recursive approach which uses the nearby historical AIS messages around the prediction location to estimate the new position, but it is only for short-term prediction less than 15 minutes. Furthermore, the method may be sensitive to decision parameters and not suitable for data sparse region (i.e. open sea, at which even nearby messages could be very far). Another work presented in [14] may have similar constraints. There are also some earlier works, such as neural network approaches[15– 17], Kalman Filtering approaches [18–20], Minor Component Analysis [21], fuzzy logics [22] and kernel density estimation [23].

In this paper, considering the characteristics of historical AIS data and vessels’ trajectory segments, we propose a comprehensive approach to attack the problem of modeling a large volume of raw AIS data for path prediction. Particularly, our work consists of three parts. (1) Historical AIS data contain a considerable amount of motion outliers, which could severely mislead a prediction algorithm and has to be excluded from training data. We developed an efficient algorithm which is able to automatically detect motion outliers. (2) Raw AIS data lack conciseness and representativeness. We propose a sample representation method, which constructs concise and uniform features from raw AIS data to enhance the effectiveness of data representation and learning. (3) We propose a motion trend ensemble learning algorithm, which combines a group of predictive models corresponding to different motion trend so that the overall predictive capability is much greater. In addition, we also construct an AIS database containing over 100 GB AIS data collected from hundreds of different types of vessels and, based on the database, we conduct extensive experiments to show the performance improvements of our approach.

The remainder of the paper is organized as follows: Section II gives a brief introduction to the Automatic Identification System and AIS data challenges. Section III presents the proposed comprehensive approach to learn predictive models from historical AIS trajectory segments. Section IV reports experimental results for a comparison of the proposed method with existing popular approaches, followed by discussions and conclusions in Section V.

II. AUTOMATIC IDENTIFICATION SYSTEM AND AIS DATA CHARACTERISTICS

The Automatic Identification System (AIS) is a global, autonomous tracking system which consists of a ship on-board transceiver, ground/satellite station receivers and vessel traffic services terminals, as illustrated in Figure 1. The onboard Very High Frequencies (VHF) transceiver broadcasts automatically its host vessel’s kinematic information (including ship location, speed, course, heading, rate of turn, destination and estimated arrival time) as well as static information (including ship name, ship MMSI ID, message ID, ship type, ship size, current time) in an AIS message. The between-message interval is 2 to 10 seconds, depending on the vessel’s speed while underway, and 3 minutes while the vessel is at anchor. Meanwhile, each transceiver also receives AIS messages from other vessels within 20 nautical miles away. The messages can also be received by an onshore station, a typical coverage range about 60 nautical miles, or by a satellite, a coverage range more than 400 nautical miles [24, 25]. These collected AIS data can be transferred in long distance and stored in large volume, and have a big value for maritime data mining and intelligent navigation.

However, modeling massive historical AIS data for intelligent navigation (e.g path prediction, situation awareness) remains a challenging task. To be more specific, let us see Figure 2, which shows a small part of our AIS database of vessels near the west coast of the USA (more details in Section IV). In the figure, each dot represents an AIS message and each line or curve is an AIS trajectory segments of a vessel. From the figure, we can see that historical AIS trajectories at

Fig. 1: Illustration of Automatic Identification System (AIS).

least have the following issues. (1) The trajectories have a big variance in terms of length, shape, location, and orientation. (2) AIS trajectories frequently have some ”abnormal” vessel motion patterns (e.g. wavering, U-turn, self-intersection) which could mislead a learning algorithm and thus reduce the generalization ability. (3) The AIS trajectories have irregular message frequency, i.e. some trajectories have dense messages sequences but others may have very sparse message sequences, with possible missing data and erroneous values. Furthermore, each AIS trajectory contains both time-varying features (e.g. position, speed, whose values change with time) and static features (e.g. vessel size, vessel type, which has a single value for all the time). These issues, together with the big data volume, pose a big challenge to existing approaches [10].

Fig. 2: Examples of historical AIS trajectories.

III. A COMPREHENSIVE MODELING APPROACH

As mentioned earlier, our comprehensive modeling approach contains three core parts: the trajectory outliers detection, sample effective feature representation and motion trend ensemble learning.

Let us first introduce some notations which will be used in the following contents. Suppose an AIS trajectory set contains N trajectories , possibly from different types of vessels and different geographical locations. The AIS message in the trajectory is denoted by , where is the total number of messages in the trajectory. Note that is an AIS message which contains many data entries, such as latitude , longitude , speed over ground (SOG) , course over ground (COG) , date and time (For a complete list of all AIS message data entries, please see [8]). We will simply use for when the meaning is clear in the context.

A. Trajectory Outliers Detection

AIS trajectories may contain some non-generalizable segments (trajectory outliers) which may be due to some unusual vessel movements (such as avoiding barrier or collision, traffic regulation). If they are not excluded from training data, these outliers could severely mislead a learning algorithm, hence harmful to predictive modeling3.

The difficulty is that the circumstances of unusual vessel movements may be different from one to another and thus the appearance of the trajectory outliers could also be quite different. Therefore, it is hard to give a concrete detection rule which could be used to identify various unusual trajectory segments. However, fortunately, we found that trajectory outliers are, or can be decomposed into, two basic anomalous motion patterns: sharp turning and self crossing, illustrated in Figure 3 (we call them type I and type II, respectively). If we could accurately locate these anomalous motion patterns, we could detect the trajectory outliers easily.

Fig. 3: Examples of anomalous motion patterns, indicated by red dash circles.

For type I, we make use of the mean-value theorem to propose an improved Ramer-Douglas-Peucker (RDP) algorithm [26] in Algorithm 1. The input parameters and are the minimum distance, the slope tolerance between parallel lines and the threshold for sharp turning, respectively. The function RDP2 in line 1 finds control points recursively, as implemented in lines from 11 to 28. Then lines 5-7 calculate the angle formed by three neighboring control points (See Appendix B for function Angle). If the angle is too small, then the middle control point has a type I anomalous pattern.

In function RDP2, line 11 calculates the slope of the secant of the trajectory curve. Then function Distance in line 16 computes the distance between point and the line formed by points and . The Slope function in line 17 computes the average slopes of the front and back lines at the central point, as implemented Appendix B. In line 18, if the slope difference between and a is less than and the distance between and the secant is larger than , then distance and the index k are stored. Thereafter, line 23 finds the index of the control point which has minimal or maximal distance and divides the trajectory at these control points. Then each subtrajectory is inputted to the function RDP2 again to find next level control points recursively.

The introduction of slope in the improved RDP algorithm enables us to rapidly find both the nearest and farthest control points at one time. To show the difference, figure 4 gives an example. The original RDP algorithm requires 5 recursive function calls (), while the new algorithm requires only 3 recursive function calls (). Therefore, the new algorithm could reduce recursive function calls significantly, hence speed up the detection process.

Fig. 4: Comparison of the original RDP algorithm and the improved RDP algorithm. In (b) and (c), the numbers indicate the sequence of recursive function calls and the dash lines are the secants of the corresponding recursive curve segment.

For type II anomalous pattern, we need to find first the intersection location of two line segments in the trajectory and then determine if the intersection is anomalous (There are two intersections in the third figure of Figure 3, but the one in the green circle could be a normal change of sailing course.). We adopt the following algorithm which is widely used in computer graphics [27]. More specifically, for two line segments, say and , we solve the following equations to find and :

Two line segments have an intersection if and only if and . The intersection point can be computed from either equation. Then we examine the trajectory length cut by the intersection point (i.e. start from and end at ) and mark it as an anomalous pattern if the length is shorter than a predefined parameter, since a small loop usually corresponds to an abnormal movement, as shown in Figure 3 (b).

B. Trajectory Sample Representation

Most of the existing works learn on original AIS trajectory data directly [10], which lack conciseness and effectiveness because the raw data are redundant and contain both static and temporal components with an irregular time stamp. Here, we propose a novel way to transform the original data into uniform feature vectors, an effective and algorithm-friendly representation, by three steps described as follows.

The first step is to transform the irregular temporal component of a trajectory into uniform samples of equal length4. Here we follow the way described in [28] to make overlapped training samples from the trajectory, because the method could make sure that most of the raw data could be translated into the training data. After this, let S be the sample set and T be the corresponding training target set, in a one-by-one correspondence. Here each sample is a small fragment of the trajectory and its target, , is an AIS message, which is time (the prediction time) later after the last message in .

Comparing to the vastness of the earth surface, the trajectories are very sparse in most water regions except for near shore port areas. So it could easily result in an overfitting model if a machine learning algorithm is applied simply to the original GPS positions in the AIS data. To avoid this problem, our second step of obtaining effective training samples is to map the GPS positions to a local coordinate system, which is homeomorphic to the original GPS coordinate space but the trajectory data distributes much densely, so we could learn a model in a more efficient way in this new coordinate system. It is worth mentioning that to guarantee the homeomorphism, the mapping has to be invertible. Under this condition, the prediction results given by the model during testing stage could be mapped back to earth GPS positions. Otherwise, the mapping is meaningless and the model would be useless.

To this end, we first define the sample direction to be the smallest angle between the tangent line and horizontal direction, i.e. for a sample from the sample set

where vector is the latitude and longitude coordinate difference between the first two messages in s and u = (1, 0) is a horizontal unit vector. sign(v) is a sign function, which gives 1 if v(2) is positive and -1 otherwise. Thereafter we compute a mapping matrix as

where and and are the latitude and longitude, respectively, of the first AIS message in s. Note that since and cannot simultaneously be zero for any value of , the matrix is always invertible. Finally, we can apply to the latitude and longitude coordinates of each AIS message in s (together with its training target) to map them from the real geographical coordinates into a local coordinate system5. In the prediction stage, we need to perform a reverse operation to map the prediction value back to real geographical coordinates, i.e. latitude and longitude. Since is invertible, the only thing needs to do is multiplying to the model prediction output.

After transformed by , let the new sample be and its l AIS messages be . We make use of both kinematic and static information in the AIS messages to design a group of representation features, which are able to reflect the instantaneous motion status (i.e. position, velocity, acceleration) during the short period of the l messages, and meanwhile overcome the problem of irregular message intervals. The features are as follows.

• Latitude features. Latitude coordinates of the first, middle and last two AIS messages

The reason is that the beginning, middle and end points are the three most important instantaneous positions that record the vessels motion dynamics. Besides, the average of the latitudes in the first half sample and the second half sample, respectively, also reflect the vessel’s overall sailing trend during that period

• Longitude features. Similarly, the longitude coordinates of the first, middle and last two AIS messages

and the averages of the longitudes of the front half sample and second half sample

• Velocity features. At any moment, the motion direction of a vessel can be decomposed into two components: latitude and longitude . Since a vessel’s future position is closely related to the latest motion status, we compute two instantaneous velocity values and one mean velocity value along the two directions. So the latitude velocity features are

where is the difference and is the time interval between the AIS message and the AIS message. Correspondingly, the longitude velocity features are

• Features of speed over ground (SOG), the acceleration and sailing direction (course over ground, COG) of the vessel.

where is . The first term is the average of SOG and the second term is the average of SOG change rate (acceleration). The third term is the

average of COG. Here only the last k AIS messages are utilized to define the features.

• Other static information in AIS messages, including vessel size, type and drought, since they also deliver important information of the vessel motion characteristics. So far, for the sample , we could construct its a feature vector, denoted as x, to summarize the instantaneous motion dynamics of the vessel by concatenating these feature values together. Comparing with raw AIS data entries, as have been used in some existing work such as [15, 23, 29], the new features are more concise, representative and effective. Importantly, even if the same length and AIS messages frequency may vary considerably, the feature vector maintains a fixed length and representativeness.

C. Motion Trend Ensemble Learning

We have computed a feature vector to represent each sample. Putting all samples’ feature vectors together we make a training data set denoted by (X, Y), where X = is the training samples set and Y = is the corresponding training target set. Each is a pair of transformed latitude and longitude . Figure 5 displays the architecture of the proposed ensemble learning algorithm. As indicated by the circled numbers in the figure, the training samples are first clustered into k clusters, say , by an unsupervised learning algorithm (here we use kmeans for simplicity, but other unsupervised learning algorithms are equally applicable). The main purpose of this step is to reduce the diversity and imbalance (hence the modeling difficulty) of vessel movements for a modeling algorithm. The resultant clusters have two usages: the first one is to train a predictive model for each cluster learning stage (circled 2) and the second one is to perform model selection and fusion in testing stage (circled 4 & 5). For the first usage, given a pattern cluster , the corresponding training target set could be collected from Y. Then for each pair of , we could build a motion model by training a supervised learning algorithm, such as neural networks, least square support vector machine, etc. In our current implementation, we use the extreme learning machine (ELM) [30, 31] which is a special type of multilayer neural networks and is very efficient.

At the testing stage, the clusters are assigned to labels 1, 2, ..., k, respectively. All samples in the same cluster share a cluster label. Given a testing sample x, the (weighted) k-nearest-neighbor (kNN) algorithm could be adopted to find its closest K neighboring from the labeled clusters (circled 3). Each nearest-neighbor sample is fed to the model corresponding to the labeled cluster to get its prediction, hence the prediction error regarding to original training target in Y. The first models with lowest prediction error and the associated nearest-neighbors (a subset of the K nearest-neighbor samples) are selected (circled 4). The final prediction result of x can be obtained by the following fusion scheme

Fig. 5: An ensemble learning algorithm. The numbers in the circles indicate learning procedures.

where is the prediction result of x given by the selected model and is the average prediction error of selected model i on the nearest-neighbors.

The advantages of the proposed ensemble learning are as follows. First, modeling each pattern cluster separately can achieve better prediction accuracy, because the diversity and imbalance, hence the modeling difficulty, within each cluster are significantly reduced. Second, training an algorithm with a smaller cluster usually saves much computation cost, because the computational complexity of most machine learning algorithms is polynomial with respect to the number of training samples. Finally, it is more flexible to model each motion pattern cluster individually, because each model can be adjusted without any influence on other models’ performance. This is especially important when some new samples are added or part of samples are removed. In this case, only the related model needs to be updated. Furthermore, the fusion of multiple models with similar movements tends to be stabler than a single one.

IV. EXPERIMENTAL RESULTS

In this section, we conduct experiments to compare the proposed approach with several popular path prediction methods on a real-world dataset, which constitutes AIS messages from both ground stations and satellite stations.

A. AIS Database and Experimental Dataset

We implemented a web data crawler which continuously crawling real AIS data from the Internet to build an AIS database for vessel path prediction and maritime data mining. The database is about 100GB and is from the international data provider Marine Cadastre 6, containing AIS messages of both ground stations and satellite stations.

For our experimental purpose, we only use part of the database to construct an experimental dataset, because the whole database is too huge for a learning algorithm running on a single computer. From the database, 200 trajectories from 180 different vessels (including cargo, tank, towing & tug vessel, fishing boat, lifeboat, etc) are extracted to form a dataset7 and then the preprocessing algorithm in Appendix A is applied to remove some apparent issues. Details of the trajectory extraction process and related information of the database can be found in [28]. Fig. 6 shows the trajectory length distribution and message time interval distribution in the dataset. In each figure, we display the overall amount for all abscissa quantity that is out of the range (larger than the right-most abscissa value) in the last bin (brown color). From these figures, we can see the irregular trajectory lengths and between-message time intervals. Note that the intervals distribute in a rather broad range, i.e. from minutes to hours long. Most predictive approaches (for static vector or time series ) requires either fixed sample length or fixed step size [32, 33], so it will be very low efficiency even infeasible to training algorithms directly using real-world raw AIS data.

Fig. 6: Trajectory length distribution (left) and message time interval distribution (right).

B. Single Path Prediction Simulation

To demonstrate why machine learning based path prediction outperforms straightforward linear projection and velocity calculation, we first run an experiment for a single path prediction to compare the capability of linear projection method, SOGCOG estimate method and our method. Linear projection method performs a forward linear interpolation to predict next time position using the latest three AIS messages. SOGCOG estimate method computes the next time position by multiplying SOG with time gap and project it to current COG direction. While these two methods are most basic and straightforward, they are very illustrative and intuitive to show the difference and importance of machine learning based approaches. Figure 7 displays the simulation results for 30 minutes prediction (The smoother the predicted path is and the closer the predicted path to the true sailing trajectory, the better the result is).

Fig. 7: Single path simulation results. From left to right: linear , SOG-COG and the proposed method. The white line in each map is the true sailing trajectory and the blue line is the predicted result.

From these results, we can see that the linear prediction method is able to output a smooth predicted path, but the deviation is very big. SOG-COG estimate method could follow true sailing trajectory more closely, but its results perturb significantly because of the fluctuations in instantaneous speed and course values. The proposed method is able to give a much more accurate prediction. The reason is that linear and SOGCOG methods are based on a vessel’s instantaneous motion state (current or one time step back) and thus the predicted future path is also ”short-sight” and inaccurate. In the contrast, the proposed method learns a group of models using a large amount of historical AIS trajectories, which contain various complex motion patterns. As a result, the learned models remember all types of motion patterns and could predict future vessel motion trends earlier.

C. Quantitative Comparison Results

In this section, we conduct experiments on the AIS dataset to compare quantitatively the proposed approach with Gaussian Process Regression (GPR) , Least Square Support Vector Machine (LS-SVM), Multilayer Perceptron (MLP) and Gaussian Mixture Models (GMM). We choose these algorithms as the baseline because they are popular for path prediction and able to achieve state of the art results [15, 29, 34–36]8. The Original Extreme Learning Machine (ELM) algorithm (applying directly to raw) is also included as a baseline algorithm to show the effectiveness of our comprehensive approach.

We perform multi-step predictions, i.e. four different future time steps (15, 30, 45 and 60 minutes ahead) to compare the capability and flexibility of these path methods. Experiments are run in a 10-fold cross validation way on the dataset and the final prediction error is the average of the 10-fold results. The prediction error is measured by the geographical distance (computed by the GeoDistance function in Appendix A) between the ground truth and the predicted result. The parameters of the baseline algorithms are tuned by cross validation to produce the lowest errors. The experiments are implemented with MATLAB 2015a and conducted on a computer with 2.0 GHz dual-core Intel CPU and 16 GB RAM memory.

1) Full dataset experiments: We first conduct experiments

on the whole dataset to demonstrate the capability of each algorithm to learn complex motion patterns from a mixture of a variety of trajectories. The results are shown in Table I. The first column is the prediction time (unit: minute).

TABLE I: Prediction error on the whole dataset (time unit: minute; error unit: nautical mile).

For longer time prediction, the motion patterns will be more complex and not necessarily Gaussian distribution. So we can see that the performance of GMM decays dramatically for long time path prediction. GPR and MLP also have big prediction errors. LS-SVM and ELM produce relatively better results for both short-term 15 minutes prediction and long-term 60 minutes prediction, because of their strong regression ability to learn an arbitrary complex continuous function from a set of samples. In all cases, the proposed method is able to achieve much lower prediction errors. Benefiting from sample representation and ensemble learning, the proposed approach is able to utilize a group of different models to exploit the big amount of historical data to learn complex motion patterns in a concise feature space. So even the prediction is a much challenging task that includes different types of vessels’ trajectories, it still can learn and predict different vessels’ motion pattern more accurately.

Table II shows the standard deviation of the prediction results for each algorithm. The results show that the variance of the proposed approach is smaller than or comparable with other algorithms.

2) Side information experiments: One advantage of the

proposed comprehensive approach is that it could make use of some side information, i.e. the contextual information that is not directly included in the AIS data. This has rarely been utilized by previous methods [10]. Such information can be

TABLE II: Standard deviation of prediction error on the whole dataset (time unit: minute; error unit: nautical mile).

collected without costing much effort and is helpful to improve path prediction results. For example, the same type of vessels with a similar voyage are more likely to have similar motion characteristics and trajectory patterns. Here we investigate the influence of vessel type information and geographical location information.

TABLE III: Vessel types and trajectories

To include static information, the trajectories in the dataset are divided into groups according to their types and four types of vessels are used: towing vessels, tug vessels, cargo vessels, and tanker vessels. Their corresponding number of trajectories are listed in Table III. For contextual information, we divide the trajectories into different groups based on their geographical coordinates and finally choose 4 groups, each one containing trajectories in the same geographical region and generally following a similar traffic route. Then the group number is added directly to each sample feature vectors and the prediction algorithms are run in a similar setting as previous experiments. The experimental results are shown in Table IV and Table V.

TABLE IV: Mean prediction error of including static infor- mation (time unit: minute; error unit: nautical mile).

Comparing the results with Table I, we can see that while most of the baseline algorithms have some very slight improvements, the prediction error reduction of the proposed algorithm is much larger. While side information is available, the collected massive historical trajectories could be clustered into more homogeneous groups according to the side information and within each group, the trajectory variance will be significantly smaller. In this case, the ensemble learning difficulty and complexity is reduced and the prediction performance is enhanced. A potential application of including the

TABLE V: Mean prediction error of including contextual information (time unit: minute; error unit: nautical mile).

side information into path prediction is that different traffic management strategies could be applied to a different type of vessels or different traffic contexts in large modern port regions (e.g. Singapore port [37]).

3) Time cost comparison: We also conduct experiments to compare the time cost of all the algorithms. Each algorithm runs 5 times on a subset of the dataset (20 trajectories) and their average time costs are shown in Table VI. From this table,

TABLE VI: Time cost comparison (Unit: Minute)

we can see that the proposed approach has a much smaller time cost than others because it is only trained once on the training data and then makes predictions on testing data requiring no further model update. MLP and GPR cost much longer time than others because GPR needs to compute a large matrix inverse during training and MLP needs to be trained repeatedly (epoch by epoch) to get converged. It should be mentioned that the time costs here include both the training and testing phase on a group of trajectories. In real applications, an algorithm usually only needs to make a prediction for one sample at each time stamp, so the time cost of the proposed algorithm will be much less and thus meet the real-time requirements in real applications.

V. DISCUSSIONS AND CONCLUSIONS

Traditional path prediction methods generally based on single trajectory learning, such as Kalman filtering [18–20, 38], neural network [15, 29, 39], in which a learning model (e.g. Kalman Filtering) is built for a target vessel and is updated periodically when receiving new data. While for the purpose of vessel collision avoidance, there are some drawbacks of this type of approaches. (1) The trained model is associated with the target vessel and thus cannot be used to predict other vessels’ future position. This will be highly inconvenient if there are many vessels to be predicted at the same time, such as Singapore Port which is a world famous busy port and needs to regulate hundreds of vessels every day. (2) These methods usually need to be updated continuously in order to use the latest data to adjust the model to reduce prediction error. This dynamic update brings extra computational burden during deployment and may prolong the response time during navigation. (3) These on-line updating models are individual trajectory associated and thus cannot make use of a large volume of historical data to learn complex motion patterns (such as learning a common commuting route from a group of historical trajectories). (4) The online learning methods require a considerable amount of past trajectory data from the target vessel to update a model. This is practically inconvenient for busy ports, e.g. Singapore Port, because every day there are hundreds of new vessels arrival and their past sailing data are quite limited even absent.

Recently, as the availability of AIS data increases, constructing a path prediction model based on a large amount of historical AIS data becomes a popular topic in the marine intelligence area. However, previous approaches usually assume that the historical AIS data have been sorted into a good form and the path prediction task is restricted to data-rich (port) area. As described in Section II, in real-world applications both the data and the task are much more challenging than the assumptions.

In this paper, we proposed a comprehensive framework for massive real-world historical AIS data learning to predict various vessels path with different geographical context. Comparing with existing approaches, the proposed approach is one-shot learning (train only once and no successive update required) and is able to make multiple predictions for a different type of vessels at the same time. The experimental results have demonstrated its effectiveness for handling the diversity, divergence, and imbalance in trajectory data. However, the proposed approach also has some limitations. One limitation is that it requires a big amount of historical data to train the models and thus may limit its application to the scenarios where only very limited data are available. The big data volume may also require more storage space and computational power. Another limitation is that it has many user parameters which need to be manually tuned in order to obtain good prediction results. Our future work will focus on further improving the algorithm performance while maximally reducing the number of user-tunable parameters.

The algorithm attempt to resolve four types of data problems in raw AIS data: SOG error, COG jump, message absence and message duplication, since these problems are very common in raw AIS data and have significant negative impact upon the performance of general machine learning algorithms. In Algorithm 2, and are two threshold parameters for abnormal SOG value detection and is the possible speed limit for different types of vessels [40]. The Length function performs a trivial operation count the number of messages in the input trajectory. The Initialization function truncates invalid AIS messages at the front of the trajectory by repeatedly examining and removing messages with invalid data entries (i.e. position coordinates are out of range or speed is not reasonable), as implemented in line 24-30. Lin 5 computes time interval between to messages. The function GeoDistance in line 10 computes the actual sailing distance between message k and . We use haversine formula [41] to compute the earth surface distance, as implemented 31-36.

1 K = Length(f);

2 f, K = Initialization(f, K, s;

3 g, M = 1;

4 for k = 2, 3, ..., K do

5 ∆); 6 if ∆t = 0 then

7 continue

8 g, M = M + 1;

9 GeoDistancex, y), fx, y)); 10 if fsog) < 0 or fsog) > sthen 11 gsogsog); 12 else if sogsog)| > sthen 13 dsog); 14 if < dthen 15 gsogsog);

16 if t > tand d > fsog) then 17 m d/fsog); 18 if m then 19 gInterpolation, f, m); 20 M ;

21 gcog) = sincog));

22 Function Initialization(f, K, s

23 for k < K do

24 if for for fsog, sthen 25 delete message f;

26 continue;

27 K = Length(f);

28 return f, K

29 Function GeoDistance, y, y

34 return d

35 Function Interpolation, f, m) 36 for i = 1, 2, 3, ..., m do

37

38 return g

Line 12 uses previous valid one if current SOG is invalid. Line 13-16 mean that if the SOG changes too abruptly but the actual sailing distance supports previous speed, then current SOG value will be replace by previous valid one. Line 17-21 mean that if the time interval between the messages is too long but the vessel is not still, then AIS messages are missing and a linear interpolation operation will be adopted for all time series features to fill the path gap, as implemented in the function Interpolation in line 37-40, assuming that the vessel follows a constant speed and course during the gap period. Line 22 converts course over ground (COG) to sine function values to eliminate the big jumps of degree value around 360.

REFERENCES

[1] D. A. Wiegmann and S. A. Shappell, A human error approach to aviation accident analysis: The human factors analysis and classification system. Routledge, 2017.

[2] H.-T. Kim, S. Na, and W.-H. Ha, “A case study of marine accident investigation and analysis with focus on human error,” Journal of the Ergonomics Society of Korea, vol. 30, no. 1, pp. 137–150, 2011.

[3] S. Knapp and P. H. Franses, “Comprehensive review of the maritime safety regimes: Present status and recommendations for improvements,” Transport reviews, vol. 30, no. 2, pp. 241–270, 2010.

[4] T. Statheros, G. Howells, and K. M. Maier, “Autonomous ship collision avoidance navigation concepts, technologies and techniques,” Journal of navigation, vol. 61, no. 01, pp. 129–142, 2008.

[5] A. Alessandrini, F. Mazzarella, and M. Vespe, “Estimated time of arrival using historical vessel tracking data,” IEEE Transactions on Intelligent Transportation Systems, no. 99, pp. 1–9, 2018.

[6] I. Harre, “Ais adding new quality to vts systems,” The Journal of Navigation, vol. 53, no. 3, pp. 527–539, 2000.

[7] M. A. Cervera, A. Ginesi, and K. Eckstein, “Satellite- based vessel automatic identification system: A feasibility and performance analysis,” International Journal of Satellite Communications and Networking, vol. 29, no. 2, pp. 117–142, 2011.

[8] A. Harati-Mokhtari, A. Wall, P. Brooks, and J. Wang, “Automatic identification system (ais): data reliability

and human error implications,” Journal of navigation, vol. 60, no. 03, pp. 373–389, 2007.

[9] P. Last, C. Bahlke, M. Hering-Bertram, and L. Linsen, “Comprehensive analysis automatic identification system data in regard to vessel movement prediction,” Journal of Navigation, vol. 67, no. 05, pp. 791–809, 2014.

[10] E. Tu, G. Zhang, L. Rachmawati, E. Rajabally, and G.-B. Huang, “Exploiting ais data for intelligent maritime navigation: A comprehensive survey from data to methodology,” IEEE Transactions on Intelligent Transportation Systems, 2017.

[11] G. Pallotta, S. Horn, P. Braca, and K. Bryan, “Context- enhanced vessel prediction based on ornstein-uhlenbeck processes using historical ais traffic patterns: Real-world experimental results,” in Information Fusion, 17th International Conference on. IEEE, 2014, pp. 1–7.

[12] F. Mazzarella, V. F. Arguedas, and M. Vespe, “Knowledge-based vessel position prediction using historical ais data,” in 2015 Sensor Data Fusion: Trends, Solutions, Applications (SDF). IEEE, 2015, pp. 1–6.

[13] S. Hexeberg, A. L. Fl˚aten, E. F. Brekke et al., “Ais-based vessel trajectory prediction,” in 2017 20th International Conference on Information Fusion (Fusion). IEEE, 2017, pp. 1–8.

[14] B. R. Dalsnes, S. Hexeberg, A. L. Fl˚aten, B.-O. H. Eriksen, and E. F. Brekke, “The neighbor course distribution method with gaussian mixture models for ais-based vessel trajectory prediction,” in 2018 21st International Conference on Information Fusion (FUSION). IEEE, 2018, pp. 580–587.

[15] T. Xu, X. Liu, and X. Yang, “A novel approach for ship trajectory online prediction using bp neural network algorithm.” Advances in Information Sciences & Service Sciences, vol. 4, no. 11, 2012.

[16] D. Zissis, E. K. Xidias, and D. Lekkas, “Real-time vessel behavior prediction,” Evolving Systems, vol. 7, no. 1, pp. 29–40, 2016.

[17] M. Zandipour, B. J. Rhodes, N. Bomberger et al., “Probabilistic prediction of vessel motion at multiple spatial scales for maritime situation awareness,” in Information Fusion, 2008 11th International Conference on. IEEE, 2008, pp. 1–6.

[18] L. P. Perera, P. Oliveira, and C. Guedes Soares, “Mar- itime traffic monitoring based on vessel detection, tracking, state estimation, and trajectory prediction,” Intelligent Transportation Systems, IEEE Transactions on, vol. 13, no. 3, pp. 1188–1200, 2012.

[19] L. P. Perera and C. G. Soares, “Ocean vessel trajectory estimation and prediction based on extended kalman filter,” in Proc. 2nd Int. Conf. on Adaptive and Selfadaptive Systems and Applications, 2010, pp. 14–20.

[20] C. G. Pr´evost, A. Desbiens, and E. Gagnon, “Extended kalman filter for state estimation and trajectory prediction of a moving object detected by an unmanned aerial vehicle,” in American Control Conference, 2007. ACC’07. IEEE, 2007, pp. 1805–1810.

[21] B. Zhou, A. Shi, L. Wan, and B. Yang, “Minor com- ponent analysis-based landing forecast system for ship-

borne helicopter,” Journal of China Ordnance, vol. 2, p. 018, 2005.

[22] G. Pallotta, M. Vespe, and K. Bryan, “Vessel pat- tern knowledge discovery from ais data: A framework for anomaly detection and route prediction,” Entropy, vol. 15, no. 6, pp. 2218–2245, 2013.

[23] B. Ristic, B. L. Scala, M. Morelande, and N. Gor- don, “Statistical analysis of motion patterns in ais data: Anomaly detection and motion prediction,” in Information Fusion, 2008 11th International Conference on. IEEE, 2008, pp. 1–7.

[24] F. Clazzer, A. Munari, S. Plass, and B. Suhr, “On the impact of coverage range on ais message reception at flying platforms,” in Advanced Satellite Multimedia Systems Conference and the 13th Signal Processing for Space Communications Workshop (ASMS/SPSC), 2014 7th. IEEE, 2014, pp. 128–135.

[25] I. Recommendation, “2169,“improved satellite detection of ais,”,” ITU, Tech. Rep, Tech. Rep., 2009.

[26] D. H. Douglas and T. K. Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature,” Cartographica: The International Journal for Geographic Information and Geovisualization, vol. 10, no. 2, pp. 112–122, 1973.

[27] D. Kirk, Graphics Gems III (IBM Version): Ibm Version. Elsevier, 2012.

[28] S. Mao, E. Tu, G. Zhang, L. Rachmawati, E. Rajabally, and G.-B. Huang, “An automatic identification system (ais) database for maritime trajectory prediction and data mining,” in Proceedings of ELM-2016. Springer, 2018, pp. 241–257.

[29] D. Zissis, E. K. Xidias, and D. Lekkas, “Real-time vessel behavior prediction,” Evolving Systems, pp. 1–12, 2015.

[30] G.-B. Huang, L. Chen, and C.-K. Siew, “Universal ap- proximation using incremental constructive feedforward networks with random hidden nodes,” Neural Networks, IEEE Transactions on, vol. 17, no. 4, pp. 879–892, 2006.

[31] G.-B. Huang, H. Zhou, X. Ding, and R. Zhang, “Ex- treme learning machine for regression and multiclass classification,” Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, vol. 42, no. 2, pp. 513–529, 2012.

[32] C. M. Bishop, Pattern recognition and machine learning. springer, 2006.

[33] A. S. Weigend, Time series prediction: forecasting the future and understanding the past. Routledge, 2018.

[34] B. Zhou and A. Shi, “Lssvm and hybrid particle swarm optimization for ship motion prediction,” in Intelligent Control and Information Processing (ICICIP), 2010 International Conference on. IEEE, 2010, pp. 183–186.

[35] K. Kowalska and L. Peel, “Maritime anomaly detection using gaussian process active learning,” in Information Fusion (FUSION), 2012 15th International Conference on. IEEE, 2012, pp. 1164–1171.

[36] J. Wiest, M. H¨offken, U. Kreßel, and K. Dietmayer, “Probabilistic trajectory prediction with gaussian mixture models,” in Intelligent Vehicles Symposium (IV), 2012 IEEE. IEEE, 2012, pp. 141–146.

[37] J. Weng, Q. Meng, and X. Qu, “Vessel collision fre- quency estimation in the singapore strait,” The Journal of Navigation, vol. 65, no. 2, pp. 207–221, 2012.

[38] W. Ra and I. Whang, “Real-time long-term prediction of ship motion for fire control applications,” Electronics Letters, vol. 42, no. 18, pp. 1020–1022, 2006.

[39] A. Khan, C. Bil, and K. Marion, “Ship motion prediction for launch and recovery of air vehicles,” in Proceedings of OCEANS. IEEE, 2005, pp. 2795–2801.

[40] J. Faber, D. Nelissen, G. Hon, H. Wang, and M. Tsimplis, “Regulated slow steaming in maritime transport: An assessment of options, costs and benefits,” CE Delft. Delft, Netherlands, 2012.

[41] C. C. Robusto, “The cosine-haversine formula,” The American Mathematical Monthly, vol. 64, no. 1, pp. 38– 40, 1957.