
Extraction of Reference Seaway through Machine Learning of Ship Navigational Data and Trajectory

Joo-Sung Kim1 and Jung Sik Jeong2

1Kyeong-In VTS Center, Ministry of Public Safety and Security, Incheon, Korea, 2Department of Maritime Transportation System, Mokpo National Maritime University, Mokpo, Korea
Correspondence to: Jung Sik Jeong (jsjeong@mmu.ac.kr)
Received May 2, 2017; Revised June 23, 2017; Accepted June 23, 2017.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract

Vessel Traffic Services (VTS) operators monitor traffic closely and provide appropriate information to ensure safe and efficient navigation. In carrying out these tasks, VTS operators (VTSOs) need to analyze traffic patterns and navigational data in order to assess situations accurately during decision-making. Unfortunately, this data analysis is subject to problems such as poor timing, personal error, and improper judgment by the VTSO. Objective and proper data analysis is therefore necessary. However, it is virtually impossible to monitor all vessels, because a VTS area contains many vessels and complex traffic situations arise at the same time. In this study, we propose a machine learning approach for objective and accurate pattern recognition and data modeling. The Support Vector Regression algorithm was used for data learning and modeling. The optimal parameters were selected through v-fold cross validation and grid search. Machine learning was conducted on a virtual route and ship tracks that resemble a real navigational environment. As a result, we present a reference route and navigational patterns. We expect that the proposed modeling methods can provide useful information to VTSOs and ships’ masters for relevant tasks.

Keywords : Vessel Traffic Services, Machine learning, Sea traffic route, Decision making, Pattern recognition
1. Introduction

The navigational environment within harbour limits is complex and changeable owing to concentrated traffic, the development of port facilities, and other geographical and environmental factors. The information required to support safe and efficient marine traffic is therefore becoming increasingly diverse [1, 2]. In addition, as the e-navigation strategy is implemented, the role of shore-based information is expected to increase gradually, and Vessel Traffic Services (VTS) will play an important part [3, 4]. VTS has been established by the competent authorities where necessary in order to promote safe and efficient navigation and to prevent marine accidents, based on ‘IMO Resolution A.857(20) on Guidelines for Vessel Traffic Services’, Article 36 of the ‘Maritime Safety Act’, Chapter 4 of the ‘Act of Ship Arrival and Departure’ and the ‘rules of implementation for Vessel Traffic Services’ [5–8]. In recent years, the role of VTS has expanded to adjusting traffic situations in VTS areas by providing vessels with effective information based on collected and selected navigational data [9, 10].

Understanding traffic patterns is one of the most important parts of maritime situational awareness, because it allows traffic situations and ships’ positions to be predicted. In addition, VTS operators (VTSOs), who have to monitor wide areas, must be able to make accurate assessments within a limited time [11, 12]. It is therefore necessary to develop decision-making support tools that relieve the cognitive workload and stress of VTSOs.

In this study, we aimed to help VTSOs and ships’ masters by providing information on route usage through data analysis and pattern recognition of navigational data. We also intended to contribute to traffic monitoring and prediction. A machine learning algorithm was used to construct a reliable reference route from a small data set consisting of the most recent sailings. The Support Vector Regression (SVR) algorithm was used for pattern recognition and modeling. The optimal parameters were selected through v-fold cross validation and grid search. Machine learning was conducted on a virtual route and ship tracks that resemble a real navigational environment. As a result, we present a reference route model and a way of using it.

2. Modeling Methods

### 2.1 Data Processing

When building a learning model with SVR, the most important choices are the kernel function and the optimal parameters. In a related study, Hsu et al. [13] presented a practical procedure, based on LIBSVM, that uses cross validation and grid search. They gave the following six steps for data learning and SVM model design.

- (1) Transform data
- (2) Conduct simple scaling on the data
- (3) Consider the RBF kernel as a kernel function
- (4) Use cross-validation to find the best parameter
- (5) Use the best parameter to train the whole training set
- (6) Test

In this paper, we configured the modeling process with reference to the parameter selection procedure proposed in the related articles. The modeling process for extracting a reference route can be summarized in the following steps:

- (1) Data collection
- (2) Data classification
  - ➀ Classification of target area
  - ➁ Classification of target route
  - ➂ Construction of sub-data sets
- (3) Data learning and model extraction
  - ➀ Conversion of input data sets
  - ➁ Data scaling
  - ➂ Parameter range selection
  - ➃ Optimal parameter selection
  - ➄ Optimal model selection
- (4) Database construction

The simulation covered the steps from (2)-➂ to (3)-➄, because virtual data sets that were already classified were used. The classification steps were therefore omitted in the simulation.

Meanwhile, the same steps were applied for route extraction and pattern recognition of navigational data.

### 2.2 Construction of Input Dataset

The data set for each individual ship is structured as an (M × N) matrix, because this structure makes it easy to convert each component into a column vector for learning [14–17]. First, the program selects one of the vessels, follows its entire trajectory, and divides the total distance into k equal parts. Each sub-data set covers two adjacent parts and overlaps the next sub-data set by one part, so the number of sub-data sets is k − 1. Each overlapped part is therefore learned twice, and the models of the overlapping legs are combined into one final model in the final learning process.
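The division into overlapping sub-data sets can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes the track is an ordered list of points and splits it by point count rather than by sailed distance, and the function name `split_into_overlapping_legs` is hypothetical.

```python
def split_into_overlapping_legs(track, k):
    """Split an ordered track into k equal parts and return the k - 1
    overlapping sub-data sets, each covering two adjacent parts."""
    if k < 2:
        raise ValueError("k must be at least 2")
    n = len(track)
    if n < k:
        raise ValueError("track needs at least k points")
    # Boundaries of the k parts, by point count (an assumption; the
    # paper divides the track by distance).
    bounds = [(i * n) // k for i in range(k + 1)]
    # Sub-data set j spans parts j and j + 1, so neighbours share one part.
    return [track[bounds[j]:bounds[j + 2]] for j in range(k - 1)]


track = list(range(18))                       # stand-in for 18 ordered positions
legs = split_into_overlapping_legs(track, k=9)
print(len(legs))                              # 8
```

With k = 9 the sketch yields eight sub-data sets, and the second half of each one is the first half of the next.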

Because each trajectory is obtained from an individual vessel, the number of data points and the data range differ according to ship speed. Data scaling is required so that the trajectories can be learned effectively without being affected by these differences. When the data set $X^{t}=[x_{1}^{t}\ x_{2}^{t}\ x_{3}^{t}\ \cdots\ x_{i}^{t}]$ is scaled so that each input variable takes a value between 0 and 1,

$X_{\mathrm{scaling}}^{t}=\left[\frac{x_{i}^{t}-\min[X^{t}]}{\max[X^{t}]-\min[X^{t}]}\right].$
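As a small worked example of this scaling, the sketch below applies the min-max formula to one input variable with NumPy. The sample speeds are made-up values, and returning zeros for a constant input is a choice made here to avoid division by zero, not something stated in the paper.

```python
import numpy as np


def min_max_scale(x):
    """Scale a 1-D sequence to the range [0, 1] using (x - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant input: the formula is undefined, so return zeros (assumption).
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)


speeds = [8.0, 10.0, 12.0, 16.0]   # hypothetical ship speeds in knots
scaled = min_max_scale(speeds)
print(scaled)                       # values 0, 0.25, 0.5 and 1
```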

### 2.3 Support Vector Regression

The Support Vector Machine (SVM) is a classification technique that constructs the hyperplane maximizing the margin through supervised learning. Although the SVM was originally developed for classification, it has been extended to regression and probability density estimation problems [18–20].

The SVM can be applied to regression by introducing a loss function. Given the training data set $(x_{1},y_{1}),\ldots,(x_{N},y_{N})\in\mathbb{R}^{M}\times\mathbb{R}$, the ɛ-SVR proposed by Vapnik finds the function f(x) with the smallest ‖w‖ that deviates from every training target by at most the insensitivity parameter ɛ [21, 22]. Here N is the number of training data and $\mathbb{R}^{M}$ is the input space. The linear function is

$f(x)=\langle w,x\rangle+b \quad \text{with } w\in\mathbb{R}^{M},\ b\in\mathbb{R}.$

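As shown in Figure 1, the slack variables $\xi_{i}$ and $\xi_{i}^{*}$ are introduced to admit training data lying outside the ɛ-tube, and f(x) is obtained from

$\min \frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{N}(\xi_{i}+\xi_{i}^{*}) \quad \text{subject to} \quad y_{i}-\langle w,x_{i}\rangle-b\le\varepsilon+\xi_{i},\quad \langle w,x_{i}\rangle+b-y_{i}\le\varepsilon+\xi_{i}^{*},\quad \xi_{i},\xi_{i}^{*}\ge 0.$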
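The role of the slack variables can be illustrated with the ɛ-insensitive loss: a residual smaller than ɛ costs nothing, and a larger one costs only the part that exceeds ɛ, which is the value the slack takes at the optimum. The sketch below is illustrative only, with made-up targets and predictions.

```python
import numpy as np


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Return max(0, |y - f(x)| - epsilon) for each sample."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - epsilon)


# Residuals of 0.05, 0.3 and 0.5 with a tube half-width of 0.1.
loss = epsilon_insensitive_loss([1.0, 1.0, 1.0], [1.05, 1.3, 0.5], epsilon=0.1)
print(loss)   # approximately 0, 0.2 and 0.4
```

Only the second and third samples lie outside the tube, so only they contribute to the penalty term weighted by C.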

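Therefore, the output of the linear SVR can be expressed as

$f(x)=\sum_{i=1}^{N}(\alpha_{i}-\alpha_{i}^{*})\cdot\langle x_{i},x\rangle+b,$

where $\alpha_{i}$ and $\alpha_{i}^{*}$ are the non-negative Lagrange multipliers of the two inequality constraints.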

In SVR, the training data in the input space can be mapped into a high-dimensional feature space F using a non-linear mapping function ϕ ($\phi:\mathbb{R}^{M}\to F$) in order to approximate a non-linear function. The inner product of the mapped data is called the kernel function [21, 22]:

$K(x,x')=\langle\phi(x),\phi(x')\rangle.$

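Therefore, the output of the non-linear SVR can be expressed as

$f(x)=\sum_{i=1}^{N}(\alpha_{i}-\alpha_{i}^{*})\cdot K(x_{i},x)+b,$

where $\alpha_{i}$ and $\alpha_{i}^{*}$ are the non-negative Lagrange multipliers of the two inequality constraints.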

In this study, we used the Gaussian Radial Basis Function (RBF) as the kernel function to solve the non-linear problem. The Gaussian RBF performs well in a variety of fields and is expressed as

$K(x,y)=\exp\left(-\frac{\|x-y\|^{2}}{2\sigma^{2}}\right).$
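To make the kernel expansion concrete, the sketch below evaluates the Gaussian RBF and a kernel-weighted sum over support vectors with NumPy. The support vectors, the signed weights in `coeffs` (one combined dual coefficient per support vector) and the bias `b` are hypothetical numbers chosen for illustration, not values from a trained model.

```python
import numpy as np


def gaussian_rbf(x, y, sigma):
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 sigma^2))."""
    diff = np.atleast_1d(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def svr_predict(x, support_vectors, coeffs, b, sigma):
    """Evaluate f(x) as a kernel-weighted sum over the support vectors plus b."""
    total = sum(c * gaussian_rbf(sv, x, sigma) for sv, c in zip(support_vectors, coeffs))
    return total + b


support_vectors = [[0.0, 0.0], [1.0, 1.0]]   # hypothetical support vectors
coeffs = [2.0, -1.0]                         # hypothetical signed dual weights
value = svr_predict([0.0, 0.0], support_vectors, coeffs, b=0.5, sigma=1.0)
print(value)
```

The kernel equals 1 when the two points coincide and decays with their squared distance, so nearby support vectors dominate the prediction.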

### 2.4 Optimal Parameter Selection

The parameters ɛ and σ must be chosen during model selection. The parameter ɛ controls the complexity of the model by adjusting the number of support vectors. A small ɛ turns many of the training data into support vectors, so the output model becomes complex. A larger ɛ, on the other hand, reduces the number of support vectors and yields a simpler model [14–17]. The parameter σ is the standard deviation of the Gaussian RBF kernel. A larger σ favors models close to straight lines, and a smaller σ produces more curved lines.
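The effect of ɛ on the number of support vectors can be checked with a short experiment. The sketch below uses scikit-learn's `SVR` (which wraps LIBSVM) on synthetic noisy sine data; the data, the value of `C` and the two ɛ values are assumptions made for illustration. scikit-learn parameterizes the RBF kernel by `gamma`, so σ is converted with gamma = 1 / (2σ²).

```python
import numpy as np
from sklearn.svm import SVR


def sigma_to_gamma(sigma):
    """Convert the RBF standard deviation sigma to scikit-learn's gamma."""
    return 1.0 / (2.0 * sigma ** 2)


# Synthetic 1-D regression data (not the paper's data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 80).reshape(-1, 1)
y = np.sin(2.0 * np.pi * x).ravel() + rng.normal(0.0, 0.1, 80)

counts = {}
for eps in (0.01, 0.3):
    model = SVR(kernel="rbf", C=10.0, epsilon=eps, gamma=sigma_to_gamma(0.2))
    model.fit(x, y)
    counts[eps] = len(model.support_)   # indices of the support vectors

print(counts)   # the narrow tube keeps more support vectors than the wide one
```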

The v-fold cross validation and grid search were used to avoid over-fitting of the SVR models and to select the optimal parameters of the route model. First, the whole data set is divided into v sub-data sets. One of the v sub-data sets serves as the validation set, and the other v − 1 serve as training sets. This procedure is repeated v times so that each sub-data set is used for validation once. The average of the resulting validation errors is an objective indicator of performance over the whole data set. It can be expressed as

$\text{Mean validation error}=k^{-1}\sum_{j=1}^{k}\min(q_{j}).$

Here $q_{j}$ represents the validation error of the j-th fold and k is the number of folds (k = v). The number of models to be verified is determined by v and by the range of parameters searched.
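A grid search with v-fold cross validation can be sketched with scikit-learn's `GridSearchCV`. This is an illustrative setup under stated assumptions, not the authors' configuration: the synthetic data, v = 5, the fixed `C`, and the candidate values of ɛ and σ are all made up, and mean squared error is used as the validation error.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic 1-D regression data (not the paper's data).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 60).reshape(-1, 1)
y = np.sin(2.0 * np.pi * x).ravel() + rng.normal(0.0, 0.05, 60)

# Candidate parameters; sigma is converted to gamma = 1 / (2 sigma^2).
sigmas = [0.05, 0.1, 0.2, 0.5]
param_grid = {
    "epsilon": [0.01, 0.05, 0.1],
    "gamma": [1.0 / (2.0 * s ** 2) for s in sigmas],
}

# v = 5 folds: each fold is the validation set once, the rest train.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    SVR(kernel="rbf", C=10.0),
    param_grid,
    cv=cv,
    scoring="neg_mean_squared_error",
)
search.fit(x, y)

best_mean_error = -search.best_score_   # mean validation MSE of the best pair
print(search.best_params_, best_mean_error)
```

Each of the 12 parameter pairs is scored by its mean error over the five folds, and the best pair is then refitted on the whole data set and exposed as `search.best_estimator_`.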

3. Simulation

### 3.1 Outline of Simulation

Virtual tracks were composed in order to conduct machine learning on ship trajectories and navigational data. The passage is shaped like the letter ‘S’ and includes tight bends. The data sets comprise fifteen ships with various characteristics, each sailing the route with a different course and speed. The data sets were transformed into the learning format and divided into eight sub-data sets. The virtual passage has the following features:

- (1) tight bends,
- (2) narrow channels,
- (3) a designated route track,
- (4) and diverse changes in course and speed.

Among the data sets, one vessel shows anomalous behavior, breaking away from the designated route twice. The data sets and sub-data sets are presented in Figure 2.

4. Data Learning

Data learning on the divided sub-data sets was carried out by a separate learning engine for each. Here, we present the learning results for the latitude and longitude components as the learning steps progress; the results for these components are shown in Figure 3.

Meanwhile, the sub-models are extracted after data learning, and the results are shown in Figure 4.

After each sub-model with overlapped sections has been extracted, the models are arranged and scaled to similar coordinates. A final model is then obtained by applying the learning process to the whole set of sub-model data. The extraction steps for the sub-models and the final model follow the same process, which is repeated until the best model is found. The final route model for the virtual route is shown in Figure 5.

Navigation data need to be compared at a specific position of each ship. Therefore, the data are assigned to the nearest approximate point of the model. The assigned data are the differences among the data taken at a specific location of each vessel. Figures 6–10 compare the data of the target vessel with those of the other vessels.

The comparison of each vessel’s deviation shows that anomalous behavior can be detected when a ship changes course or speed suddenly. Users can easily recognize a ship’s deviation, and this can serve as reference information for predicting such behavior in advance. When the position of the target vessel deviated from the specified path, the differences in the ship’s speed and course began to increase. In other words, when an abnormality in the ship’s speed or course occurs, the target vessel leaves the designated route and a dangerous situation arises. Therefore, if abnormalities in a ship’s speed or course are monitored together with its position, a deviation from the route can be detected in advance.
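A simple way to monitor deviation from a reference route is sketched below. It is an illustration under stated assumptions rather than the method used in the simulation: positions are treated as planar coordinates, deviation is measured to the nearest sampled point of the reference route, and the threshold of 1.0 is a hypothetical value.

```python
import numpy as np


def nearest_point_deviation(positions, reference):
    """Distance from each position to the nearest sampled reference point."""
    positions = np.asarray(positions, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Pairwise distances, shape (number of positions, number of reference points).
    dists = np.linalg.norm(positions[:, None, :] - reference[None, :, :], axis=2)
    return dists.min(axis=1)


def flag_anomalies(deviation, threshold):
    """Mark positions whose deviation exceeds the threshold."""
    return np.asarray(deviation) > threshold


# Straight reference route along y = 0, sampled every 0.1 units (made up).
reference = np.column_stack([np.linspace(0.0, 10.0, 101), np.zeros(101)])
track = np.array([[1.0, 0.1], [3.0, 0.2], [5.0, 1.5], [7.0, 0.1]])

deviation = nearest_point_deviation(track, reference)
flags = flag_anomalies(deviation, threshold=1.0)
print(deviation, flags)   # only the third position is flagged
```

The same thresholding could be applied to course and speed differences from the reference data, so that an alert is raised before the positional deviation grows.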

5. Conclusion

Understanding the traffic route of a ship is very important for preventing marine accidents. Predicting ship traffic and analyzing navigational data to identify a ship’s abnormal behavior is an indispensable task in VTS. In this study, we presented a reference route model based on SVR. The SVR algorithm was used for pattern recognition and modeling. The optimal parameters were selected through v-fold cross validation and grid search. Machine learning was conducted on a virtual route and ship tracks that resemble a real navigational environment. The proposed method can compose a dynamic route model from a small amount of recent navigational data. It was also possible to extract reference data not only for the route of the ship but also for the speed and course, which can be used to judge whether a ship is behaving abnormally. The proposed model can be used to prevent accidents and to provide reference routes by supplying prediction and decision-support information to VTSOs and mariners. In further study, we need to develop an appropriate method of data division for cases in which the legs become long and complex. In addition, we will conduct simulations with users based on real navigational data and develop various applications related to data learning.

Acknowledgements

This work was conducted as the Research for Development Strategy to Future Maritime Traffic Environments and Applications to Maritime Safety Technology which was supported by KRISO from November 1, 2016 to February 28, 2017 (Project No. 2016-0096).

Conflict of Interest

Figures
Fig. 1. Loss function.

Fig. 2. Simulation dataset and data division.

Fig. 3. Data learning on sub-dataset.

Fig. 4. Extracted model of sub-dataset.

Fig. 5. Extracted route model of whole data-set.

Fig. 6. Deviation comparison.

Fig. 7. Comparison of course differences.

Fig. 8. Comparison of speed changes.

Fig. 9. Relationship between deviation and course differences.

Fig. 10. Relationship between deviation and speed changes.

References
1. Kim, JS, Jeong, JS, and Park, GK (2013). Prediction table for marine traffic for vessel traffic service based on cognitive work analysis. International Journal of Fuzzy Logic and Intelligent Systems. 13, 315-323.
2. Kim, JS, Jeong, JS, and Park, GK (2014). Utilization of planned routes and dead reckoning positions to improve situation awareness at sea. International Journal of Fuzzy Logic and Intelligent Systems. 14, 288-294.
3. Hong, T (2014). Development of a system for transmitting a navigator’s intention for safe navigation. International Journal of Fuzzy Logic and Intelligent Systems. 14, 130-135.
4. International Maritime Organization (1997). Guidelines for Vessel Traffic Services (Resolution A.857(20)). Available http://www.maritime-vts.co.uk/A857.pdf
5. Act of Ship Arrival and Departure (2015). Available https://www.moleg.go.kr/english/
6. Maritime Safety Law (2015). Available https://www.moleg.go.kr/english/
7. Vessel Traffic Service Operational Manuals (2015). Available http://www.mpss.go.kr/en/
8. Kim, DY, Park, GK, and Kim, HY (2014). A study on the ship information fusion with AIS and ARPA radar using by blackboard system. Journal of Korean Institute of Intelligent Systems. 24, 16-21.
9. Kim, EK, Jeong, JS, Park, GK, and Im, NK (2012). Characteristics of ship movements in a fairway. International Journal of Fuzzy Logic and Intelligent Systems. 12, 285-289.
10. Kim, KI, Jeong, JS, and Park, GK (2013). Assessment of external force acting on ship using big data in maritime traffic. Journal of Korean Institute of Intelligent Systems. 23, 379-384.
11. Jeong, JS, Kim, KI, and Park, GK (2012). A quantitative collision probability analysis in port waterway. Journal of Korean Institute of Intelligent Systems. 22, 373-378.
12. Hsu, CW, Chang, CC, and Lin, CJ (2003). A Practical Guide to Support Vector Classification. Available http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
13. Kim, JS, and Jeong, JS (2015). Pattern recognition of ship navigational data using support vector machine. International Journal of Fuzzy Logic and Intelligent Systems. 15, 268-276.
14. Kim, JS, and Jeong, JS (2015). Vessel trajectory and route detection in vessel traffic service areas using machine learning theories. Proceedings of 2015 International Conference on Advanced Intelligent Maritime Safety and Technology, Daejeon, Korea, pp. 142-145.
15. Kim, JS, and Jeong, JS (2016). Vessel trajectory analysis for setting reference route model within harbor limits. Proceedings of Korea Institute of Intelligent Systems Spring Conference 2016. 26, 163-164.
16. Kim, JS (2016). A design of reference route model based on SVR through reconstruction of ship trajectories in VTS area. Ph.D. dissertation, Department of Maritime Transportation System, Graduate School of Mokpo Maritime University, Mokpo, Korea.
17. Vapnik, VN (1995). The Nature of Statistical Learning Theory. New York, NY: Springer
18. Vapnik, VN (1998). Statistical Learning Theory. New York, NY: Wiley
19. Jo, TH (2008). Modified version of SVM for text categorization. International Journal of Fuzzy Logic and Intelligent Systems. 8, 52-60.
20. Gunn, SR (1998). Support Vector Machines for Classification and Regression: Technical Report. Available http://users.ecs.soton.ac.uk/srg/publications/pdf/SVM.pdf
21. Han, HY (2014). Introduction to Pattern Recognition. Seoul, Korea: Hanbit Academy, Inc.
Biographies

Joo-Sung Kim is a vessel traffic services operator at Kyeong-in VTS Center in Korea. His research interests include maritime traffic engineering, ship collision avoidance, maritime information and communication network. He received his B.S. degree in Nautical Science from Mokpo National Maritime University in Korea in 2004, his M.S. degree in International Maritime Transportation Sciences from Mokpo National Maritime University in Korea in 2014 and his Ph.D. degree in International Maritime Transportation Sciences from Mokpo National Maritime University in Korea in 2016. His research areas include intelligent system, fuzzy system, human factors engineering, work analysis, vessel traffic services, maritime transportation system, etc.

E-mail: jskim81@korea.kr

Jung Sik Jeong is a professor in the Department of International Maritime Transportation Sciences at Mokpo National Maritime University in Korea. His research interests include intelligent systems, fuzzy systems, intelligent navigation control systems and maritime information. He received his B.S. degree in Nautical Science from Korea Maritime University in 1987, his M.S. degree in Communication and Electronic Engineering from Korea Maritime University in 1993, and his Ph.D. degree in Electrical and Electronic Engineering from Tokyo Institute of Technology in 2001. He worked at Korea Telecom in 1996. His research areas include maritime traffic engineering, ship collision avoidance, maritime information and communication networks, etc.

E-mail: jsjeong@mmu.ac.kr

July 2017, 17 (2)