Prediction of Marine Traffic Density Using Different Time Series Model From AIS data of Port Klang and Straits of Malacca

e-mail: masnawimustaffa@gmail.com

In ocean engineering, marine traffic refers to the study of the pattern of ship density within particular boundaries over certain periods. Port Klang and the Straits of Malacca are known to carry some of the heaviest traffic in Malaysia and the world. The study of traffic within this area is important because it enables ships to avoid traffic congestion that might occur. This study therefore aims to predict, or forecast, the density of ships using this waterway by applying quantitative methods, namely time-series models and an associative model, to Automatic Identification System (AIS) data. The time-series models are the moving average, weighted moving average, and exponential smoothing; the associative model uses multiple regression. The results show that exponential smoothing with alpha 0.8 gives the lowest MAPE, 20.701%, making it the best of the methods considered for forecasting future traffic density.


Prediction of Marine Traffic Density
AIS technology is the main source of information for studying this field. Briefly, the Automatic Identification System (AIS) enables a vessel to obtain information about the vessels it encounters, such as their position (Zaman et al., 2015), course, speed, and other parameters, automatically by very high frequency (VHF) radio transmission (Kundakci et al., 2018). Using AIS (Wu et al., 2017; Kang et al., 2018) can help remedy traffic congestion on open water, as it provides the data required for traffic analysis (Xiao et al., 2015; Zhang et al., 2019; Fiorini et al., 2016).
In studying the pattern of the data, the affecting and responding variables must be considered, along with the season or weather that may affect traffic congestion. Previous studies show that weather during storms or disasters affects traffic and the manoeuvring behaviour of vessels (Gao et al., 2017). Thus, a seasonal pattern technique has been used in the analysis.
Several methods have been used in this study, and the effectiveness of each method is measured by its percentage error. Graphs are presented to compare the pattern of the forecast with the actual traffic. This is expected to improve the effectiveness of predicting the overall density in the study area. Furthermore, the prediction or forecast may be useful to vessels in planning a route that avoids traffic congestion.
The scope of this research is forecasting using time-series and associative models. These methods are based on a set of data from July 2016 to December 2017. The data were collected in our laboratory using our own AIS receiver. They do not represent the real data of the whole of Port Klang and the Straits of Malacca. We are not using any data from the related authorities in Malaysia. Rather, we are using our own data to study the pattern and measure the effectiveness of the methods.

Straits of Malacca
At the centre of one of the busiest shipping lanes connecting East and West, the Straits of Malacca play a vital role in the shipping industry (Cheng et al., 2019). The Straits of Malacca lie between the east coast of Indonesia's Sumatra Island and the west coast of Peninsular Malaysia, and extend to the Straits of Singapore at their southeast end. Marine navigational hazards emerge in these straits because of the increasing amount of traffic and the geographically narrow waterway (Mustaffa et al., 2019).

QUANTITATIVE FORECASTING
Forecasting is the science of predicting future events. It is applied in economic forecasts, technological forecasts, and demand forecasts, which are projections of a company's sales for each period in the planning horizon (Gao et al., 2017; Mustaffa et al., 2019). The main aim of time series modeling is to carefully collect and rigorously study the past observations of a time series in order to develop an appropriate model that describes the inherent structure of the series. This model is then used to generate future values for the series. Time series forecasting can thus be termed the act of predicting the future by understanding the past (Ratnadip et al., 2013).
Forecasting follows seven basic steps:
1. Determine the use of the forecast
2. Select the items to be forecast
3. Determine the time needed to make a forecast
4. Select the forecasting models
5. Gather the data needed to make the forecast
6. Make the forecast
7. Validate and implement the results
Forecasting approaches are generally either quantitative or qualitative. Quantitative forecasting is categorized into time-series models and associative models.

Time-series Model
This type of model predicts on the assumption that the future is a function of the past. It analyses past data to make a forecast. In this study, the approaches used are the Moving Average, the Weighted Moving Average, and Exponential Smoothing.

Associative Model
This model is similar to linear regression. It incorporates the variables or factors that might influence the quantity being forecast. The variable that affects the forecast in this study is the seasonal pattern. The multiple-regression approach is used in this study for this category.

AIS Data
All collected data were encrypted, analyzed daily, and stored in the database. This dataset contains all traffic in the area of study.
The scope of this study is to obtain the general equation, or pattern, of the data distribution of traffic density. The dataset used contains statistical data from July 2016 until December 2017.
The data comprise a 499-day dataset containing the total number of ships. We categorized the data into the four seasonal quarters of the year: January to March, April to June, July to September, and October to December, denoted 1 to 4 respectively.
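The quarter labelling described above can be sketched in a few lines of Python. This is only an illustration of the 1 to 4 coding; the function name `season_of_month` is hypothetical and is not taken from the study.

```python
from datetime import date


def season_of_month(month: int) -> int:
    """Map a calendar month (1-12) to the seasonal quarter label 1-4.

    1 = January-March, 2 = April-June, 3 = July-September,
    4 = October-December, following the coding described in the text.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return (month - 1) // 3 + 1


# Example: label the first and last days of the stated data period.
first_day = date(2016, 7, 1)
last_day = date(2017, 12, 31)
print(season_of_month(first_day.month), season_of_month(last_day.month))
```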

Moving Average (MA)
A moving average forecast uses a number of historical actual data values to generate a forecast. A 4-month moving average, for example, is found by summing the demand during the past 4 months and dividing by 4. The moving average is calculated as:

$$F_t = \frac{1}{n}\sum_{i=1}^{n} A_{t-i} \qquad (1)$$

where $F_t$ is the forecast for period $t$, $A_{t-i}$ is the actual value $i$ periods earlier, and $n$ is the number of periods in the moving average.
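As a minimal sketch of the moving average forecast, the Python function below averages the preceding n actual values for each period. The function name and the ship counts are made up for illustration and are not the study's data.

```python
def moving_average_forecast(actual, n=4):
    """Return moving-average forecasts for periods n, n+1, ..., len(actual).

    Element k of the result is the forecast for period n + k (0-based),
    computed as the mean of the n actual values that precede it.
    """
    if n <= 0 or len(actual) < n:
        raise ValueError("need at least n observations and n > 0")
    return [sum(actual[t - n:t]) / n for t in range(n, len(actual) + 1)]


# Hypothetical daily ship counts (illustrative only).
ships = [100, 120, 110, 130, 140]
print(moving_average_forecast(ships, n=4))  # forecasts for day 5 and day 6
```

The last element is a forecast for the first period beyond the observed series, which is how the iteration can be carried forward day by day.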

Weighted Moving Average (WMA)
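Weights can be used to place more emphasis on recent values, which makes the forecast more responsive to the trend of the pattern. The weighted moving average is calculated as:

$$F_t = \frac{\sum_{i=1}^{n} w_i A_{t-i}}{\sum_{i=1}^{n} w_i} \qquad (2)$$

where $w_i$ is the weight applied to the actual number of ships $i$ periods earlier. The weights are assigned in ascending order over the $n$ periods, so that the most recent period carries the largest weight.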
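The weighted moving average can be sketched in Python as follows. The weights (1, 2, 3, 4) are an assumption chosen to illustrate ascending weights over four periods; the study's actual weights are not stated in this text, and the ship counts are invented.

```python
def weighted_moving_average_forecast(actual, weights=(1, 2, 3, 4)):
    """Return weighted-moving-average forecasts.

    `weights` is ordered oldest to newest, so the last weight applies to
    the most recent actual value. The default (1, 2, 3, 4) is an assumed
    example, not a value taken from the study.
    """
    n = len(weights)
    total = sum(weights)
    if n == 0 or total == 0 or len(actual) < n:
        raise ValueError("need non-empty weights and at least n observations")
    forecasts = []
    for t in range(n, len(actual) + 1):
        window = actual[t - n:t]  # the n values before period t, oldest first
        forecasts.append(sum(w * a for w, a in zip(weights, window)) / total)
    return forecasts


# Hypothetical daily ship counts (illustrative only).
ships = [100, 120, 110, 130, 140]
print(weighted_moving_average_forecast(ships))
```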

Exponential Smoothing
Exponential smoothing is a weighted-moving-average technique that involves very little record keeping of past data. The weight, or smoothing constant, α has a value greater than or equal to 0 and less than or equal to 1. This study shows the results for different values of α and identifies the best value of α for this forecast. Exponential smoothing is calculated as:

$$F_t = F_{t-1} + \alpha \left( A_{t-1} - F_{t-1} \right) \qquad (3)$$

where:
$F_t$ = new forecast
$F_{t-1}$ = previous period's forecast
$\alpha$ = smoothing constant ($0 \le \alpha \le 1$)
$A_{t-1}$ = previous period's actual number of ships
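A minimal Python sketch of the exponential smoothing recursion is given below. The function name, the starting forecast, and the ship counts are illustrative assumptions rather than values from the study.

```python
def exponential_smoothing_forecast(actual, alpha, initial_forecast):
    """Iterate F_t = F_{t-1} + alpha * (A_{t-1} - F_{t-1}).

    Returns len(actual) + 1 forecasts: element t is the forecast for
    period t (0-based), and the final element is the forecast for the
    first period after the observed series.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    forecasts = [initial_forecast]
    for a in actual:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (a - previous))
    return forecasts


# Hypothetical daily ship counts and starting forecast (illustrative only).
ships = [100, 120, 110]
print(exponential_smoothing_forecast(ships, alpha=0.5, initial_forecast=100))
```

With α close to 1 the forecast follows the most recent actual value closely, while α close to 0 keeps it near the previous forecast.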

Multiple Regressions
Unlike time-series forecasting, associative forecasting models take into consideration several variables that are related to the quantity being predicted, which can make this method more reliable than the others. Multiple regression analysis is a practical extension of the simple regression model, a straight-line mathematical model that describes the functional relationship between independent and dependent variables. It is represented by the following equation:

$$y = a + b_1 x_1 + b_2 x_2 \qquad (4)$$

where $y$ is the dependent variable, $a$ is a constant, $x_1$ and $x_2$ are the values of the two independent variables, and $b_1$ and $b_2$ are the coefficients of the two independent variables.

Computational Approach of Multiple Regressions
The least-squares method has been used to compute the coefficients that relate the independent variables to the dependent variable, which is the traffic density.
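One way to sketch a least-squares fit in plain Python is to solve the normal equations directly. The code below is an illustrative assumption about how such a fit could be computed, not the study's implementation; the function name and the toy data (a day index and a single seasonal indicator) are hypothetical.

```python
def fit_least_squares(rows, y):
    """Fit y = a + b_1*x_1 + ... + b_k*x_k by ordinary least squares.

    `rows` holds one tuple of independent-variable values per observation.
    Returns [a, b_1, ..., b_k], found by solving the normal equations
    (X^T X) beta = X^T y with Gauss-Jordan elimination.
    """
    X = [[1.0, *r] for r in rows]  # prepend a column of ones for the constant a
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("independent variables are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Toy data generated from y = 5 + 2*x1 + 3*x2 (illustrative only):
# x1 plays the role of a day index, x2 of a 0/1 seasonal indicator.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [5, 7, 8, 10, 12]
print(fit_least_squares(rows, y))
```

Because the toy data lie exactly on a plane, the fitted coefficients should recover the generating values up to rounding.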

Mean Absolute Percent Error(MAPE)
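The MAPE is the average of the absolute differences between the forecast and actual values, expressed as a percentage of the actual values. It is calculated as:

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n} \frac{\left| A_i - F_i \right|}{A_i} \qquad (5)$$

where $n$ is the number of periods for which forecast and actual values are compared, and $A_i$ and $F_i$ are the actual and forecast values in period $i$.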
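The MAPE calculation can be sketched in Python as below. The function name and the sample values are illustrative assumptions; the check for zero actual values is added here because the percentage is undefined in that case.

```python
def mape(actual, forecast):
    """Mean absolute percent error between paired actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical values: each forecast is off by 10% of the actual value.
print(mape([100, 200], [110, 180]))
```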

Traffic Density Analysis
The statistical summary of actual traffic for all quarters within the 499-day study time frame is tabulated in Table 1, which shows the trend line equation and the maximum and minimum number of ships for each quarter. The line graphs for the six quarters from July 2016 until December 2017, shown in Figures 2-7, represent the distribution of daily traffic, which fluctuates continually, and the trend line of actual traffic density for each quarter.

Moving Average
For each seasonal group of data, the forecast for the consecutive days starts from the number of ships on the first four days of that group. Based on equation (1), the total for the four days is divided by four to obtain the forecast value for the 5th day. This iteration continues until the 499th day. The MAPE was calculated for comparison.

Weighted Moving Average
In this type of forecast, weighting the latest days more heavily is expected to give a more accurate projection. The weighted moving average for a four-day projection was calculated using equation (2). The MAPE was calculated for comparison with the other methods.

Exponential Smoothing
Equation (3) was applied with an initial forecast of 205 for the first day. The iteration continues consecutively until the 499th day. To obtain the best smoothing constant, a trial-and-error technique was used: the MAPE for each different smoothing constant α was calculated and tabulated as in Table 2.
From this table, the smoothing constant that gives the forecast closest to the actual values is α = 0.8.
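The trial-and-error search for the smoothing constant can be sketched as a simple loop over candidate values of α, keeping the one with the lowest MAPE. The candidate grid, the helper names, and the ship counts below are assumptions for illustration; the values actually tested in the study are those listed in Table 2.

```python
def smoothing_forecasts(actual, alpha, initial_forecast):
    """Forecasts aligned with `actual`: element t is the forecast for period t."""
    forecasts = [initial_forecast]
    for a in actual[:-1]:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (a - previous))
    return forecasts


def mape(actual, forecast):
    """Mean absolute percent error; assumes no actual value is zero."""
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def best_alpha(actual, initial_forecast, candidates):
    """Return (alpha with the lowest MAPE, dict of MAPE per candidate)."""
    scores = {
        alpha: mape(actual, smoothing_forecasts(actual, alpha, initial_forecast))
        for alpha in candidates
    }
    return min(scores, key=scores.get), scores


# Hypothetical ship counts with a steady upward trend (illustrative only).
ships = [100, 110, 120, 130, 140]
candidates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]  # assumed grid
alpha, scores = best_alpha(ships, initial_forecast=100, candidates=candidates)
print(alpha, round(scores[alpha], 3))
```

Which α wins depends entirely on the data: a strongly trending series favours larger values, whereas a noisy but level series tends to favour smaller ones.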

Associative Models
The multiple-regression equation is obtained by considering four variables: one variable representing the number of days and the other three representing the set of seasonal variables. Table 3 shows the independent variables used, reflecting that the traffic density is influenced by the season, taken by quarter of the year.

Applying equation (5) to the 499 days of the dataset gives the MAPE value for each method used in this project, as in Table 4. The trend line for each method is shown in Figure 8. This comparison shows that exponential smoothing with alpha 0.8 is the best of the methods compared. The plots of all methods against the actual data are shown in Figure 9.

Forecasting Future Traffic Density
From the overall analysis, the most reliable method in this study is exponential smoothing with alpha 0.8. To perform the forecast, the most recent data are extrapolated to continue the data projection. In this case, the required date is 1 January 2018, and equation (3) is iterated to obtain the traffic density for each consecutive day. The plot of the previous week and the predicted day is shown in Figure 9.

CONCLUSION
In this study, traffic forecasting analysis, the main objective, has been presented and discussed. Quantitative methods have been used to forecast the total density, with days as the time horizon for projecting the estimated density. The data were taken from July 2016 until December 2017 but comprise only 499 days, as some of the data are incomplete due to errors in receiving the AIS signal. The quantitative methods are the time-series models (moving average, weighted moving average, and exponential smoothing) and the associative model (multiple-regression approach). These methods have been evaluated using the MAPE. The results show that exponential smoothing with alpha 0.8 gives the lowest MAPE, 20.701%, which makes this method reliable for forecasting future traffic density in terms of the number of ships.

Plot of the forecast using exponential smoothing with alpha 0.8.