Global economic growth has stimulated international trade, which in turn has increased maritime transportation activity. Ship accidents often result in serious damage, death, injury, loss or pollution, and may also have significant political, economic and environmental consequences. They also affect many entities in the maritime industry, such as shipping companies, ship owners, flag states, coastal countries, shipbuilders and ship insurance companies. Improvements in ship technology and the implementation of International Maritime Organization (IMO) and related safety regulations have successfully reduced the number of ship accidents. However, the complex and high-risk environment at sea, which involves dynamic factors such as environmental conditions, collisions and human factors, makes it difficult to eliminate ship accidents. Therefore, maritime safety remains one of the main concerns of global maritime interests.
To prevent ship accidents more effectively, experts and scholars have proposed safety management systems as tools to assess and monitor accident risk, in which the relationship between different safety indexes and accidents is modeled. This approach not only provides useful information for safety management, but also contributes to continuous improvement and decision making. However, ship safety is a problem with a complex structure, diverse processes and dynamic systems. For such problems, the Bayesian network (BN) is a powerful quantitative modeling technique because it can represent the complex dependence relationships among the factors that affect accidents. In addition, the qualitative graphical structure of a BN can accommodate uncertain or unobservable variables.
Previous shipping safety management studies have made important contributions in identifying the various factors affecting accidents. From these studies, we know that the factors affecting ship accidents mainly include the inherent properties of the ship and port state control (PSC) inspection defects. In previous studies, the BN model was mostly used to analyze the influencing factors and the dependencies among incidents.
This paper suggests that a more complete dynamic BN model can be attempted. Such a model should contain all variables related to ship accidents, such as the factors influencing accidents, accident types, and accident consequences. In other words, the model acts as a small accident evaluation system that can analyze not only the dependency between the influencing factors and ship accidents, but also the influence of different accident types on the accident consequences. In addition, the model can be adjusted and updated in a timely manner as the external environment changes.
Therefore, this study attempts to build a dependency model among the inherent properties of ships, PSC inspection defects, ship accidents and accident consequences. In particular, this paper introduces an index representing the overall inspection defects of a ship; that is, the PSC inspection defect items affect ship accidents through a hidden variable.
2. Literature Review
In maritime safety analysis, the BN is increasingly regarded as a powerful tool for building complex causal models. The literature repeatedly mentions several advantages of the BN, such as its visual representation and its ability to perform backward inference. A further advantage is that the BN is a dynamic safety risk assessment method that can combine causality and decision-making problems, which is an important reason for the growth of the BN literature. To obtain more objective conditional probability estimates, an earlier study developed a risk assessment model for maritime transport systems using expert estimates of the prior probabilities of the BN model. In contrast, Li, Yin, & Fan (2014) combined logistic regression and BN methods for maritime risk analysis. In the same year, Kevin X. Li et al. (2014) also combined logistic regression with the BN technique to model the relationship between the inherent properties of ships (flag, age, ship type, ship size, classification society) and ship accidents. However, these methods only discuss the effects of inherent attribute variables on accidents.
PSC (port state control) refers to the inspection of ships docked at a port by the port state authorities. If serious defects are found, the ship is detained until they are rectified. PSC is often known as the fourth safety barrier for marine ships. It complements the maritime safety management of the flag State and is designed to compensate for the shortcomings of ship owners, flag States and classification societies. Studies of the various factors affecting accidents show that PSC inspection results are also an important indicator of accident risk. Cariou et al. (2008) found that after a ship has passed a PSC inspection, fewer defects are reported at the next inspection, which means that PSC inspection can improve the safety performance of the ship and that the PSC inspection system is effective. In addition, Hänninen, Valdez Banda, & Kujala (2014) and Hänninen (2014) published a number of models that use the BN technique to capture the dependence of ship accidents on the ship’s inherent properties and PSC inspection defects. Their research introduced a hidden variable representing the overall safety of the ship: each influencing factor acts on the hidden variable, which in turn affects the ship accident.
3. Data and Methods
The data used as input for learning the Bayesian network models are derived from three databases. The first database, from Lloyd’s Register of Shipping (LR), contains basic information about 150,000 vessels in total, such as the ship flag, the ship type, the gross tonnage and the year built. The second database is the PSC inspection database obtained from the Tokyo Memorandum of Understanding (Tokyo MoU), covering January 2007 to December 2017. It contains basic information about the inspected ships, deficiency codes, deficiency counts, etc. The third database is the International Maritime Organization (IMO) incident database. It records incident reports from January 2007 to December 2017, including the ship’s IMO number, incident date, cause of accident, accident type, type of casualty and so on.
Table 1 describes the main variables used for BN learning and the prior probability of each state. Each of the 17 inspection defects from the Tokyo MoU is converted into a binary variable indicating whether the deficiency was encountered at least once (1) or never (0) within one year before the event occurrence date. The flag variable, based on flag information from 73 flag states, is divided into two categories: open and closed flags. Similarly, depending on whether a vessel is certified by a member of the International Association of Classification Societies (IACS), vessels are classified as IACS or No-IACS.
Table 1. The Bayesian network variables.
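The binary encoding of inspection deficiencies described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `deficiency_indicator`, the 365-day window, the record layout (a date paired with a set of deficiency labels) and the labels themselves are hypothetical and are not taken from the Tokyo MoU data.

```python
from datetime import date, timedelta

def deficiency_indicator(inspections, event_date, code, window_days=365):
    """Return 1 if `code` was recorded at least once in the window before the event, else 0."""
    window_start = event_date - timedelta(days=window_days)
    for inspection_date, codes in inspections:
        # Only inspections inside [window_start, event_date) are considered.
        if window_start <= inspection_date < event_date and code in codes:
            return 1
    return 0

# Hypothetical inspection history for one ship: (inspection date, deficiency labels found).
inspections = [
    (date(2015, 3, 1), {"fire_safety"}),
    (date(2016, 1, 10), {"life_saving", "navigation"}),
]
event = date(2016, 6, 1)
recent_navigation = deficiency_indicator(inspections, event, "navigation")
old_fire_safety = deficiency_indicator(inspections, event, "fire_safety")
```

In this sketch the fire safety deficiency falls outside the one-year window and is therefore encoded as 0, while the navigation deficiency is encoded as 1.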
The service life of a ship is about 30 years; ships older than 30 years are generally used for offshore purposes or scrapped rather than operated commercially. The average age of all ships in this database is 16 years. We therefore divided ship age into 3 categories: ships under 16 years old are young ships, ships between 16 and 30 years old are middle-aged ships, and ships over 30 years old are old ships.
Similarly, ship size is divided into 3 categories according to gross tonnage. A ship with a gross tonnage of more than 20,000 is big, a ship with a gross tonnage of less than 2,000 is small, and all others are middle. Finally, the ship types are classified into 7 categories: Bulkers, Containers, General cargo ships, Tankers, Passenger ships, Offshore vessels and Others.
For the purpose and scope of this study, we consider not only the degree of seriousness of an accident but also differentiate between accident types. We classified the seriousness of accidents according to the IMO definitions (IMO, 2000): very serious, serious, and less serious. Causes of accidents include collision, fire or explosion, equipment, person, sinking, capsizing/listing, stranding/grounding, etc. Types of casualties include total loss, injury, loss of life, pollution, etc.
Unlike previous studies, this article works with the “non-accident” state, that is, the complement of the accident probability. The discussion in the fourth section is framed in the same reversed way.
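The age and size categories described above can be expressed as two small functions. This is a minimal sketch: the function names and the string labels are illustrative, and assigning the boundary values (exactly 16 or 30 years, exactly 2,000 or 20,000 gross tonnage) to the middle category is one reading of the thresholds stated in the text.

```python
def age_category(age_years):
    """Map ship age in years to one of the three age categories."""
    if age_years < 16:
        return "young"
    if age_years <= 30:
        return "middle"
    return "old"

def size_category(gross_tonnage):
    """Map gross tonnage to one of the three size categories."""
    if gross_tonnage > 20000:
        return "big"
    if gross_tonnage < 2000:
        return "small"
    return "middle"

# Hypothetical ships given as (age in years, gross tonnage).
ships = [(5, 35000), (16, 1500), (42, 8000)]
categories = [(age_category(a), size_category(gt)) for a, gt in ships]
```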
3.2. Bayesian Network Learning
Bayesian networks are directed acyclic graphs in which nodes represent random variables and arcs represent direct probabilistic dependences among them (Pearl 1988). Each arc points from a parent node to a child node. The structure of a Bayesian network is a graphical, qualitative illustration of the interactions among the set of variables that it models. A Bayesian network also represents the quantitative relationships among the modeled variables: numerically, it represents their joint probability distribution. A variable with parent nodes is described by a conditional probability table (CPT) that gives its distribution for each combination of parent states, while a variable without parent nodes is described by a marginal probability distribution.
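To make the roles of the marginal distribution and the CPT concrete, the following sketch encodes a two-node network in which a parent node points to a child node. The node names and all probability values are hypothetical and chosen only for illustration; they are not estimates from the data used in this study.

```python
# Hypothetical network: Risk -> Accident.
# Parent node: marginal distribution.
p_risk = {"high": 0.2, "low": 0.8}
# Child node: conditional probability table, one row per parent state.
p_accident_given_risk = {
    "high": {"yes": 0.10, "no": 0.90},
    "low": {"yes": 0.02, "no": 0.98},
}

def joint(risk, accident):
    """Joint probability as the product of the parent marginal and the child CPT entry."""
    return p_risk[risk] * p_accident_given_risk[risk][accident]

def marginal_accident(accident):
    """Forward inference: sum the joint probability over the parent states."""
    return sum(joint(risk, accident) for risk in p_risk)

def posterior_risk(risk, accident):
    """Backward inference: probability of a parent state given the observed child state."""
    return joint(risk, accident) / marginal_accident(accident)

p_accident = marginal_accident("yes")
p_high_given_accident = posterior_risk("high", "yes")
```

The last function illustrates the backward inference mentioned in the literature review: evidence on the child node updates the belief about the parent node.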
The conditional probability value of event B is calculated as follows:
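A standard form of this calculation is given below. The symbol $A$ for the conditioning event and the notation $\mathrm{Pa}(X_i)$ for the parents of node $X_i$ are assumptions introduced here for illustration. The first line combines the definition of conditional probability with Bayes' theorem, and the second line is the usual factorization of the joint distribution of a Bayesian network over its nodes.

```latex
P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \mid B)\,P(B)}{P(A)}, \qquad P(A) > 0

P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```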
The Greedy Thick Thinning (GTT) structure learning algorithm is based on the Bayesian Search approach and is described in Cheng et al. (1997). GTT starts with an empty graph and repeatedly adds the arc (without creating a cycle) that maximally increases the marginal likelihood, until no arc addition results in a positive increase. It then repeatedly removes arcs until no arc deletion results in a positive increase. The network obtained after these two phases is the best-scoring model found by the search, and it is the model analyzed in this study.
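The two phases described above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in this study: it scores candidate networks with a BIC score as a stand-in for the marginal likelihood, it assumes fully observed discrete data, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations

def family_bic(rows, child, parents):
    """BIC score of `child` given `parents`, estimated from discrete rows (dicts)."""
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    joint_counts = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    loglik = sum(n * math.log(n / parent_counts[pa]) for (pa, _), n in joint_counts.items())
    child_states = len({r[child] for r in rows})
    # Penalise free parameters; only observed parent configurations are counted.
    n_params = (child_states - 1) * len(parent_counts)
    return loglik - 0.5 * math.log(len(rows)) * n_params

def creates_cycle(parents, src, dst):
    """True if adding the arc src -> dst would close a directed cycle."""
    stack, seen = [src], set()
    while stack:  # walk upwards through the ancestors of src
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def greedy_thick_thinning(rows, variables, eps=1e-9):
    """Return a dict mapping each variable to its list of learned parents."""
    parents = {v: [] for v in variables}
    score = {v: family_bic(rows, v, []) for v in variables}
    # Thickening: add the single best arc until no addition improves the score.
    while True:
        best = None
        for src, dst in permutations(variables, 2):
            if src in parents[dst] or creates_cycle(parents, src, dst):
                continue
            gain = family_bic(rows, dst, parents[dst] + [src]) - score[dst]
            if gain > eps and (best is None or gain > best[0]):
                best = (gain, src, dst)
        if best is None:
            break
        gain, src, dst = best
        parents[dst].append(src)
        score[dst] += gain
    # Thinning: remove the single best arc until no deletion improves the score.
    while True:
        best = None
        for dst in variables:
            for src in parents[dst]:
                reduced = [p for p in parents[dst] if p != src]
                gain = family_bic(rows, dst, reduced) - score[dst]
                if gain > eps and (best is None or gain > best[0]):
                    best = (gain, src, dst)
        if best is None:
            break
        gain, src, dst = best
        parents[dst].remove(src)
        score[dst] += gain
    return parents

# Toy data: B always equals A, while C is independent of both.
rows = [{"A": i % 2, "B": i % 2, "C": (i // 2) % 2} for i in range(200)]
learned = greedy_thick_thinning(rows, ["A", "B", "C"])
```

Because the score decomposes over nodes, only the family of the node receiving or losing an arc has to be rescored at each step. A production implementation would also handle hidden variables such as the one used in this study, which this sketch does not.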
4. Model Structures
From a review of the literature (Hänninen & Kujala, 2014; Hänninen et al., 2014; Knapp & Franses, 2007; Li et al., 2014; Trucco et al., 2008; Yang et al., 2018), we know that the inherent properties of ships and PSC inspection defects are influencing factors for predicting ship accidents and losses. The main purpose of our study is to establish a model of the relationship between the ship’s inherent properties, port state inspection defect items and ship accidents, and in particular of the dependence relationships between these variables. The complete BN model is shown in Figure 1.
In the main model, the five inherent properties of the ship (ship type, ship flag, age, IACS, gross tonnage) and the RISK variable are linked to the Accident attribution node, and Accident attribution is linked to the Type of accident node. Finally, the type of accident is linked to six loss variables (total loss, loss of life, injury, pollution, remain, unavailable), which we summarized from the IMO accident reports.
It is worth noting that the main model involves 17 inspection defects, which we group into a sub-model called RISK. Each inspection defect item acts on a hidden variable, which in turn affects the ship accident.
5. Model Evaluation
It should be noted that the number of states of the hidden variable “ship safety” is not known in advance, so models with 2, 3, 4, and 5 states were fitted. With 2 states, the maximum log-likelihood value is −162,155.19 and the accuracy of the model is 0.950795. With 3 states, the maximum log-likelihood value is −162,306.72 and the accuracy of the model is 0.952021. The maximum likelihood and accuracy values of the four models do not differ greatly. This study compares the likelihood and accuracy of the four models and finds that the model with 2 states is optimal (Table 2).
Figure 1. Main model-Bayesian network based on the GTT algorithm.
Table 2. The results of the model evaluation.
Note: The numbers 2, 3, 4, and 5 denote the number of states of the hidden variable “Risk”.
In this study, we first analyze the factors influencing accidents and use the Bayesian network model to capture the dependence among the ship’s inherent properties, PSC inspection defects, and ship accidents. A complete Bayesian network is then learned with the GTT algorithm. In addition, the data used in this study were derived from three authoritative databases, which to some extent supports the quality of the data and the accuracy of the model results.
The model constructed in this study is complete and considers the relevant variables comprehensively. In addition, the model can be continuously updated as the external environment changes. However, there are still some shortcomings. To keep the model clear, this study did not consider the interactions among the variables. In addition, this study only analyzed ships over a limited period (January 2007 to December 2017). If the observation period is expanded, the results may be different. Future research could address these aspects.