Methodological Responses to Contemporary Maritime Piracy in the Gulf of Guinea


1. Introduction

Piracy was a limited problem prior to 2009. However, piracy started being noticed during the 1990s as small armed groups began holding ships and crew members for ransom. In this scenario, those responsible for piracy activities argue that their actions are justified because they are protecting Somali fishing resources [1] . The government was unable to enforce existing law, causing piracy to increase, become bolder, and become a significant menace. From the Suez Canal, through the Gulf of Aden, and between the Horn of Africa and the Arabian Peninsula, piracy has grown more sophisticated as a result of technological advancements and the increasing boldness of pirates. Because of these crimes, concerns have arisen about the dangers of navigating these waters, particularly for humanitarian ships.

The effectiveness of international laws covering maritime piracy, especially Articles 100 to 107 and 110 of the UN Convention on the Law of the Sea (UNCLOS), has been questioned due to ongoing attacks. There is a notable similarity between these Articles of UNCLOS and Articles 14 to 22 of the Geneva Convention on the High Seas 1958. UNCLOS does not bind every state; however, some states that are not bound by UNCLOS are bound by the Geneva Convention. This suggests that the articles regarding the Law of the Sea could be deemed the same [1] . These similarities have still not helped, since the rate of piracy has been rising in various maritime locations.

The surge of pirate activities in the Gulf of Guinea (GoG) (Figure 1) is most likely due to the increased discovery and exploration of oil off the coasts of nations in the Gulf of Guinea. The GoG is an important sea route for major oil producing countries such as Nigeria and Angola. The recent discoveries of oil in Ghana, Ivory Coast, Sierra Leone and Liberia have made the GoG a very attractive area, thus raising a new wave of pirate attacks. According to [2] , the principal motivation of pirates in the GoG is the theft of crude oil and refined petroleum products.

[3] further reports that there is insufficient protection by state or regional authorities within territorial waters in the GoG. A serious challenge exists in the areas of boarding equipment and the tracking and monitoring of ships. Corrupt officials are often reported to accept bribes and set pirates free. Ships traversing these sea lanes are left to establish their own surveillance and deterrence, which is an expensive undertaking.

Figure 1. Map of the Gulf of Guinea (Source: www.google.com.gh/search?q=map+).

2. Restatement of Research Problem and Questions

The research questions to be addressed include:

1) What are the causes of the piracy surge in the Gulf of Guinea and the obstacles to effective resolution?

2) What are the effects of oil piracy in the Gulf of Guinea on maritime transportation (concerning the effects on reputation damage to maritime transport company, insurance costs, transport price increase, extra cost on changing route, trade disruption and the loss of crude oil)?

3) What are the effects of oil piracy in the Gulf of Guinea on maritime security (considering the effects on security cost, more integration with the navy, dangers associated with sailing in unfamiliar waters, threat to crew and security equipment)?

4) What are the strategies required to manage these threats?

3. Research Design and Methods

This study is being conducted as a longitudinal design. This type of research design was chosen because the longitudinal study allows the sample to be followed over time, which then allows for repeated observations throughout a specific time period [4] . As a result, researchers are able to track changes over a time period, which may then be related to variables (such as political causes) that may explain the reasons behind the changes. Ultimately, longitudinal research designs are beneficial because patterns of change are described, which allow the direction and magnitude of relationships to be established effectively. This is especially beneficial in the consideration of causal relationships [5] . In a longitudinal study, measurements are taken on the relevant variables over different time periods. Typically, these time periods are extremely distinct, allowing the researcher to provide change measurements for variables over a time period. At the same time, longitudinal studies are commonly called panel studies [6] .

Longitudinal studies are quasi-experimental research designs that involve repeated observations of the same variables over time. These studies are often considered to be observational studies [7] . These studies are primarily used in psychology (allowing for developmental trends to be studied throughout the lifespan) and sociology (allowing for life events to be studied throughout generations). These are different from cross-sectional studies, which focus on different data samples with the same characteristics for comparative purposes [8] . Longitudinal studies, on the other hand, focus on the same data samples in order to observe differences, allowing for the observation of change to be more accurate. For example, medical longitudinal studies allow predictors of disease to be uncovered, whereas advertising longitudinal studies allow changes in target audience attitudes to be tracked in response to an advertising campaign. Importantly, longitudinal studies allow short-term and long-term phenomena to be distinguished.

Observational longitudinal studies that do not manipulate variables are argued to be less powerful in the detection of causal relationships than experiments. On the other hand, because of the repeated observation of different variables, longitudinal studies are more powerful than cross-sectional studies because of the ability to exclude time-invariant unobserved individual differences and being able to observe the temporal order of events [9] . However, it is argued that these studies are time consuming and are expensive, making them inconvenient. On the other hand, rather than focusing on future data and time periods, longitudinal studies can be retrospective, allowing for the use of existing data. Some longitudinal studies are cohort studies (occurring when all data representations have a common event in a selected period), allowing for the performance of cross-section observations through a specified time period at certain intervals [8] [9] .

There are different uses of longitudinal studies. For instance, longitudinal data allows the duration of a variable to be analyzed effectively. Moreover, survey researchers are able to develop causal explanations that are typically only attainable through experiments [8] . The longitudinal design allows changes in the variable(s) to be measured throughout a time period, which is beneficial in predicting future outcomes based on historical data. However, the data collection method may change throughout the course of the longitudinal study, which can make it difficult to maintain the original sample’s integrity [7] . Moreover, due to the extended time period, it may be difficult to consider more than one variable during a specific collection time. Therefore, qualitative research is commonly required to explain the changes in the data, especially since it is assumed that current trends will continue without any type of changes [10] .

As noted previously, the longitudinal study is a quasi-experiment, which is an empirical study that allows for the estimation of the causal impact of an intervention on the target population, yet does not include random assignment. There are similarities between quasi-experimental research and traditional experimental design or randomized controlled trials. However, quasi-experimental designs lack the random assignment element for treatment or control, instead allowing the researcher to maintain control of the treatment condition through the use of another criterion (in this case, incidences of piracy) [10] .

Quasi-experiments have concerns issued against them in relation to internal validity due to the difficulties that may exist in comparing treatment and control groups at baseline. However, through the use of random assignment, participants are afforded the same opportunity of assignment of the intervention or comparison group. This allows for the consideration that differences in relation to observed/unobserved characteristics to occur through chance, rather than a treatment-related systematic factor. However, randomization alone does not guarantee equivalency at baseline, making it difficult for quasi-experimental studies to demonstrate a causal link, especially in the event of uncontrollable or unaccounted for confounding variables [11] . Quasi-experimental designs first require that the researcher identify the variables, where the independent variable is the x variable and manipulated in order to result in a dependent variable. The outcomes of these manipulations can be measured over time through a time series analysis, allowing for the establishment of changes in the data [12] .

Experiments that contain random assignment allow units to have the same opportunity of assignment to either the control or intervention group. The quasi-experimental design allows the assignment to a treatment to be based on a criterion rather than random assignment [7] . This allows the researcher to have greater control over assignment to treatment, such as through a cut-off score. The use of alternative criteria allows the researcher to determine which participant is assigned to which group (treatment or control). At the same time, the researcher may have no control over the assignment of participants whatsoever, and the criteria may remain unknown [10] . Different factors influence the assignment of the participant, such as feasibility, political concerns, convenience, and/or cost. As a result, quasi-experiments have concerns in relation to internal validity [12] .

The effectiveness of quasi-experiments varies. However, these experiments are typically deemed very effective due to pre-post testing [13] . This means that tests are conducted prior to data collection in order to determine if participants have certain tendencies that may alter the results of the study. The following step is to conduct the experiment, with tests being conducted following the intervention [13] . The pre- and post-test results can be compared within the data analysis. However, some researchers elect to utilize the pre-test to explain the experimental data. Typically, independent variables already exist in quasi-experiments, including gender, age, etc., and can either be continuous (such as in the case of age) or categorical (such as in the case of gender). Thus, variables that are naturally occurring tend to be measured within quasi-experiments [13] . There are different types of quasi-experimental designs, such as difference in differences; non-equivalent control groups design (such as no-treatment control group design; non-equivalent dependent variables design; removed treatment group design; repeated treatment design; reversed treatment non-equivalent control groups design; cohort design; post-test only design; and/or regression continuity design); regression discontinuity design; case-control design (such as time-series design; multiple time series design; interrupted time series design; propensity score matching; and/or instrumental variables); and panel analysis [7] . Of the listed research designs, the regression discontinuity design is the most similar to the experimental design. The reason behind this is that the experimenter (or researcher) is able to maintain control of the treatment assignment, which results in unbiased estimates of the effects of treatment. However, this type of study requires a significant number of participants and the precise modelling of the functional forms relating treatment assignment and the outcome variable in order to have the same power and effectiveness as the traditional experimental design.

In some instances, quasi-experiments are considered ineffectual by experimental purists, leading them to be called queasy experiments [14] . However, quasi-experiments are especially useful in areas where it is infeasible or undesirable to conduct an experiment or randomized control trial. Examples include the evaluation of impacts caused by changes to public policy or interventions within education and/or health. There are drawbacks to the quasi-experimental design, specifically the impossibility of eliminating all possible confounding bias, which makes causal inferences difficult to determine. Because of this significant drawback, quasi-experimental results are often discounted [14] . On the other hand, this type of bias can be controlled using statistical techniques (namely, multiple regression), particularly if the confounding variable(s) can be identified and measured. These techniques model and explain the effects of the confounding variables, which increases the accuracy of the results of the study. At the same time, propensity score matching can be used to match participants according to those variables that are most important within the treatment selection process (or treatment criteria). The use of propensity score matching improves the accuracy of quasi-experimental results. Knowing this, it is evident that quasi-experiments are increasingly valuable for applied research. However, individually, quasi-experimental designs do not allow for definitive causal inferences, yet provide information that cannot be provided through other experimental methods. Therefore, researchers interested in applied research should consider the use of quasi-experimental research designs as opposed to traditional experimental design in order to meet their needs [7] .

True experiments rely on random assignment in order to control for all variables. Quasi-experiments are commonly used for those studies where participants cannot be randomised. Some researchers further argue that it is possible to distinguish between natural experiments and quasi-experiments [7] [10] . The primary difference is that a quasi-experiment allows the researcher to determine the criterion for participant assignment, whereas in a natural experiment, the participant assignment occurs with no intervention by the researcher. Therefore, it is evident that quasi-experiments utilize outcome measures and treatments, yet do not use random assignment. As a result, many researchers elect to utilise quasi-experiments over true experiments because quasi-experiments can typically be conducted, whereas true experiments cannot always be conducted successfully. Quasi-experiments are considered to be particularly interesting because they encompass features from experimental and non-experimental designs, as well as maximize validity, both internal and external [15] .

Internal validity considers the truth (in its approximate form) regarding inferences that are in relation to causal relationships. Since quasi-experiments focus on causal relationships, internal validity is increasingly important. As a result, internal validity occurs when the researcher attempts to control all variables that may have an impact on the results of the study. Therefore, threats to internal validity include statistical regression, historical data, and the participants themselves. In order to maintain a high internal validity, researchers consider the factors (such as outside causes) that impact the outcome [15] . This means that internal validity comes from within the study itself, namely through the researcher.

On the other hand, external validity refers to the extent to which it is possible to generalise the results from the research study’s sample to the population of interest. When external validity is high, this generalisation is accurate and the sample is representative of the entire population. Moreover, external validity is increasingly important in statistical research due to the need for an accurate depiction of the population. When external validity is low, the research results are less credible. This means that it is necessary to reduce threats that may exist in relation to external validity, which is often done through random sampling and random assignment [12] .

There are different advantages of quasi-experimental designs. For instance, in many cases, quasi-experiments are easier to establish than true experiments because they can be conducted without randomisation of participant assignment. Quasi-experiments are commonly used when it is impractical or unethical to randomise the participants, making this design beneficial for research that would otherwise be harmful to participants. For instance, the use of quasi-experiments allows ecological validity threats to be minimised, especially as natural environments do not have artificiality issues, whereas a laboratory setting does [16] . Quasi-experiments are natural, allowing findings from one study to be applied to other populations, which allows for generalisations to be established through the results. This allows quasi-experimental research designs to be efficient for longitudinal research, which focuses on longer time periods, allowing for follow up to occur in different environments. Quasi-experiments are advantageous because the researcher can choose the manipulations to place on the subjects, as compared to natural experiments, where manipulations occur on their own and the researcher has no control over them. Most importantly, quasi-experiments can reduce ethical concerns due to the ability of the researcher to control different aspects of the study [15] .

Despite the advantages, quasi-experimental designs have disadvantages. For instance, the estimates of impact can be contaminated through confounding variables [10] . For instance, in certain quasi-experimental designs, different factors cannot be measured or controlled in an efficient manner (such as those factors that are highly impacted by personal beliefs of the participants). Despite the fact that quasi-experiments have no reliance on random assignment (which increases feasibility), there are numerous challenges due to internal validity concerns. For example, since randomisation does not occur in quasi-experiments, it is more difficult to exclude confounding variables. At the same time, new threats are established in relation to internal validity [16] . Therefore, due to the absence of randomisation, some data results can be approximated. However, causal relationship conclusions are difficult, if not impossible, to determine effectively as a result of extraneous and confounding variables. Thus, even with accounting for internal validity threats, causation cannot be established entirely due to the lack of control held by the researcher over extraneous variables [12] . Therefore, the lack of randomness may yield weaker results because randomisation is necessary for broader results, allowing for a greater representation of the population.

For this study, a relevant quasi-experimental design is the natural experiment. It differs from the traditional quasi-experimental design (focusing on treatment) because the researcher is not manipulating a variable. The researcher does not control any variable and does not use random assignment, which allows the experimental control to occur naturally. Although this technique may appear unorthodox and/or inaccurate, it has been shown to be effective in many different studies, such as in research studies analysing sudden events (whether positive or negative) [10] [17] .

This particular study is using a quantitative longitudinal research design, focusing on natural experimentation through a panel analysis. The use of panel analysis is valid for this study because it is used in a wide range of disciplines, such as epidemiology, social science, and econometrics. Therefore, it can also be used in finance studies effectively. The focus of panel analysis is time series analysis [18] . This allows for data to be collected over time and using the same participants. Using these two dimensions (time and participants), a regression analysis can be conducted. A common econometric method is multidimensional analysis, allowing data to be collected over two or more dimensions. Commonly, the panel data regression model equation is as follows:

${y}_{it}=a+b{x}_{it}+{\varepsilon}_{it}$ (1)

Equation (1): Panel Data Regression Model [19] .

In Equation (1), y refers to the dependent variable, x refers to the independent variable, a and b refer to coefficients, i and t are indices (such as individuals and time), and the final term, ε, refers to the error variable. The assumptions made about the error variable determine whether the equation describes fixed or random effects. In the case of a fixed effects model, the error variable is non-stochastic, which reduces the model to one dimension through a dummy variable. In the case of a random effects model, the error variable varies stochastically, which means that the i and t in the formula require special treatment or the completion of an error variance matrix [20] . Panel data analysis can be conducted through three different approaches: 1) independently pooled panels; 2) random effects models; and 3) fixed effects models or first differenced models. Each of the models has different assumptions. For example, independently pooled panels assume that within the measurement set (the sample), there are no unique attributes for the individuals and time yields no universal effects. Fixed effects models assume that there are unique attributes within the model, which are not caused by random variation and do not vary over time. This model is commonly known as the least squares dummy variable model and is especially useful if the focus is to draw inferences regarding the examined participants. Within the random effects model, it is assumed that participants have unique and time-constant attributes that result from random variation and have no correlation with the individual regressors. This model is especially useful in drawing inferences regarding the population, not just the sample. This specific study focuses on the fixed effects model.
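To make the fixed effects approach concrete, the following is a minimal sketch of the within (demeaning) estimator for a single regressor. It assumes that the fixed effect enters Equation (1) as a unit-specific intercept (written `a_i` in the comments), which is a common formulation but is not spelled out above. The function name `within_estimator` and the small panel are hypothetical and serve only as an illustration; they are not the data or the software used in this study.

```python
from collections import defaultdict

def within_estimator(rows):
    """Estimate b in y_it = a_i + b * x_it + e_it (assumed fixed effects form).

    rows: iterable of (unit, period, x, y) tuples.
    Returns (b, intercepts), where intercepts maps each unit to its a_i.
    """
    by_unit = defaultdict(list)
    for unit, _period, x, y in rows:
        by_unit[unit].append((x, y))

    sxy = 0.0   # sum of within-unit cross products
    sxx = 0.0   # sum of within-unit squared deviations of x
    means = {}
    for unit, obs in by_unit.items():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        means[unit] = (x_bar, y_bar)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2

    b = sxy / sxx
    # Each unit's intercept is recovered from its own means.
    intercepts = {u: y_bar - b * x_bar for u, (x_bar, y_bar) in means.items()}
    return b, intercepts

# Hypothetical panel: two units, three periods, constructed so that b = 2.
panel = [
    ("A", 2006, 1.0, 12.0), ("A", 2007, 2.0, 14.0), ("A", 2008, 3.0, 16.0),
    ("B", 2006, 1.0, 3.0), ("B", 2007, 2.0, 5.0), ("B", 2008, 4.0, 9.0),
]
b, intercepts = within_estimator(panel)
```

Demeaning within each unit removes any attribute that is constant over time for that unit, which is why the slope can be estimated without including one dummy variable per unit explicitly.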

4. Data Collection and Processing

This study uses panel data, which is important because it allows all piracy attacks globally within the time period to be analysed. The time period being assessed is from 2006 to 2015. This is done in order to allow a wide range of data through a time series analysis. Moreover, the data will be obtained from the International Maritime Bureau (IMB) Piracy Reporting Centre. Although the results will consider all countries, particular emphasis will be placed on the Gulf of Guinea. Data will be downloaded from the sources into Microsoft Excel. The data will be displayed in rows and columns, where rows are the countries and columns are the year of occurrence. The final row shows a progression of all piracy attacks throughout the time period. The data analysis was conducted using Microsoft Excel. In some instances, the RealStats resource add-in for Excel was used for enhanced statistical analysis of the data. The RealStats add-in is beneficial because it works effectively with the existing statistical techniques enabled by Microsoft Excel, yet allows for an additional layer of statistical analysis. Specifically, Microsoft Excel was chosen in order to provide a transparent view of the data in consideration of the models selected to analyse the data. The models chosen to analyse the data are ordinal logistic regression, series hazard modelling for maritime transport risk analysis, and the Bayesian networks (BNs) technique for maritime risk analysis.

Data Analysis

This study uses three different data analysis techniques. The goal of these techniques is to determine the change in piracy attacks, especially in relation to the Gulf of Guinea. They include: 1) ordinal logistic regression; 2) series hazard modelling for maritime transport risk analysis; and 3) the Bayesian networks (BNs) technique for maritime risk analysis. Each is discussed in its own sub-section below.

5. Ordinal Logistic Regression

Peter McCullagh originally established ordinal logistic regression, a statistical model that is also known as the ordered logit model or the proportional odds model. Generally, this model is considered a regression model and is used in relation to ordinal dependent variables. It is assumed within the model that the purpose of the analysis is to determine the prediction rate for a certain response. The prediction rate is based on the responses obtained for other questions. The response options are typically set (such as through a close-ended survey/questionnaire). On the basis that these assumptions are met, the ordinal logistic regression model can be used [21] [22] [23] [24] [25] . In fact, the ordinal logistic regression model is often considered to be an extension of the initial logistic regression model. Significantly, whereas the initial logistic regression model is applicable only to dichotomous dependent variables, the ordinal extension provides the ability to analyse numerous ordered response categories.

It is noted that the ordinal logistic regression model can only be used with data that meets the proportional odds assumptions, which assumes that proportions allow the response to be divided into classifications (such as px, where p refers to proportion and x refers to the number of the division). This involves determining the logarithms of the odds, rather than the probability, of a particular answer. This is established through the following equation:

$\mathrm{log}\frac{px}{py}$ (2)

Equation (2): Proportional Odds Assumption

In this formula, px refers to the sum of the current and previous categories and py refers to the sum of all future categories. Therefore, the proportional odds assumption argues that an arithmetic sequence can be established because the number added to each logarithm is the same. Within the model, it is suggested that the results of the assumption represent the number of required additions for each logarithm. These results are determined through observed variables, established through a linear combination [21] [22] [23] [24] [25] . The coefficients found as a result of these linear combinations cannot be consistently estimated using ordinary least squares. As a result, maximum likelihood estimates are more commonly used in conjunction with this model. Evidence suggests that maximum likelihood estimates are computed through iteratively reweighted least squares. Ordered response categories are commonly items that are close-ended and cannot be interpreted any other way, such as credit ratings, Likert scales obtained from opinion surveys, spending levels at a particular department store, or employment status. Through using the logistic regression model, it is possible to provide an analysis of the ordinal outcome of these ordered response categories. The binary logistic regression model requires adequate fitting. This means that it is necessary to predict the probability of the outcome of interest. The probability prediction is done by estimating the regression coefficient. This is done through determining the results for the assumption, as follows:

$\mathrm{ln}\left(\frac{\text{Prob}\left(\text{event}\right)}{1-\text{Prob}\left(\text{event}\right)}\right)$ (3)

Equation (3): Ordinal Logistic Regression Model

The logit is found on the left side of Equation (3) and is the log of the odds that an event will occur. This “event” is clarified as the prob (event) as established in the first equation. Therefore, the calculation is based on the ratio of the number of occurrences of an event to the number of times the event does not occur. The purpose of the coefficients in the logistic regression model is to show the degree to which the logit changes based on the values of the predictor variables. For the first portion of this part of the analysis, the data is shown for all countries and considers a specific time period. Based on the data, the range of occurrences is 0 to 160, resulting in 20 ranges (P1 through P20, with each range spanning a multiple of 8). A table will be constructed with the range, count, and ratio. The first column is self-explanatory and includes the range title (P1, P2, etc.) and potential number of occurrences (0 - 8, 9 - 16, etc.). The second column is also self-explanatory and includes the number of times for the entire time period that the specified number of piracy attacks occurred. The count is based on the years and is conducted for each country. The third column is the ratio column. The ratio is calculated as follows:

$\text{Ratio}=\frac{\text{Count}}{\text{TotalCount}}$ (4)

Equation (4): Ratio Equation for Ranges

For Equation (4), the count is found in the second column and the total count (found in the third column) is the sum of all counts from that specific range to the end of the data. For example, the total count for P3 would be the sum of the counts from P3 to P20. Each of the ratio calculations is rounded to four decimal places.
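The construction of the range table can be sketched in a few lines of Python. This is only an illustration of Equation (4) under the description above: the functions `range_index` and `ratio_table` are hypothetical names, the counts are invented, and the study itself carried out these steps in Microsoft Excel. The binning assumes the ranges 0 - 8, 9 - 16, and so on, up to 160.

```python
def range_index(occurrences, width=8):
    """Map a yearly attack count to its range number (1 for P1, 2 for P2, ...).

    Assumes P1 covers 0-8, P2 covers 9-16, and so on in steps of `width`.
    """
    return max(1, (occurrences - 1) // width + 1)

def ratio_table(counts):
    """Build (count, total_count, ratio) rows from the counts for P1..Pn.

    total_count for a range is the sum of counts from that range to the
    last range; ratio = count / total_count, rounded to four decimal places.
    """
    rows = []
    remaining = sum(counts)
    for count in counts:
        ratio = round(count / remaining, 4) if remaining else 0.0
        rows.append((count, remaining, ratio))
        remaining -= count
    return rows

# Hypothetical counts for four ranges (the study uses twenty).
table = ratio_table([50, 30, 15, 5])
# table == [(50, 100, 0.5), (30, 50, 0.6), (15, 20, 0.75), (5, 5, 1.0)]
```

Note that the last non-empty range always has a ratio of 1 under this definition, so its odds are undefined in any later logit calculation and it would need separate handling.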

The second portion of this analysis involves the assumptions, which are derived from Equation 1. The assumptions names compose the first column and range from 0 to 19. As noted in Equation 4, the final result is, essentially log (x/y). Therefore, the second and third columns are the numerator (x) and denominator (y) respectively. The fourth column is the result of x/y, whereas the final (fifth) column is the final output of log(x/y). Since the numerator is the probability of the event, it is the ratio as found in the first portion of the analysis. The denominator, therefore, is calculated as:

$y=1-\text{prob}\left(\text{event}\right)$ (5)

Equation (5): Calculation for Assumption Denominator

Through the division of the numerator and denominator, the result (fourth column) is derived, which allows for the log of the result to be calculated in the fifth column. These two steps (resulting in independent tables) allow the variable to be determined in the third step. The third step, as noted, allows for the ordinal logistic model variables to be determined accurately. The table used to display these results has 5 columns. The first column identifies the variable, which ranges from 0 to 19. The second column is the logit. The logit is calculated as:

$\text{Logit}=\mathrm{ln}\left(\frac{\text{prob}\left(\text{event}\right)}{1-\text{prob}\left(\text{event}\right)}\right)$ (6)

Equation (6): Calculation for Logit

Therefore, this could be considered to be ln(x/y) using the numerator and denominator in the preceding step. The third column is β, which is found from the preceding step and is log(x/y) as defined previously. The fourth column is X and refers to the count obtained in the first step for each range. In this scenario, P1 is assumption (or variable) 0. The final column is found by multiplying β and X. For the first variable (0), there is neither a logit nor an X variable, and βX is assumed to be β.
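The second and third steps can be illustrated with a short Python sketch. It rests on an assumption that is not stated in the text: "log" is read as the base-10 logarithm (the default of Excel's LOG function) and "ln" as the natural logarithm. The function name `logit_row` and the input values are hypothetical.

```python
import math

def logit_row(ratio, count):
    """Compute one row of the assumption and logit tables.

    ratio: prob(event) for the range, strictly between 0 and 1.
    count: the count X obtained for the range in the first step.
    """
    if not 0 < ratio < 1:
        raise ValueError("ratio must lie strictly between 0 and 1")
    x = ratio            # numerator: prob(event)
    y = 1 - ratio        # denominator, as in Equation (5)
    odds = x / y
    beta = math.log10(odds)   # ASSUMPTION: "log" means base 10
    logit = math.log(odds)    # natural log, as in Equation (6)
    return {"x": x, "y": y, "odds": odds,
            "beta": beta, "logit": logit, "beta_x": beta * count}

# Hypothetical range with ratio 0.6 and count 30.
row = logit_row(0.6, 30)
```

Under this reading, β and the logit differ only by the constant factor ln(10), so if both logarithms were intended to be natural the two columns would coincide.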

The fourth step involves the use of the raw data and the RealStats add-in for Excel to conduct the binary logistic regression and the multinomial logistic regression. From the regressions, it is possible to obtain the coefficient, LL0, LL, LL1, chi square, degrees of freedom, p-value, alpha, R-sq (L), R-sq (CS), R-sq (N), and Hosmer. Moreover, the regressions provide the ROC curve (the fifth step in the analysis), which provides information regarding the p-Pred, failure rate, success rate, failure cumulative rate, success cumulative rate, FPR, TPR, and AUC.
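For readers unfamiliar with the ROC quantities listed above, the following sketch shows how the false positive rate (FPR), true positive rate (TPR), and area under the curve (AUC) can be derived from predicted probabilities and observed outcomes. It is a generic illustration with invented values and hypothetical function names, and it is not a reproduction of the RealStats output; it assumes that both outcome classes are present in the data.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, lowering the threshold one score at a time.

    scores: predicted probabilities; labels: 1 for success, 0 for failure.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Process all observations tied at this threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical predictions and outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc(points)
```

An AUC of 0.5 corresponds to a model that ranks successes and failures no better than chance, and values closer to 1 indicate better discrimination.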

With the previous five steps, it is possible to predict the number of attacks that would occur using the ordinal logistic regression model. For the purposes of this study, the Gulf of Guinea comprises Equatorial Guinea, Ghana, Guinea, and Guinea Bissau. The total number of piracy attacks is found for each location for the period 2006 to 2015, and totals are found for all locations for all years. For each location, as well as the total (giving five final outputs), the number of piracy attacks for the year is multiplied by the β value found previously in the third step. Each of these products is rounded to one decimal point. Once the values are calculated for the β value, each location’s results (including the total of all 4 areas) are summed. In order to find the predictor, the log is found for each total for all 5 data sets, resulting in 60 logs. Once all of the logs are determined, they are totalled, and then the ln of the total is found. This final ln is the predictor that attacks will occur. The ln is rounded to the nearest whole number and is used to forecast anticipated future attacks. It is assumed that piracy attacks will increase or decrease by the predictor between 2016 and 2030. For 2016, the values are based on 2015, whereas for 2017 and after, the values are based on the previous forecasted attacks. The table constructed to show the forecasts covers all 5 scenarios (Equatorial Guinea, Ghana, Guinea, Guinea Bissau, and the total of the preceding four areas). Therefore, the first column will contain the year being considered. The second through sixth columns will have the forecasted attacks. The final (seventh) column will have the average number of forecasted attacks for the year.
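The roll-forward part of this procedure (2016 built on 2015, each later year built on the previous forecast) can be sketched as follows. The sketch assumes that "increase/decrease by the predictor" means adding the rounded predictor to the previous year's value; the text does not state whether the change is additive or proportional, so this is an interpretation. The function name `roll_forward` and the input numbers are hypothetical.

```python
def roll_forward(last_observed, predictor, start_year=2016, end_year=2030):
    """Project attacks year by year from the last observed value.

    ASSUMPTION: the predictor is applied additively. The value for
    start_year is based on the last observed year, and each later year is
    based on the previous forecast, as described in the text.
    """
    forecasts = {}
    previous = last_observed
    for year in range(start_year, end_year + 1):
        previous = previous + predictor
        forecasts[year] = previous
    return forecasts


if __name__ == "__main__":
    # Made-up inputs: 10 attacks in 2015 and a rounded predictor of 1.
    for year, value in roll_forward(10, 1).items():
        print(year, value)
```

A negative predictor produces a declining series under the same rule; the sketch does not floor the forecast at zero because the source does not describe such a step.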

The steps undertaken to conduct the ordinal logistic regression analysis are summarised as follows:

1) Determine the range size based on the total number of occurrences in order to determine the coefficient rate (Px through Py), allowing for equal numbers in each range.

2) Based on the raw data, count the number of occurrences for all years and countries within a specified range.

3) Based on the count obtained from the raw data, determine the ratio of occurrences divided by total occurrences based on the range being considered. This means that the total count will change for different ranges. The ratio of occurrences is then rounded to four decimal points.

4) The assumptions (considered to be 0 through 19) are determined through the ratio of occurrences, 1 − ratio of occurrences, and the log. The ratio of occurrences, which is prob(event), is the numerator. The value found for 1 − ratio of occurrences is the denominator. Once the numerator is divided by the denominator, the log of the result is found.

5) The variable can be determined from the previous steps. The logit is the ln of the result of dividing the numerator by the denominator (as in the previous step). β is the log of that result (the output of the preceding step) and X is the count from the second step. The variable is found by multiplying β and X. For the first variable (0) there is neither a logit nor an X variable, and βX is assumed to be β.

6) The raw data for the time period and countries is entered into the RealStats add-in for Microsoft Excel to conduct the binary logistic regression and the multimodal logistic regression in order to obtain the ROC curve, which provides information regarding the p-Pred, failure rate, success rate, failure cumulative rate, success cumulative rate, FPR, TPR, and AUC.

7) These steps allow for the forecast to be determined. In order to do this, the predictor of occurrences must be determined. First, the total number of piracy attacks is found for each individual country over the specified time period. The piracy attack values are multiplied by the β value and rounded to one decimal point, then summed to reach a final value for each location (including the total). Next, the log is found of these values and then summed. The final value is the ln of the sum of the logs, which is then rounded to the nearest whole number to obtain the predictor of forecasted attacks.
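Step 7 above can be read as the following computation. This is a hedged sketch of one interpretation rather than the study's actual workbook: the function name `forecast_predictor`, the dictionary layout, and the example counts are all hypothetical, "log" is assumed to be base 10, and a single β is applied to every count for simplicity. The source does not say how non-positive totals are handled, so the sketch raises an error instead of guessing.

```python
import math


def forecast_predictor(attacks_by_location, beta):
    """Compute the rounded predictor from yearly attack counts.

    attacks_by_location : dict mapping a location name (including the
                          combined total) to a list of yearly attack counts.
    beta                : ASSUMPTION, a single coefficient applied to every
                          count; the study derives beta per variable.
    """
    logs = []
    for location, yearly_counts in attacks_by_location.items():
        # Multiply each yearly count by beta and round to one decimal.
        products = [round(count * beta, 1) for count in yearly_counts]
        location_total = sum(products)
        if location_total <= 0:
            raise ValueError(f"log undefined for {location}: {location_total}")
        logs.append(math.log10(location_total))  # ASSUMPTION: base-10 log

    sum_of_logs = sum(logs)
    if sum_of_logs <= 0:
        raise ValueError("ln undefined for a non-positive sum of logs")
    return round(math.log(sum_of_logs))


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the study.
    example = {
        "Location A": [2, 4, 6],
        "Location B": [10, 20, 30],
        "Total": [12, 24, 36],
    }
    print(forecast_predictor(example, beta=0.5))
```

With the made-up inputs shown, the per-location totals are 6, 30 and 36, the sum of their base-10 logs is about 3.81, and the natural log of that sum rounds to 1.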

6. Series Hazard Modelling for Maritime Transportation Risk Analysis

[26] suggests that, in the context of the South China Sea and the Malacca Strait, the series hazard modelling technique has been useful for determining the hazard of global aircraft hijacking and maritime piracy. Series hazard modelling has also been used to examine terrorism in the United Kingdom, specifically terrorist attacks in that area. [27] noted that series hazard modelling can be used in the assessment of global terrorist attacks by the Justice Commandos of the Armenian Genocide (JCAG) and the Armenian Secret Army for the Liberation of Armenia (ASALA). Other researchers, such as [28] and [29], confirmed this usage of the series hazard model. For instance, series hazard modelling has been used to assess frequent terrorist attacks in Israel and the Palestinian territories by Hamas, Fatah, and the Palestinian Islamic Jihad. Therefore, it has been shown that the series hazard modelling method is beneficial in modelling events that have taken place and that will precipitate future events. Thus, the series hazard modelling method is useful in that it measures dependence across events and estimates intervention effects across independent events. This means that the series hazard modelling method is important in the assessment of hidden variations in the duration between events and in relation to the event details [27]. The following equation shows the series hazard model:

${\lambda}_{k}\left(t/{X}_{k}\right)={\lambda}_{0}\left(t\right)\mathrm{exp}\left({X}_{k}\beta \right)$ (7)

Equation (7): Series Hazard Modelling for Maritime Transportation Risk Analysis Formula

The Series Hazard Model is built on the Cox Proportional Hazard model, with slight differences. For instance, the Series Hazard Model uses failures for estimation purposes, whereas the Cox Proportional Hazard Model uses subjects. Equation (7) can be expanded to Equation (8), shown below, in order to capture previous failure history for conditional independence purposes. In Equation (8), functions of previous failures (such as maritime piracy attacks) are represented by Z. As a result, it is possible to derive the partial likelihood function from the Series Hazard function. The expanded model is:

${\lambda}_{k}\left(t/{X}_{k}\right)={\lambda}_{0}\left(t\right)\mathrm{exp}\left({X}_{k}\beta +{Z}_{ky}\right)$ (8)

Equation (8): Modified Series Hazard Model
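To make the structure of Equations (7) and (8) concrete, the hazard for a single failure can be evaluated as in the sketch below. It is illustrative only: the function name `series_hazard` and all inputs are hypothetical, the baseline hazard is passed in as a number rather than estimated, and the history term written as Z with subscript ky in Equation (8) is assumed here to be a covariate vector `z_k` multiplied by a coefficient vector `gamma`, which is a common way of writing such terms but is not confirmed by the source.

```python
import math


def series_hazard(baseline_hazard, x_k, beta, z_k=(), gamma=()):
    """Evaluate lambda_0(t) * exp(X_k . beta + Z_k . gamma) for one failure.

    baseline_hazard : value of lambda_0(t) at the time of interest.
    x_k, beta       : covariates for failure k and their coefficients.
    z_k, gamma      : ASSUMPTION, failure-history covariates and their
                      coefficients; leaving them empty gives Equation (7).
    """
    if len(x_k) != len(beta) or len(z_k) != len(gamma):
        raise ValueError("covariate and coefficient lengths must match")

    linear_predictor = sum(x * b for x, b in zip(x_k, beta))
    linear_predictor += sum(z * g for z, g in zip(z_k, gamma))
    return baseline_hazard * math.exp(linear_predictor)


if __name__ == "__main__":
    # Made-up numbers: one event covariate and one history covariate.
    print(series_hazard(0.1, x_k=[1.0], beta=[0.3]))
    print(series_hazard(0.1, x_k=[1.0], beta=[0.3], z_k=[2.0], gamma=[0.05]))
```

When every coefficient is zero the hazard equals the baseline, which is a quick sanity check on any implementation of this form.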

The matrix ${Z}_{ky}$ is used to measure information about failure history in order to account for the dependencies that exist across these failures. Moreover, the matrix ${X}_{k}$ consists of information that is related to a specific failure. As a result, through the completion of this model, it is assumed that failure represents the unit of observation and it is possible (even permissible) to introduce dummy variables as needed. Within the condition of this analysis, ${\lambda}_{0}$ (

Considering the Series Hazard Model, the forecast predictor is based on the sum of logs to the power of 1 through 5 and is the absolute value of the total for the countries and years. For the purposes of this study, the Gulf of Guinea comprises Equatorial Guinea, Ghana, Guinea, and Guinea Bissau. The total number of piracy attacks is found for each location for the period 2006 to 2015, and totals are found for all locations for all years. For each location, as well as the total (giving five final outputs), the number of piracy attacks for the year is multiplied by the sum of logs to the power of 1 through 5 (resulting in 5 rows for each country, as well as a sixth totals row for annual data). Each of these products is rounded to one decimal point. In order to find the predictor, the log is found for each total for all 5 data sets, resulting in 60 logs. Once all of the logs are determined, they are totalled, and then the ln of the total is found. This final ln is the predictor that attacks will occur. The ln is rounded to the nearest whole number and is used to forecast anticipated future attacks. It is assumed that piracy attacks will increase or decrease by the predictor between 2016 and 2030. For 2016, the values are based on 2015, whereas for 2017 and after, the values are based on the previous forecasted attacks, as shown in the table. The table constructed to show the forecasts covers all 5 scenarios (Equatorial Guinea, Ghana, Guinea, Guinea Bissau, and the total of the preceding four areas). Therefore, the first column will contain the year being considered. The second through sixth columns will have the forecasted attacks. The final (seventh) column will have the average number of forecasted attacks for the year.

The steps undertaken to conduct the Series Hazard Model are summarised as follows:

1) Construct a table with seven columns. The first, second, fourth, and fifth columns contain the FPR, TPR, p-Pred, and AUC (respectively) from the ROC curve information. The third column is the sum of FPR and TPR. The sixth column is the result of Equation (7), whereas the seventh column is the log of the results. The final row in the table will have the sum of logs.

2) The sum of logs can be considered by the power of 1 through 5 in order to develop the forecast predictor. The four countries under consideration are Equatorial Guinea, Ghana, Guinea and Guinea Bissau. The number of piracy attacks is determined for each country as well as the total pirate attacks which occurred for the considered time period. The piracy attack values are multiplied by the sum of logs to the power of 1 through 5 and rounded to one decimal point, then summed to reach a final value for each location (including the total). Next, the log is found of these values and then summed. The final value is the ln of the sum of the logs, which is then rounded to the nearest number to obtain the predictor of forecasted attacks.

7. Bayesian Networks (BN’s) Technique for Maritime Risk Analysis

[30] [31] [32] all suggest that the Bayesian Network (BN) is an effective risk modelling tool that can be used for maritime transportation analysis. BNs have numerous favourable features, such as the ability to include situational factors. Using these situational factors, BNs allow the occurrence of consequences to be contextualized, and these factors are commonly represented by observable aspects. BNs also allow for a sensitivity analysis because different evidence types can be integrated through different probability types. [33] suggest that BNs are graphical models based on probabilities, as shown below:

$P\left(V\right)={\displaystyle {\prod}_{i=1}^{n}P\left({V}_{i}|Pa\left({V}_{i}\right)\right)}$ (9)

Equation (9): Bayesian Network Modelling Equation
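The factorisation in Equation (9) can be illustrated with a very small network. The sketch below is not the network used in this study: the two variables ("patrol" and "attack"), the conditional probability tables, and the function name `joint_probability` are all made up to show how the joint probability is obtained by multiplying each variable's probability given its parents.

```python
def joint_probability(assignment, parents, cpts):
    """Multiply P(V_i | Pa(V_i)) over all variables, as in Equation (9).

    assignment : dict mapping each variable name to its value.
    parents    : dict mapping each variable name to a tuple of parent names.
    cpts       : dict mapping each variable name to a table indexed first by
                 the tuple of parent values and then by the variable's value.
    """
    probability = 1.0
    for variable, value in assignment.items():
        parent_values = tuple(assignment[p] for p in parents[variable])
        probability *= cpts[variable][parent_values][value]
    return probability


# Hypothetical two-node network: patrol -> attack (numbers are invented).
PARENTS = {"patrol": (), "attack": ("patrol",)}
CPTS = {
    "patrol": {(): {True: 0.3, False: 0.7}},
    "attack": {
        (True,): {True: 0.1, False: 0.9},
        (False,): {True: 0.4, False: 0.6},
    },
}

if __name__ == "__main__":
    p = joint_probability({"patrol": False, "attack": True}, PARENTS, CPTS)
    print(p)  # P(patrol=False) * P(attack=True | patrol=False)
```

Summing the joint probability over every combination of values should give 1, which is a useful check that the tables are consistent.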

In the final stage of this analysis, the probability is the p-Pred, which is multiplied by the values (total number of occurrences). For the constructed table, the individual rows are summed and the log of the sum, log(sum), is then found. If log(sum) is between −0.15 and 0.20, the value is considered reliable and the predictor can be used as the forecast predictor without adjustment. If the value is outside of this range, the log of log(sum) is found and added to the predictor (used in the previous model, which considers the power of 1 through 5) used for the forecast model.

Considering the BN model, the forecast predictor is based on the sum of logs to the power of 1 through 5 and is the absolute value of the total for the countries and years. If the criterion (between −0.15 and 0.20) is met, no further action is taken on the predictor. However, if the criterion is not met, the log of log(sum) is added to the predictor. For the purposes of this study, the Gulf of Guinea comprises Equatorial Guinea, Ghana, Guinea, and Guinea Bissau. The total number of piracy attacks is found for each location for the period 2006 to 2015, and totals are found for all locations for all years. For each location, as well as the total (giving five final outputs), the number of piracy attacks for the year is multiplied by the sum of logs to the power of 1 through 5 (resulting in 5 rows for each country, as well as a sixth totals row for annual data). Each of these products is rounded to one decimal point. In order to find the predictor, the log is found for each total for all 5 data sets, resulting in 60 logs. Once all of the logs are determined, they are totalled, and then the ln of the total is found. This final ln is the predictor that attacks will occur. The ln is rounded to the nearest whole number and is used to forecast anticipated future attacks. It is assumed that piracy attacks will increase or decrease by the predictor between 2016 and 2030. For 2016, the values are based on 2015, whereas for 2017 and after, the values are based on the previous forecasted attacks, as shown in the table. The table constructed to show the forecasts covers all 5 scenarios (Equatorial Guinea, Ghana, Guinea, Guinea Bissau, and the total of the preceding four areas). Therefore, the first column will contain the year being considered. The second through sixth columns will have the forecasted attacks. The final (seventh) column will have the average number of forecasted attacks for the year.

The steps undertaken to conduct the BN model are summarised as follows:

1) Construct a table which multiplies the p-Pred from the ROC curve by the values. These values are rounded to one decimal point and then summed. All rows are summed then logged. If the log is within the range of −0.15 to 0.20, nothing is done to the predictor. If the log is outside of this range, the log (sum) is then logged again, which will be added to the predictor.

2) The sum of logs can be considered by the power of 1 through 5 in order to develop the forecast predictor. If the result of the previous step is outside of the stated criteria, it is added to the forecast predictor. The total number of pirate attacks over the time period is then determined for each of the countries being considered. The piracy attack values are multiplied by the sum of logs to the power of 1 through 5 and rounded to one decimal point, then summed to reach a final value for each location (including the total). Next, the log is found of these values and then summed. The final value is the ln of the sum of the logs, which is then rounded to the nearest whole number to obtain the predictor of forecasted attacks.
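The reliability check in step 1 can be sketched as below. This is one possible reading, with several assumptions: the function name `bn_adjustment` and the example inputs are hypothetical, "log" is taken as base 10, and the range −0.15 to 0.20 is treated as inclusive. The source does not say what happens when log(sum) is zero or negative but outside the range, where a second log is undefined, so the sketch raises an error in that case.

```python
import math


def bn_adjustment(p_preds, values, lower=-0.15, upper=0.20):
    """Return the amount to add to the predictor after the BN check.

    p_preds : p-Pred values taken from the ROC curve output.
    values  : the matching occurrence counts from the raw data.
    Returns 0.0 when log(sum) lies inside [lower, upper]; otherwise returns
    log(log(sum)), which the text says is added to the predictor.
    """
    if len(p_preds) != len(values):
        raise ValueError("p_preds and values must have the same length")

    # Multiply, round each product to one decimal, then sum all rows.
    products = [round(p * v, 1) for p, v in zip(p_preds, values)]
    total = sum(products)
    if total <= 0:
        raise ValueError("log undefined for a non-positive sum")

    log_sum = math.log10(total)  # ASSUMPTION: base-10 log
    if lower <= log_sum <= upper:
        return 0.0
    if log_sum <= 0:
        raise ValueError("second log undefined; not covered by the source")
    return math.log10(log_sum)


if __name__ == "__main__":
    # Made-up inputs for illustration only.
    print(bn_adjustment([0.5, 0.5], [1, 1]))      # inside the range
    print(bn_adjustment([0.5, 0.5], [100, 100]))  # outside the range
```

In the first call the sum is 1, so log(sum) is 0 and no adjustment is made; in the second the sum is 100, log(sum) is 2, and the adjustment is the log of 2.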

8. Methodological Assumptions, Limitations, and Delimitations

Methodological assumptions are those things that are accepted as true or plausible by researchers and colleagues. Therefore, methodological assumptions refer to the assumption that certain aspects of the study are true based on the population, statistical test, design, or methodology. Commonly, assumptions involve honesty and truthful responses in survey studies. For this specific study, it is assumed that the data obtained was recorded accurately and without bias. It is also assumed that the data shown in this analysis is complete. This study is limited to three statistical models. However, the selected models were deemed to be the most flexible in order to conduct the forecast for anticipated piracy attacks. The study is limited to a specific time period (2006 to 2015). The study is limited to determining the correlation and forecast of piracy attacks, not causation.

9. Conclusions

In the Gulf of Aden, piracy and armed robbery are on the decline, but there is an increase in the Gulf of Guinea. Records from the Piracy Reporting Centre of the International Maritime Bureau (IMB) (2015) indicate that 58 pirate attacks occurred, with ten being hijacks. In the first quarter of 2013, 11 attacks were reported, as against 27 attacks in the year 2012, almost thrice the number in 2011, with Nigeria being the country affected the most. However, other coastal countries have also suffered from the surge of piracy in the Gulf of Guinea. These comprise Ghana, Togo, Benin and Bakassi. A new focus has been Ghanaian waters, where pirate activities have increased amid the oil discoveries in these waters, which have the capacity to affect maritime transportation and maritime security [34].

The research methodology is designed to provide a process for conducting the study. This includes steps and protocols for data collection, processing, and analysis. Based on this knowledge, the research methodology chapter is divided into seven sections. The chapter considers three separate models: 1) ordinal logistic regression; 2) series hazard modelling for maritime transport risk analysis; and 3) Bayesian networks (BN’s) technique for maritime risk analysis to analyse one set of data. The goal of the chapter was to explore the methodology used so that future researchers could reconstruct the study as desired. The study was conducted as a panel analysis based on a longitudinal quasi-experimental research design and focused on piracy attacks that occurred between 2006 and 2015.

For the ordinal logistic regression analysis, the researcher determined the range size based on the total number of occurrences in order to determine the coefficient rate (Px through Py), allowing for equal numbers in each range, counted the number of occurrences for all years and countries within a specified range, and determined the ratio of occurrences divided by total occurrences based on the range being considered. The assumptions of this model were identified as 0 through 19 and were determined through the ratio of occurrences, 1 − ratio of occurrences, and the log. The ratio of occurrences, which is prob(event), is the numerator. The value found for 1 − ratio of occurrences is the denominator. Once the numerator is divided by the denominator, the log of the result is found. The logit was determined as the ln of the result of dividing the numerator by the denominator (as in the previous step). β is the log of that result (the output of the preceding step) and X is the count from the second step. The variable is found by multiplying β and X. For the first variable (0) there is neither a logit nor an X variable, and βX is assumed to be β. Next, the raw data for the time period and countries was entered into the RealStats add-in for Microsoft Excel to conduct the binary logistic regression and the multimodal logistic regression in order to obtain the ROC curve, which provides information regarding the p-Pred, failure rate, success rate, failure cumulative rate, success cumulative rate, FPR, TPR, and AUC. This information allowed for the forecast. In order to do this, the predictor of occurrences must be determined. First, the total number of pirate attacks over the time period is found for each of Equatorial Guinea, Ghana, Guinea, and Guinea Bissau. The piracy attack values are multiplied by the β value and rounded to one decimal point, then summed to reach a final value for each location (including the total). Next, the log is found of these values and then summed. The final value is the ln of the sum of the logs, which is then rounded to the nearest whole number to obtain the predictor of forecasted attacks.

For the series hazard model, a table was constructed with seven columns where the first, second, fourth, and fifth columns contain the FPR, TPR, p-Pred, and AUC (respectively) from the ROC curve information. The third column is the sum of FPR and TPR. The sixth column is the result of Equation (7), whereas the seventh column is the log of the results. The final row in the table has the sum of logs. The sum of logs can be considered by the power of 1 through 5 in order to develop the forecast predictor. For the selected Gulf of Guinea (GoG) countries, the number of pirate attacks is found for each country, together with the total for all four countries over the time period. The piracy attack values are multiplied by the sum of logs to the power of 1 through 5 and rounded to one decimal point, then summed to reach a final value for each location (including the total). Next, the log is found of these values and then summed. The final value is the ln of the sum of the logs, which is then rounded to the nearest whole number to obtain the predictor of forecasted attacks.

For the BN model, a table was constructed where the p-Pred (found from the ROC curve) was multiplied by the values in the raw data. These values are rounded to one decimal point and then summed. All rows are summed then logged. If the log is within the range of −0.15 to 0.20, nothing is done to the predictor. If the log is outside of this range, the log (sum) is then logged again, and the result is added to the predictor. The sum of logs can be considered by the power of 1 through 5 in order to develop the forecast predictor. If the result of the previous step is outside of the stated criteria, it is added to the forecast predictor. Next, the total number of pirate attacks over the time period under consideration is found for each of the four GoG countries. The piracy attack values are multiplied by the sum of logs to the power of 1 through 5 and rounded to one decimal point, then summed to reach a final value for each location (including the total). Next, the log is found of these values and then summed. The final value is the ln of the sum of the logs, which is then rounded to the nearest whole number to obtain the predictor of forecasted attacks. The goal is to determine the difference in forecasts based on each of the different models.

References

[1] Treves, T. (2009) Piracy, Law of the Sea, and Use of Force: Developments off the Coast of Somalia. European Journal of International Law, 20, 399-414.

https://doi.org/10.1093/ejil/chp027

[2] Holmgren, A. (2013) Piracy’s Persistence in the Gulf of Guinea.

http://www.africandefence.net/piracys-persistence-in-the-gulf-of-guinea/

[3] Barrios, C. (2013) Fighting Piracy in the Gulf of Guinea. European Union Institute for Security Studies, May 2013.

[4] Creswell, J.W. (2013) Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. SAGE Publications.

https://books.google.com/books?hl=en&lr=&id=EbogAQAAQBAJ&oi=fnd&pg=PP1&dq=Creswell,+J.W.+

[5] Leedy, P. and Ormrod, J. (2001) Practical Research: Planning and Design. 7th Edition, Merrill Prentice Hall and SAGE Publications, Upper Saddle River, NJ and Thousand Oaks, CA.

[6] Trochim, W.M.K. and Land, D.A. (1982) Designing Designs for Research. The Researcher, 1, 1-6.

[7] Shadish, W.R., Cook, T.D. and Campbell, D.T. (2002) Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton, Mifflin and Company.

[8] Carlson, N.R., Heth, D., Miller, H., Donahoe, J. and Martin, G.N. (2009) Psychology: The Science of Behavior. 2nd Edition, Allyn & Bacon, Boston.

[9] Van der Krieke, L., Blaauw, F.J., Emerencia, A.C., Schenk, H.M., Slaets, J.P., Bos, E.H., et al. (2017) Temporal Dynamics of Health and Well-Being: A Crowdsourcing Approach to Momentary Assessments and Automated Generation of Personalized Feedback. Psychosomatic Medicine, 79, 213-223.

[10] DiNardo, J. (2010) Natural Experiments and Quasi-Natural Experiments. In: Durlauf, S.N. and Blume, L.E., Eds., Microeconometrics, Palgrave Macmillan, UK, 139-153.

https://doi.org/10.1057/9780230280816_18

[11] Rossi, P.H., Lipsey, M.W. and Freeman, H.E. (2003) Evaluation: A Systematic Approach. 4th Edition, Sage Publications, Inc., Thousand Oaks, CA.

[12] Campbell, D.T. and Stanley, J.C. (2015) Experimental and Quasi-Experimental Designs for Research. Ravenio Books.

[13] Harmon, R.J., Morgan, G.A., Guner, J.A. and Harmon, R.J. (2000) Quasi-Experimental Designs. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 794-796.

https://doi.org/10.1097/00004583-200006000-00020

[14] Campbell, D.T. and Overman, E.S. (1988) Methodology and Epistemology for Social Science: Selected Papers. University of Chicago Press, Chicago.

[15] DeRue, D.S., Nahrgang, J.D., Hollenbeck, J.R. and Workman, K. (2012) A Quasi-Experimental Study of After-Event Reviews and Leadership Development. Journal of Applied Psychology, 97, 997-1015.

https://doi.org/10.1037/a0028244

[16] Robson, L.S., Shannon, H.S., Goldenhar, L.M. and Hale, A.R. (2001) Quasi-Experimental and Experimental Designs: More Powerful Evaluation Designs. Canada.

[17] Meyer, B.D. (1995) Natural and Quasi-Experiments in Economics. Journal of Business & Economic Statistics, 13, 151-161.

https://doi.org/10.1080/07350015.1995.10524589

[18] Brooks, C. (2014) Introductory Econometrics for Finance. Cambridge University Press, Cambridge.

[19] Davies, A. and Lahiri, K. (1995) A New Framework for Testing Rationality and Measuring Aggregate Shocks Using Panel Data. Journal of Econometrics, 68, 205-227.

[20] Hsiao, C., Pesaran, M., Lahiri, K., Lee, L.-F., Hsiao, C., Pesaran, M., et al. (2010) Analysis of Panels and Limited Dependent Variable Models. Cambridge University Press, Cambridge.

[21] Glonek, G.F. and McCullagh, P. (1995) Multivariate Logistic Models. Journal of the Royal Statistical Society. Series B: Methodological, 57, 533-546.

[22] Greene, W.H. (2012) Econometric Analysis. 7th Edition, Pearson Education, Inc., Boston, MA.

[23] Harrell Jr., F.E. (2015) Ordinal Logistic Regression. In: Harrell Jr., F.E., Regression Modeling Strategies, Springer Series in Statistics, Springer International Publishing, Switzerland, 311-325.

https://doi.org/10.1007/978-3-319-19425-7_13

[24] Kleinbaum, D.G. and Klein, M. (2010) Ordinal Logistic Regression. In: Kleinbaum, D.G. and Klein, M., Logistic Regression, Statistics for Biology and Health, Springer, New York, 463-488.

https://doi.org/10.1007/978-1-4419-1742-3_13

[25] McCullagh, P. (1980) Regression Models for Ordinal Data. Journal of the Royal Statistical Society. Series B: Methodological, 42, 109-142.

[26] Jiang, B. (2014) Maritime Piracy in Malacca Strait and South China Sea: Testing the Deterrence and Reactance Models. University of Pennsylvania and University of Maryland.

[27] Dugan, L. (2011) The Series Hazard Model: An Alternative to Time Series for Event Data. Journal of Quantitative Criminology, 27, 379-402.

https://doi.org/10.1007/s10940-010-9127-1

[28] Chen, X., Minacapelli, L., Fishman, S., Orehek, E., Dechesne, M., Segal, E. and Kruglanski, A.W. (2008) Impact of Political and Military Interventions on Terrorist Activities in Gaza and the West Bank: A Hazard Modeling Analysis.

[29] LaFree, G., Dugan, L. and Korte, R. (2009) Is Counter Terrorism Counterproductive? Northern Ireland 1969-1992. Criminology, 47, 501-530.

[30] Aven, T. (2008) Risk Analysis: Assessing Uncertainties beyond Expected Values and Probabilities. John Wiley & Sons, Chichester, UK.

https://doi.org/10.1002/9780470694435

[31] Fenton, N. and Neil, M. (2012) Risk Assessment and Decision Analysis with Bayesian Networks. CRC Press, Boca Raton, FL.

[32] Goerlandt, F. and Montewka, J. (2015) A Framework for Risk Analysis of Maritime Transportation Systems: A Case Study for Oil Spill from Tankers in a Ship-Ship Collision. Safety Science, 76, 42-66.

[33] Koller, D. and Friedman, N. (2009) Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge, MA.

[34] Bowden, A. (2010) The Economic Cost of Maritime Piracy. One Earth Future Foundation.