An Econometric Time Series GDP Model Analysis: Statistical Evidences and Investigations


1. Introduction

Gross domestic product (GDP), a measure of an economy’s performance, is the value of all final goods and services produced within a country in a year. GDP data are widely used in time series modeling and analysis, and they serve a wide variety of needs in industry, finance, research institutions, and other fields. Economic forecasting is an essential component of a country’s economic decision-making process, and policy makers need GDP forecasts to anticipate economic conditions. For these reasons, this paper investigates the performance of a GDP model for the Sudan. It aims at analyzing a time series econometric model of the macroeconomic variable GDP in the country.

Time series models and analysis have been discussed in [1] [2] [3] [4] [5] and many others. Time series analysis aims at identifying patterns and trends in data, as well as at modeling and forecasting the data. Two principal approaches to time series analysis are adopted, working in either the time domain or the frequency domain. Several procedures are used to analyze data within these domains. A common and useful technique is the Box-Jenkins ARIMA method [1] , which can be used for univariate or multivariate data sets. The ARIMA technique uses moving averages (MA), smoothing, and regression methods to detect and remove autocorrelation in the data. Tools of time series analysis have been discussed intensively by [6] .

Many statistical tests are used in time series modeling to check whether a series is stationary or integrated. Once the series is stationary, the Box-Jenkins procedure is used to determine the ARMA orders, and the OLS method is used to estimate the model parameters. The techniques that are useful for this analysis are identified in the following sections.

This paper is organized as follows: Section 2 is devoted to the proposed model of the study. Background on data collection and methodology is presented in Section 3, Section 4 is devoted to data analysis and results, which are discussed in Section 5, and a brief conclusion is given in Section 6.

2. The Proposed Model

The methodology of time series analysis consists of two steps: constructing a data model for the time series, and forecasting its future values.

For a regular time series pattern, the value of the series, Y_{t}, should be a function of previous values. If Y is the target value that we are trying to model and predict, and Y_{t} is the value of Y at time t, then the goal is to build a model of the type:

${Y}_{t}=f\left({Y}_{t-1},{Y}_{t-2},{Y}_{t-3},\dots ,{Y}_{t-n}\right)+{e}_{t}$ (1)

where Y_{t−1} is the previous observation of Y, Y_{t−2} is the value two observations ago, etc., and e_{t} (a random shock) represents noise that does not follow a predictable pattern. Values occurring prior to the current observation are called lag values. In a time series with a repeating pattern, the value of Y_{t} is usually highly correlated with Y_{t−cycle}. Thus, the goal of constructing a time series model is to make the error between the predicted value of the target variable and the actual value as small as possible.
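To make the role of lag values in Equation (1) concrete, the following minimal sketch arranges a series into rows of lagged predictors and their targets. The function name `make_lagged` and the toy numbers are illustrative assumptions, not part of the study or its data.

```python
def make_lagged(series, n_lags):
    """Return (X, y): each row of X holds [Y_{t-1}, ..., Y_{t-n}], y holds Y_t."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X, y = [], []
    for t in range(n_lags, len(series)):
        # Most recent lag first: Y_{t-1}, Y_{t-2}, ..., Y_{t-n}
        X.append([series[t - k] for k in range(1, n_lags + 1)])
        y.append(series[t])
    return X, y


# Hypothetical toy series (not the Sudan GDP data)
toy = [10.0, 11.0, 13.0, 16.0, 20.0]
X, y = make_lagged(toy, n_lags=2)
for row, target in zip(X, y):
    print(row, "->", target)
```

Any function f in Equation (1) can then be fitted to these rows; the first n observations are lost because they have no complete set of lags.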

Consider a time series of data Y_{t}. The ARMA model consists of two parts, an autoregressive (AR) part and a moving average (MA) part. Following [7] , the AR(p) model is written in the form:

${Y}_{t}=c+\underset{i=1}{\overset{p}{{\displaystyle \sum}}}{\phi}_{i}{Y}_{t-i}+{\epsilon}_{t}$ (2)

where ${\phi}_{1},\dots ,{\phi}_{p}$ are the model parameters, c is a constant (which may be omitted for simplicity) and ${\epsilon}_{t}$ is an error term. The MA (q) notation stands for the moving average model of order q:

${Y}_{t}={\epsilon}_{t}+\underset{i=1}{\overset{q}{{\displaystyle \sum}}}{\theta}_{i}{\epsilon}_{t-i}$ (3)

where θ_{1}, ..., θ_{q} are the parameters of the model and ε_{t}, ε_{t−1}, ... are the error terms.

The notation ARMA (p, q) refers to the model with p autoregressive terms and q moving average terms. This model contains the AR (p) and MA (q) models,

${Y}_{t}={\epsilon}_{t}+\underset{i=1}{\overset{p}{{\displaystyle \sum}}}{\phi}_{i}{Y}_{t-i}+\underset{i=1}{\overset{q}{{\displaystyle \sum}}}{\theta}_{i}{\epsilon}_{t-i}$ (4)

where the error terms ε_{t} are assumed to be independent, identically distributed random variables with mean zero, ε_{t} ∼ N (0, σ^{2}), where σ^{2} is the variance.
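The recursion in Equation (4) can be simulated directly. The sketch below is one possible illustration under stated assumptions: Gaussian shocks, zero starting values, and a burn-in period to reduce the influence of those starting values. The function name `simulate_arma` and its parameters are hypothetical, and the code does not check that the chosen coefficients give a stationary process.

```python
import random


def simulate_arma(phi, theta, n, sigma=1.0, seed=0, burn_in=200):
    """Simulate Y_t = eps_t + sum_i phi_i Y_{t-i} + sum_j theta_j eps_{t-j}."""
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    total = n + burn_in
    eps = [rng.gauss(0.0, sigma) for _ in range(total)]
    y = [0.0] * total
    for t in range(total):
        # Terms that would reach before the start of the sample are dropped
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        y[t] = eps[t] + ar + ma
    return y[burn_in:]


# Example: ARMA(1, 1) with illustrative coefficients
path = simulate_arma(phi=[0.5], theta=[0.3], n=100)
print(len(path))
```

With empty `phi` and `theta` the function returns pure white noise, which corresponds to the ARMA (0, 0) case.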

The process $\left({Y}_{t}\right)$ is said to be ARIMA (p, d, q) if:

${\left(1-L\right)}^{d}{\phi}^{*}\left(L\right){Y}_{t}=c+\theta \left(L\right){\epsilon}_{t}$ (5)

where L is the lag operator and ${\phi}^{*}\left(L\right)$ is defined by

$\phi \left(L\right)={\left(1-L\right)}^{d}{\phi}^{*}\left(L\right)$ , (6)

with ${\phi}^{*}\left(z\right)\ne 0$ for all $\left|z\right|\le 1$ , and $\theta \left(L\right)$ satisfies $\theta \left(z\right)\ne 0$ for all $\left|z\right|\le 1$ .

The process $\left({Y}_{t}\right)$ is stationary if and only if d = 0, in which case it reduces to an ARMA (p, q) process:

$\phi \left(L\right){Y}_{t}=c+\theta \left(L\right){\epsilon}_{t}$ (7)
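The differencing operator raised to the power d in Equation (5) simply means differencing the series d times. A minimal sketch, with a hypothetical helper `difference` and a toy quadratic trend chosen only for illustration:

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass maps Y_t to Y_t - Y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# Toy series with a quadratic trend: 0, 1, 4, 9, 16, 25
quad = [t * t for t in range(6)]
print(difference(quad, 1))  # linear trend remains
print(difference(quad, 2))  # constant: the trend has been removed
```

Each pass shortens the series by one observation, which matters for short annual series such as the one studied here.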

The Box-Jenkins methodology [1] is a five-step technique for identifying, selecting, and assessing models for a type of time series data. These steps are:

1) Stationarity. A time series is said to be stationary if both its mean and its variance remain constant through time. Classical Box-Jenkins ARMA models only work satisfactorily with stationary time series.

2) Identify a (stationary) conditional mean model for underlying data. The sample autocorrelation functions (ACF) and partial autocorrelation functions (PACF) can help with this selection. For an autoregressive (AR) process, the sample ACF decays gradually, but the sample PACF cuts off after a few lags. Conversely, for a moving average process (MA), the sample ACF cuts off after a few lags, but the sample PACF decays gradually. If both the ACF and PACF decay gradually, consider an Auto-Regressive Moving Average (ARMA) model.

3) Specify the model and estimate the required parameters.

4) Check the model for goodness of fit, using measures such as the proportion of variance explained by the model or the correlation between actual and predicted values. Residuals should be uncorrelated, homoscedastic, and normally distributed with constant mean and variance.

5) Forecasting: The model can be used to forecast or generate simulations over a period of time after checking its goodness of fit and its forecasting ability.
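The identification step in 2) relies on the sample ACF and PACF. The sketch below shows one way these could be computed. The names `sample_acf` and `sample_pacf` are illustrative, the PACF is obtained with the Durbin-Levinson recursion (a common choice, not necessarily what the software used in this study does), and the toy series is hypothetical.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (biased estimator, r_0 = 1)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = [1.0]
    prev = []  # coefficients of the AR(k-1) fit: phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the remaining coefficients for the AR(k) fit
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


toy = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values
print(sample_acf(toy, 2))
print(sample_pacf(toy, 2))
```

Plotting these values against the lag gives the decay and cut-off patterns described in step 2).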

When the ARIMA (auto-regressive, integrated, moving average) method is applied iteratively to find the best fit to time series data, the auto-regressive component (AR) is designated as p, the integrated component (I) as d, and the moving average component (MA) as q. The AR component represents the effects of previous observations. The I component represents trends, including seasonality, which are removed by differencing. The MA component represents the effects of previous random shocks (or errors). To fit an ARIMA model to a time series, the order of each component must be selected. Usually a small integer value (0, 1, or 2) is chosen for each component.
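Once the orders are chosen, the parameters have to be estimated. As a simple illustration of the OLS estimation mentioned in the Introduction, the sketch below fits an AR(1) model with a constant by regressing Y_{t} on Y_{t−1}. The function name `fit_ar1_ols` is hypothetical, and the noise-free toy series is constructed only so that the true values (c = 2, φ = 0.5) are known.

```python
def fit_ar1_ols(y):
    """OLS estimates (c, phi) for Y_t = c + phi * Y_{t-1} + eps_t."""
    x, z = y[:-1], y[1:]  # lagged values and current values
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("lagged series is constant; slope is undefined")
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    phi = sxz / sxx
    c = mean_z - phi * mean_x
    return c, phi


# Noise-free toy series generated from Y_t = 2 + 0.5 * Y_{t-1}
series = [0.0]
for _ in range(10):
    series.append(2.0 + 0.5 * series[-1])

c_hat, phi_hat = fit_ar1_ols(series)
print(c_hat, phi_hat)
```

Models with MA terms cannot be estimated by a single regression of this kind, since the past shocks are not observed; statistical packages typically use iterative methods for those cases.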

3. Data and Methodology of Collection

GDP equals the total expenditure on all final goods and services produced within the country in a stipulated period of time. It is the sum of gross value added by all resident producers in the economy, plus any product taxes and minus any subsidies not included in the value of the products [8] .

The Sudan Central Bureau of Statistics (CBS) issues an annual report that includes all national accounts, while the Central Bank of the Sudan [9] also issues its annual economic records. Official national accounts data are collected annually and reported by countries to the United Nations Statistics Division [8] . If a full set of official data is not reported, estimation procedures are employed to obtain estimates for the entire time series.

The annual percentage growth rate of GDP at market prices is based on constant local currency. Aggregates are based on constant U.S. dollars. As reported by the World Bank, GDP in Sudan was worth 97.156 billion US dollars in 2015, representing 0.14 percent of the world economy. GDP in Sudan averaged 18.774 billion US dollars from 1960 until 2015, reaching an all-time high of 97.156 billion US dollars in 2015 and a record low of 1.307 billion US dollars in 1960.

According to [10] , Sudan’s real GDP growth is predicted to recede slightly that year to 2.7% because of fiscal consolidation and is projected to reach 3.8% in 2015. However, according to the African Economic Outlook (AEO) report, the country’s real GDP grew by 3.6% in 2014, up from 1.4% in 2012, owing to increased agriculture, oil, gold and transit revenues. In 2005, the agricultural sector contributed 33.2% of GDP, industry about 22%, and the services sector 44.8%, while in 2014 the agricultural sector contributed about 27.5%, industry around 20.7% and the services sector 51.8%. The reports also stated that inflation remained high at 36.5%, rose to 36.9% in 2014 and dropped to 16.9% in 2015. The trade balance has been negative since 1985 except for the year 2000 [9] .

The resulting high external and internal deficits, coupled with the sustained American sanctions and the security concerns in the country, affected the economic situation. This led to measures to supplement the budget, including devaluation of the currency by 29% and removal of fuel subsidies worth SDG 3.6 billion (Sudanese pounds), about 1.2% of GDP, which resulted in riots. Economic linkages and value addition were weakened during the period of oil-driven growth (1999-2011), mainly in agriculture (which provided 47.6% of total jobs in 2011). The major field of government expenditure might be the security services, though no official figures were published.

Also, the high taxes along the supply chains and the recent increase in tariffs on imported inputs in addition to the high costs of energy and infrastructure services raised domestic resource costs and reduced domestic value addition. During 2001-2007, 41% of all factories closed because of intense competition.

After production began in the oil fields of southern Sudan from 1998 onward, the economy developed rapidly, with growth reaching 8% per annum. However, the fall in oil revenues after the secession of what is now South Sudan in 2011 has greatly affected GDP growth, which was negative (−6%) in 2013.

GDP per capita at current prices is estimated at US$1985 for 2014, while GDP at purchasing power parity (PPP) is estimated at 168 billion international dollars in 2015 and GDP per capita at PPP at 4522 international dollars for 2014 (see [9] and [8] ).

Sudan’s trade suffers from several difficulties, despite persistent efforts by the government to liberalize trade. Import restrictions, discriminatory taxes, delays in customs clearance and non-transparent regulations are some of the factors impeding Sudanese trade.

Some chief import commodities of Sudan are manufactured goods, transport equipment, medicines and chemicals. According to CIA World Factbook reports for 2009, Sudan’s main export partners by share of its total trade are the UAE (32%), China (16%) and Saudi Arabia (15.5%), while its main import partners are China (26.3%), the UAE (10%), India (9%), Egypt (5.6%) and Turkey (4.7%).

4. Data Analysis

Time series are analyzed in order to understand the underlying structure and mechanism that produce the observations. In this section, the GDP statistics of Sudan, which include current and constant prices in million US$ for the period (1960-2015), are investigated.

Figures 1-3 show line graphs of GDP levels in the period under consideration. Overall, the line graphs show a clear dominance of a long-term upward trend, suggesting a time series that is non-stationary in levels. Descriptive statistics for GDP are given in Table 1. One-way ANOVA summaries for the same model are shown in Table 2 and Table 3, the classical regression model is summarized in Table 4, its coefficients are presented in Table 5, and the model summary and parameter estimates for the linear equation are shown in Table 6. Significance tests are also summarized for the model. The estimates obtained with the Time Series Modeler are based on a 5% level of significance. The R-square value is over 61% for the linear, logarithmic, quadratic and exponential methods, and the F value of the regression is highly significant (Table 7 and Figure 4). The model described in Table 8, ARIMA (0, 0, 0), was then reached, and its fit is shown in Table 9. The model statistics are shown in Table 10, accompanied by a descriptive graph (Figure 5), which suggests an upward trend for the series.

Figure 1. GDP 1960-2015 in billion US$. Source: World Bank.

Figure 2. Data Explore Chart Sudan GDP in billion US$ (1960-2015).

Table 1. Descriptive statistics.

Figure 3. Linear GDP model.

Figure 4. Selected GDP models.

Table 2. One-way ANOVA for GDP.

GDP in billion US$.

Table 3. ANOVA.

(a) Predictors: (Constant), year; (b) Dependent Variable: GDP in billion US$.

Table 4. Classical regression model summary.

(a) Predictors: (Constant), year.

Table 5. Model coefficients.

(a) Dependent Variable: GDP in billion US$.

Table 6. Model summary and parameter estimates.

Dependent Variable: GDP in billion US$. The independent variable is year.

Table 7. Model summary and parameter estimates.

Dependent Variable: GDP in billion US$. The independent variable is year.

Table 8. Model description.

Table 9. Model fit.

Table 10. Model statistics.

Figure 5. GDP model.

When building a time series model, it is necessary to include lag values that have large positive or large negative autocorrelations. The partial autocorrelation is the autocorrelation of time series observations separated by a lag of k time units with the effects of the intervening observations eliminated. Autocorrelation and partial autocorrelation tables are also provided for the residuals (errors) between the actual and predicted values of the time series. The proportion of variance explained by the model is the best single measure of how well the predicted values match the original values. If the predicted values exactly matched the original values, the model would explain 100% of the variance. In practice this is rarely the case; here the model explains 61.6% of the variance, according to the R-square value, as seen in Table 3, Table 6 and Table 7.
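The proportion of variance explained can be computed from the actual and predicted values as one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch follows; the function name `r_squared` and the toy numbers are illustrative and are not the study data.

```python
def r_squared(actual, predicted):
    """Proportion of variance explained: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total


actual = [1.0, 2.0, 3.0]
print(r_squared(actual, [1.0, 2.0, 3.0]))  # perfect predictions
print(r_squared(actual, [2.0, 2.0, 2.0]))  # predicting the mean everywhere
```

A model that always predicts the sample mean explains none of the variance, while exact predictions explain all of it.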

Examining the autocorrelation table shown in Table 11, we see that the highest autocorrelation is −0.313 which occurs with a lag of 15. Hence we want to be sure to include lag values up to 15 when building the model.

The autocorrelation (ACF) tables (Table 11 and Figure 6) and partial autocorrelation (PACF) tables (Table 12 and Figure 7) provide valuable information about the significance of the lag variables. An autocorrelation is the correlation between the target variable (GDP) and lag values of the same variable. Correlation values range from −1 to +1. A value of +1 indicates that the two variables move together perfectly; a value of −1 indicates that they move in opposite directions (see the results of Table 12). The second column of the autocorrelation table shows the standard error of the autocorrelation, followed by the t-statistic in the third column. The right side of the autocorrelation table is a bar chart with asterisks used to indicate positive or negative correlations to the right or left of the centerline. The dots shown in the chart mark the points two standard deviations from zero. If the autocorrelation bar is longer than the dot marker (that is, it covers it), then the autocorrelation should be considered significant. In this model, significant autocorrelations occurred for all lags except lag 15. Based on the assumption that the series are not cross correlated and that one of the series is white noise, the cross correlations for lags from −7 to +7 are displayed in Table 13 and Figure 8. The figure shows the confidence limits to be all above zero for the GDP.
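The two-standard-deviation rule described above can be approximated numerically. The sketch below assumes the common large-sample approximation that a white-noise autocorrelation has standard error of about 1/√n, and it assumes that the chi-square-based statistic reported alongside the autocorrelations is the Ljung-Box Q. Both are assumptions about the software output, and the function names are hypothetical.

```python
import math


def white_noise_bound(n):
    """Approximate two-standard-error bound for a sample autocorrelation
    when the underlying series is white noise (standard error ~ 1/sqrt(n))."""
    return 2.0 / math.sqrt(n)


def ljung_box_q(acf_values, n):
    """Ljung-Box Q statistic for autocorrelations r_1..r_h.

    acf_values excludes lag 0; n is the number of observations.
    Under white noise, Q is approximately chi-square with h degrees of freedom.
    """
    return n * (n + 2) * sum(r * r / (n - k)
                             for k, r in enumerate(acf_values, start=1))


# Illustrative numbers only (not taken from Table 11)
print(white_noise_bound(100))
print(ljung_box_q([0.4, -0.1], n=5))
```

An autocorrelation whose absolute value exceeds the bound would be flagged as significant, and a large Q relative to the chi-square critical value suggests that the series is not white noise.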

Figure 6. Autocorrelation function.

Figure 7. Partial autocorrelation function.

Figure 8. CCF.

Table 11. Autocorrelations for GDP in billion US$.

Series: GDP in billion US$. (a) The underlying process assumed is independence (white noise); (b) Based on the asymptotic chi-square approximation.

Table 12. Partial autocorrelations.

Series: GDP in billion US$.

Table 13. Cross correlations, range of lags from −7 to 7.

Series Pair: GDP in billion US$ with year. (a) Based on the assumption that the series are not cross correlated and that one of the series is white noise.

Thus, if we rely on this information, we may conclude that we have a good fit. From Table 5, the model can be written in the following form:

$y={\beta}_{0}+{\beta}_{1}{x}_{i}$ , or $\text{GDP}=9.314+0.785{x}_{i}$ , with standard error 0.120 for ${\beta}_{1}$ .
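As a rough check on the slope, and assuming that the reported 0.120 is the usual OLS standard error of ${\beta}_{1}$ , the corresponding t-ratio follows directly from the two figures quoted above:

$t=\frac{{\beta}_{1}}{\text{se}\left({\beta}_{1}\right)}=\frac{0.785}{0.120}\approx 6.54$

A ratio of this size is well above the conventional threshold of about 2 used at the 5% level of significance.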

Based on the forecasting model results, the forecasted values for Sudan GDP (in billion US$) are 99.51 (for the year 2017), 101 (2018), 106.58 (2019) and 112.62 (for the year 2020). The annual growth rates are estimated to be about 5.3%, 5.36%, 5.52% and 5.67% for the above years respectively.
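An annual growth rate can be derived from two consecutive levels as the percentage change from one year to the next. The sketch below uses a hypothetical helper `growth_rate_pct`; applied to the forecast levels for 2018 to 2019 and for 2019 to 2020 it gives values matching the 5.52% and 5.67% quoted above.

```python
def growth_rate_pct(previous, current):
    """Year-on-year growth in percent: 100 * (current - previous) / previous."""
    return 100.0 * (current - previous) / previous


# Forecast levels in billion US$, as listed in the text
forecasts = {2018: 101.0, 2019: 106.58, 2020: 112.62}

for year in (2019, 2020):
    rate = growth_rate_pct(forecasts[year - 1], forecasts[year])
    print(year, round(rate, 2))
```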

5. Discussion

We evaluate an Autoregressive Integrated Moving Average (ARIMA) model of the GDP series using the Box-Jenkins methodology, with four different equations: linear, logarithmic, quadratic and exponential. We also successively eliminated the AR or the MA term while leaving the other term in, but still obtained higher values for all test parameters. Based on the parameter values, we found that ARIMA (0, 0, 0) is the best model for the data. Compared with the other models, the ARIMA model was selected as the final model. We provide a method for prediction and forecasting based on data, which may be applicable and useful to government and business institutions.

Sudan GDP annual growth rate forecasts, projected using an autoregressive integrated moving average (ARIMA) model together with analysts’ expectations, are 4.9 for 2017 and 4.9 for 2020. We model the past behavior of the Sudan GDP annual growth rate using historical data and adjust the coefficients of the econometric model by taking into account analysts’ assessments and future expectations. Time series are complex because each observation is somewhat dependent upon the previous observation, and is often influenced by more than one previous observation. Random error also carries over from one observation to another. These influences are called autocorrelation, that is, dependent relationships between successive observations of the same variable. The challenge of time series analysis is to extract the autocorrelation elements of the data, either to understand the trend itself or to model the underlying mechanisms.

A word of caution about using multiple regression techniques with time series data: because time series are autocorrelated, they violate the assumption of independence of errors. Type I error rates increase substantially when autocorrelation is present. Also, inherent patterns in the data may dampen or enhance the effect of an intervention; in time series analysis, such patterns are accounted for within the analysis.
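One common diagnostic for this problem, offered here as an illustration rather than as something computed in the study, is the Durbin-Watson statistic on the regression residuals. Values near 2 suggest little first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation. The function name and the toy residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by their sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    if den == 0:
        raise ValueError("all residuals are zero; statistic is undefined")
    return num / den


# Toy residuals: identical values (strong positive autocorrelation)
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))
# Toy residuals: alternating sign (strong negative autocorrelation)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```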

6. Conclusion

This article has discussed the analysis of GDP statistics for the Sudan. The ARIMA method used here is appropriate only for a time series that is stationary (i.e., its mean, variance, and autocorrelation should be approximately constant through time), and it is recommended that there be at least 50 observations in the input data (the underlying model has 55 observations). It is also assumed that the values of the estimated parameters are constant throughout the series. The article has discussed changes in GDP for the period (1960-2015). The results of the analysis indicate that the model provides useful information for identifying the GDP trend. An important policy consideration arising from the study is that the data show an increasing trend. More advanced future work can build on these investigations, particularly in residual analysis of the model.

References

[1] Box, G.E., et al. (1994) Time Series Analysis: Forecasting and Control. 3rd Edition, Prentice Hall, Englewood Cliffs, NJ.

[2] Brockwell, P.J. and Davis, R.A. (2002) Introduction to Time Series and Forecasting. 2nd Edition, Springer, New York.

https://doi.org/10.1007/b97391

[3] Yang, L. (2009) Modeling and Forecasting China’s GDP Data with Time Series Models. D-level Essay in Statistics. Department of Economics and Society, Hogskolan Dalarna, Falun.

[4] Andrei, E.A. (2011) Econometric Modeling of GDP Time Series. Theoretical and Applied Economics, 18, 91-98.

[5] Boshnakov, G.N. (2016) Introduction to Time Series Analysis and Forecasting, 2nd Edition, Wiley Series in Probability and Statistics, by Douglas C. Montgomery, Cheryl L. Jennings and Murat Kulahci (eds). Published by John Wiley and Sons, Hoboken, NJ, USA, 2015. Total Number of Pages: 672 Hardcover: ISBN: 978-1-118-74511-3, ebook: ISBN: 978-1-118-74515-1, etext: ISBN: 978-1-118-74495-6. Journal of Time Series Analysis, 37, 864.

https://doi.org/10.1111/jtsa.12203

[6] Baum, C.F. (2007) Powerful New Tools for Time Series Analysis. Boston College & DIW.

[7] Pesavento, E. (2007) Residuals-Based Tests for the Null of No-Cointegration: An Analytical Comparison. Journal of Time Series Analysis, 28, 111-137.

https://doi.org/10.1111/j.1467-9892.2006.00501.x

[8] International Monetary Fund (IMF) (2014) World Economic Outlook (WEO) Database. Global Finance Magazine.

[9] Central Bank of Sudan (2015) Sudan GDP and Economic Data. Country Report 2015.

[10] Trading Economics (2015) Sudan GDP Annual Growth Rate.

http://www.tradingeconomics.com/sudan/gdp-growth-annual/forecast