TEL  Vol.7 No.3 , April 2017
Does the Biased Coefficient Problem Plague the VAR Model?
Author(s) Yunyun Lv
ABSTRACT
This paper investigates whether the explanatory variables in the vector autoregression (VAR) model are always correlated with the error term as a consequence of the structure of the model. I use the model of Christiano et al. (CEE, 2005) as an example to examine this argument empirically. According to the findings of this paper, the impulse responses provided by the structural VAR (SVAR) model may be derived from biased estimates if we allow variables to be correlated with each other through different horizons. It remains possible for a skeptic to maintain some dominant views inferred from the biased coefficients of the SVAR models.

1. Introduction

Since Sims (1980) [1], the vector autoregression (VAR) model has become a useful tool for making out-of-sample forecasts in macroeconomics. In particular, once restrictions are added to the VAR model, it is used to forecast how variables change after a shock, holding all other shocks constant. However, many plausible yet contradictory results from structural vector autoregression (SVAR) models exist in the literature. Bernanke et al. (1997) [2] use their (BGW) model to test the hypothesis that the response of monetary policy to oil shocks causes recessions. Hamilton and Herrera (2004) [3] demonstrate that when more lags of variables are modeled, monetary policy is far less powerful. Moreover, Hamilton (1983) [4] suggests a significantly negative correlation between oil prices and output during some of the recessions before 1972, whereas Hooker (1996) [5] provides evidence that the correlation between oil prices and economic activity has become much weaker since 1985. The instability of the empirical relations among variables in the literature may be attributed to biased estimated coefficients in the SVAR models.

This paper demonstrates a weakness of current empirical practice in the VAR literature. I check whether the estimated error term is always correlated with the lagged variables on the right-hand side (R.H.S.) of the VAR model because of the structure of the model. This question is analogous to asking whether the identified shocks are correlated with the explanatory variables in the structural vector autoregression (SVAR) model. If these correlations exist, the estimated coefficients of the VAR model are biased.

Friedman (1961) [6] argues that, for the eighteen non-war business cycles since 1870, monetary policy affects economic conditions only after a lag that is long and variable. Blanchard and Quah (1989) [7] appeal to an analogous argument, namely that some variables are more important at some horizons than at others. They point out that demand disturbances have a hump-shaped effect on output, which disappears after about two years, while supply disturbances have a continually increasing effect on output, which reaches a plateau after five years.

Lv (2017) [8] proposes a new assumption that different variables may affect the economy through different horizons. The impulse-response function (IRF) usually employs the same variables, selected one step ahead, for multi-step analyses. When a one-step-ahead structural VAR model is used for all horizons, the fluctuations of the variables selected one step ahead may affect some omitted variables that are not in the model, and those omitted variables may in turn affect the variables in the system significantly over long horizons. According to the findings of Lv (2017) [9], under the new assumption the contributions of the omitted variables may be attributed to the variables in the SVAR model. The traditional impulse response results may therefore not be credible, since they ignore the significant effects of these important omitted variables over long horizons.

The innovation of this paper is that, under the new assumption that the relevant variables may vary over different horizons, I test whether the variables are correlated with the error terms in the VAR model from a long-horizon perspective. To my knowledge, this is the first paper to check the biased coefficient problem of the VAR model. This paper contributes to the literature by questioning the reliability of the impulse response results derived from existing SVAR models. Under the ceteris paribus interpretation, in which outside variables do not change, the estimated coefficients of the variables in the SVAR models are actually overestimated. I provide evidence that the error term may always be correlated with the variables on the R.H.S. of a VAR model through different horizons.

The paper is organized as follows. Section 2 interprets the relations between the identified shocks and the lags of variables on the R.H.S. of the SVAR model. Section 3 uses an existing SVAR model to gauge the biased coefficient problem empirically. Concluding comments are given in Section 4.

2. Interpretation

The traditional VAR approach adds enough lags to ensure that the equations are not misspecified and the residuals are not autocorrelated. However, if we include more lags in the model, these lags may be correlated with the error term. In this paper, since all variables in the SVAR system are regarded as combinations of shocks, I check whether the estimated coefficients of the SVAR model are biased.

In detail, a standard SVAR model is usually given by:

$$y_t = \sum_{i=1}^{p} A_0^{-1} A_i \, y_{t-i} + A_0^{-1} u_t, \qquad u_t \sim N(0, \sigma^2) \qquad (1)$$

where $y_t$ is a vector of the model variables, $p$ is the lag length of the variables in the system and $u_t$ is the vector of structural shocks. $A_i$ denotes the matrix of parameters corresponding to the $i$th lag, and I have left out the vector of constant terms to keep things simple. Following the suggestion in Sims (1980) [1], the Cholesky decomposition, which requires that $A_0^{-1}$ be lower triangular, is usually used as the identifying restriction. This structure indicates that the variable listed at the top of $y_t$ affects the remaining variables contemporaneously, while the second variable from the top has immediate causal effects on all the variables except the first, and so on down the list. We can also write it in the form of Equation (2):

$$y_t = \left(1 + \beta L + \cdots + \beta^{+\infty} L^{+\infty}\right) A_0^{-1} u_t \qquad (2)$$

Equation (2) shows that the variables are combinations of the shocks, where $L$ is the lag operator. Lagging Equation (2) by one period gives Equations (3) and (4), which make the above argument clear.

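$$y_{t-1} = \left(1 + \beta L + \cdots + \beta^{+\infty} L^{+\infty}\right) A_0^{-1} u_{t-1} \qquad (3)$$

$$y_{t-1} = \left(L + \beta L^{2} + \cdots + \beta^{+\infty} L^{+\infty}\right) A_0^{-1} u_t \qquad (4)$$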

According to Equations (1)-(4), $y_t$ and $y_{t-1}$ are both correlated with $u_t$, which implies that the lags of the variables on the R.H.S. of the SVAR model may be correlated with the error term through longer horizons. This motivates us to ask how reliable the impulse response results of the model are.
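The moving-average form in Equations (2)-(4) can be checked numerically in the simplest case. The sketch below is only an illustration under stated assumptions: a scalar AR(1) process with a hypothetical coefficient `beta = 0.6` and simulated unit-variance shocks, none of which come from the paper. It rebuilds the series from its shock weights $1, \beta, \beta^2, \ldots$ and compares the result with the recursive definition.

```python
import numpy as np

def simulate_ar1(beta, shocks):
    """Recursive form: y_t = beta * y_{t-1} + u_t, starting from y_{-1} = 0."""
    y = np.zeros_like(shocks)
    prev = 0.0
    for t, u in enumerate(shocks):
        prev = beta * prev + u
        y[t] = prev
    return y

def ma_representation(beta, shocks, horizon):
    """Truncated form of Equation (2): y_t = sum_{k=0}^{horizon} beta**k * u_{t-k}."""
    weights = beta ** np.arange(horizon + 1)
    # Full convolution, trimmed to the sample length, gives sum_k w_k * u_{t-k}.
    return np.convolve(shocks, weights)[: len(shocks)]

rng = np.random.default_rng(0)
beta = 0.6                      # hypothetical AR coefficient (assumption)
u = rng.standard_normal(500)    # hypothetical shocks (assumption)

y_rec = simulate_ar1(beta, u)
y_ma = ma_representation(beta, u, horizon=len(u) - 1)

# Largest gap between the recursive and the moving-average construction.
max_gap = float(np.max(np.abs(y_rec - y_ma)))
print(max_gap)
```

Shifting `y_ma` by one period gives the scalar counterpart of Equation (4): the lagged variable is the same weighted sum applied to the shocks one period earlier.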

Likewise, Lv (2017) [9] postulates that the estimated coefficients may be biased if the variables affect the economy through various time spans. As an example, I assume that one additional type of shock, $\epsilon_t$, affects the GDP series in $y_t$ significantly only after a year (four quarters in quarterly data), as in Equation (5):

$$GDP_t = \left(L^{4} + \delta L^{5} + \cdots + \delta^{+\infty} L^{+\infty}\right) \epsilon_t \qquad (5)$$

It is possible that the shocks selected one step ahead in $u_t$ take the contributions of the omitted shocks $\epsilon_t$ as their own. The traditional IRFs assume that these exogenous structural innovations are independent and identically distributed random variables. The policy intuition behind this paper is that the fluctuations of economic time series may not come from purely random shocks: these innovations may be correlated with each other or with the variables through different horizons.
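The mechanism described around Equation (5) can be illustrated with a small simulation. This is a sketch under assumptions that are not in the paper: GDP is the sum of an AR(1) component driven by the included shock (hypothetical `beta = 0.3`) and a persistent component driven by an omitted shock that first arrives after four quarters (hypothetical `delta = 0.9`), with both shocks drawn as independent standard normals.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
beta, delta = 0.3, 0.9          # hypothetical persistence parameters (assumption)
u = rng.standard_normal(n)      # shock included in the model
eps = rng.standard_normal(n)    # omitted shock, as in Equation (5)

x = np.zeros(n)   # part of GDP driven by the included shock
z = np.zeros(n)   # part of GDP driven by the omitted shock, first felt at lag 4
for t in range(1, n):
    x[t] = beta * x[t - 1] + u[t]
    e_lag4 = eps[t - 4] if t >= 4 else 0.0
    z[t] = delta * z[t - 1] + e_lag4   # z_t = (L^4 + delta L^5 + ...) eps_t
gdp = x + z

# OLS of GDP_t on GDP_{t-1}; no constant because both components have mean zero.
y_now, y_lag = gdp[1:], gdp[:-1]
beta_hat = float(y_lag @ y_now / (y_lag @ y_lag))

# Error of the one-shock model evaluated at the true beta, and its
# correlation with the lagged regressor.
true_error = y_now - beta * y_lag
corr = float(np.corrcoef(true_error, y_lag)[0, 1])

print(round(beta_hat, 3), round(corr, 3))
```

In this setup the estimated coefficient lies well above the hypothetical `beta`, because the lagged regressor carries the omitted component: the included shock is credited with part of the contribution of the omitted one.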

3. Empirical Analysis

The above argument can also be examined empirically. The IRF, which estimates the multi-step response of one variable to an impulse in another variable in a system, has been widely used in the literature. In this section, I use the SVAR model in Christiano et al. (CEE, 2005) [10] as an example.

Christiano, Eichenbaum, and Evans construct a model with a moderate degree of nominal rigidities that prevents a sharp rise in marginal costs, generating inertial inflation and persistent output movements after an expansionary shock to monetary policy.

The form of the CEE model is as follows:

$$F_t = \sum_{i=1}^{p} B_i F_{t-i} + C u_t, \qquad u_t \sim N(0, \sigma^2) \qquad (6)$$

$F_t$ contains nine quarterly series. The lag length $p$ of the model is set to 4. The variables are ordered as follows: real gross domestic product (GDP), real consumption (RPCE), the GDP deflator (GDPDEF), real investment (INVEST), the real wage (WAGE), labor productivity (PROD), the federal funds rate (FEDFUNDS), real profits (PROFIT) and the growth rate of M2 (M2).

The matrix C is taken to be lower triangular with ones along the principal diagonal. It implies that the variables except real profits and the M2 growth rate will not respond instantaneously to monetary policy innovations.
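A recursive identification of this kind can be sketched as follows. The example is a minimal illustration, not the CEE estimation itself: it assumes a simulated three-variable system with a hypothetical coefficient matrix `A`, fits a VAR with $p = 4$ lags by ordinary least squares, and rescales the Cholesky factor of the residual covariance so that the impact matrix is lower triangular with ones on the diagonal. The function names `fit_var` and `recursive_identification` are introduced here for illustration only.

```python
import numpy as np

def fit_var(data, p):
    """OLS estimate of a VAR(p) with a constant; data has shape (T, k)."""
    T, k = data.shape
    # Regressors for period t: [1, y_{t-1}, ..., y_{t-p}]
    lags = [data[p - i : T - i] for i in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lags)
    Y = data[p:]
    coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coefs
    sigma = resid.T @ resid / (T - p - X.shape[1])
    return coefs, resid, sigma

def recursive_identification(resid, sigma):
    """Lower-triangular impact matrix with unit diagonal, plus implied shocks."""
    P = np.linalg.cholesky(sigma)            # sigma = P @ P.T, P lower triangular
    C = P / np.diag(P)                       # divide column j by P[j, j]
    shocks = np.linalg.solve(C, resid.T).T   # u_t = C^{-1} * residual_t
    return C, shocks

# Hypothetical data-generating process (assumption, not the CEE data).
rng = np.random.default_rng(2)
T, k = 400, 3
A = np.array([[0.5, 0.0, 0.0],
              [0.1, 0.4, 0.0],
              [0.0, 0.2, 0.3]])
data = np.zeros((T, k))
for t in range(1, T):
    data[t] = A @ data[t - 1] + rng.standard_normal(k)

coefs, resid, sigma = fit_var(data, p=4)
C, shocks = recursive_identification(resid, sigma)
cross = shocks.T @ shocks   # off-diagonal entries are zero by construction
print(np.round(C, 3))
```

The zeros above the diagonal of `C` encode the ordering assumption: a variable does not respond within the period to shocks of variables listed after it.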

All estimates reported in this paper are based on the original dataset for 1965Q3-1995Q2 in CEE (2005) [10] (see Note 1). All data can also be downloaded from Federal Reserve Economic Data (FRED), provided by the Federal Reserve Bank of St. Louis. Real GDP, real consumption, real investment, the real wage, labor productivity, and real profits are measured as 100 times the natural logarithm of the original data. The federal funds rate is expressed in annualized percentage points. Inflation is 100 times the natural logarithm of the ratio of $CPI_t$ to $CPI_{t-1}$. Money growth of M2 is the first difference of 100 times the natural logarithm of the original data. Because the first eight transformed variables are still not stationary, I use their first differences and leave the money growth of M2 unchanged.
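The transformations described above can be written compactly. The sketch below uses a short hypothetical level series (the numbers are placeholders, not CEE or FRED data) and two helper names, `log_level` and `log_growth`, that are introduced here for illustration.

```python
import numpy as np

def log_level(series):
    """100 times the natural logarithm of a positive level series."""
    return 100.0 * np.log(np.asarray(series, dtype=float))

def log_growth(series):
    """100 * ln(x_t / x_{t-1}), i.e. the first difference of the log level."""
    return np.diff(log_level(series))

# Hypothetical quarterly levels (placeholder values).
gdp_level = np.array([100.0, 101.0, 102.5, 103.0])

gdp_log = log_level(gdp_level)    # level transformation
gdp_dlog = np.diff(gdp_log)       # first difference used for stationarity
print(np.round(gdp_dlog, 3))
```

Differencing shortens each series by one observation, so all series should be aligned on the common sample before estimation.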

I begin by checking whether the estimated error term is correlated with the first lag of real output on the R.H.S. of the CEE model, as in Equation (7). I regress real output at time $t-1$ on the current output shock and six lags of the output shocks estimated from the CEE model.

$$GDP_{t-1} = \alpha_0 + \sum_{i=0}^{6} \beta_i \, u_{GDP,\,t-i} + \varepsilon_t \qquad (7)$$

Table 1 presents the estimates of Equation (7) (see Note 2), which show the effect of an output shock on output over different horizons. The coefficient estimates, t-statistics, and p-values are reported in columns 2-4, with the significance level of each coefficient appearing to the right of its p-value. I use *** to denote significance at the 0.1% level, ** at the 1% level, * at the 5% level, and . at the 10% level. The 0.1% level of significance implies that, if the true coefficient were zero, there would be about a 1 in 1000 chance of observing a result at least this extreme.

Based on the findings in Table 1, the estimated coefficients of the lags of the real output shocks are statistically significant at the 0.1% level.
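A regression with the structure of Equation (7) can be sketched as follows. The paper's regressions were estimated in R; this Python version is an illustration under assumptions, not a reproduction: the output series is a simulated AR(1) with a hypothetical coefficient of 0.5, the shocks are known rather than estimated from the CEE model, and p-values use a normal approximation instead of the t distribution. The helper names are introduced here for illustration.

```python
import math
import numpy as np

def lag_matrix(shock, max_lag):
    """Columns are shock_t, shock_{t-1}, ..., shock_{t-max_lag} for t = max_lag..T-1."""
    T = len(shock)
    return np.column_stack([shock[max_lag - i : T - i] for i in range(max_lag + 1)])

def stars_for(p):
    """Significance codes as used in the tables: ***, **, *, and . """
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "." if p < 0.1 else ""

def ols_with_stars(y, X):
    """OLS with a constant; returns coefficients, t-stats, p-values, stars, R^2."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    t_stats = beta / np.sqrt(np.diag(cov))
    # Two-sided p-values from a normal approximation (assumption).
    p_vals = np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stats])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, t_stats, p_vals, [stars_for(p) for p in p_vals], float(r2)

# Hypothetical data: gdp_t = 0.5 * gdp_{t-1} + u_t (assumption).
rng = np.random.default_rng(3)
T, max_lag = 400, 6
u = rng.standard_normal(T)
gdp = np.zeros(T)
gdp[0] = u[0]
for t in range(1, T):
    gdp[t] = 0.5 * gdp[t - 1] + u[t]

X = lag_matrix(u, max_lag)             # u_{t-i}, i = 0..6, for t = 6..T-1
y = gdp[max_lag - 1 : T - 1]           # GDP_{t-1} for the same t
beta, t_stats, p_vals, stars, r2 = ols_with_stars(y, X)
for i in range(max_lag + 1):
    print(f"u_t-{i}: {beta[i + 1]: .3f} {stars[i + 1]}")
```

In this simulated case the coefficient on the first lag of the shock is close to one, since the lagged variable contains that shock with unit weight.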

Next, I check whether the first lag of real consumption, which is listed second in the CEE model, is correlated with the real output shocks:

$$RPCE_{t-1} = \alpha_0 + \sum_{i=0}^{6} \beta_i \, u_{GDP,\,t-i} + \varepsilon_t \qquad (8)$$

Table 2 shows the parameter estimates of Equation (8). The coefficients of the real output shocks at times $t-1$ and $t-5$ are statistically significant at the 1% and 5% levels, respectively. Again, I find a relation between the lags of the output shocks and the explanatory variables on the R.H.S. of the CEE model, which indicates that the error term is correlated with the variables in the system through different horizons and that the estimated coefficients may be biased.

Table 1. The regression estimation of the first lag of GDP on the GDP innovations.

Table 2. The regression estimation of the first lag of real consumption on the real GDP innovations.

In Equation (9), I demonstrate the impact of the output shocks on the first lag of each variable on the R.H.S. of the CEE model.

$$Y_{t-1} = \alpha_0 + \sum_{i=0}^{6} \beta_i \, u_{GDP,\,t-i} + \varepsilon_t \qquad (9)$$

Table 3 presents the regression results of the empirical models represented by Equation (9). The first two rows display the significance levels of the coefficients in Equation (7) and Equation (8), respectively. Since the estimated coefficients of the output shocks at times $t$ and $t-3$ are not significant for any variable, I drop them from Table 3 for simplicity. Column 1 lists the dependent variable $Y_{t-1}$ in Equation (9). Columns 2 to 7 give the significance levels of the estimated coefficients of the lags of the output shocks. The $R^2$ and the p-value of the equation in each row are shown in Columns 8 and 9, respectively.

According to Table 3, the first lag of the output shocks is correlated with the first lag of all variables except the GDP deflator and M2. If more lags of the output shocks are included in the equations, even the first lags of the GDP deflator and M2 may be significantly correlated with the output shocks through longer horizons. All equations except those for real consumption, the GDP deflator, the real wage and M2 have an $R^2$ greater than 20% and a p-value less than 0.05. It is reasonable to conclude that these lags of the output shocks affect most of the variables in the CEE model through longer horizons.

Since the first lags of the GDP deflator and M2 are not correlated with the first six lags of the output shocks, I regress each of them on the other kinds of shocks, as in Equation (10):

$$Y_{t-1} = \alpha_0 + \sum_{i=1}^{2} \beta_{GDP,i} \, u_{GDP,\,t-i} + \cdots + \sum_{i=1}^{2} \beta_{M2,i} \, u_{M2,\,t-i} + \varepsilon_t \qquad (10)$$

Table 4 displays the estimates of the effect of the first two lags of all shocks on the GDP deflator at time $t-1$ on the R.H.S. of the CEE model. The regression has an $R^2$ of 75%. If the lags of these shocks did not affect the GDP deflator, the $R^2$ for this regression would be approximately zero. The coefficients of all shocks except $u_{PROD}$, $u_{PROFIT}$ and $u_{M2}$ are significant. The estimated coefficients of $u_{GDP,t-2}$, $u_{RPCE,t-2}$ and $u_{GDPDEF,t-2}$ are significant, indicating that the GDP deflator is correlated with the sum of these shocks through different horizons, so the estimated coefficients of the GDP deflator equation in the CEE model may be biased.

Table 3. Statistical significance of the real output shocks for each variable in the CEE model.

Table 4. Estimation of the regression of the first lag of the GDP deflator on lags of all innovations.

Interestingly, the estimated coefficient of $u_{GDP,t-1}$ is not significant. This may imply that the output shock does not affect the GDP deflator contemporaneously, contrary to what the CEE model assumes, or that the significance level of the estimated coefficient of $u_{GDP,t-1}$ may change if we add more lags to Equation (10).

Table 5 reports the estimates of Equation (10) with M2 at time $t-1$ as the dependent variable. Again, I find that the last four types of shocks from the CEE model are correlated with the first lag of M2 on the R.H.S. of the CEE model.
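The regressor layout of Equation (10) can be sketched as follows. This is an illustration under assumptions: the shock matrix is a random stand-in for the nine estimated CEE innovations, and the dependent series is a hypothetical construction that loads on $u_{GDPDEF,t-1}$ and $u_{GDP,t-2}$ so that the fit has something to recover. The function `build_design` and the label format are introduced here for illustration.

```python
import numpy as np

SHOCK_NAMES = ["GDP", "RPCE", "GDPDEF", "INVEST", "WAGE",
               "PROD", "FEDFUNDS", "PROFIT", "M2"]

def build_design(shocks, n_lags):
    """Stack lags 1..n_lags of every shock; rows correspond to t = n_lags..T-1."""
    T, k = shocks.shape
    blocks, labels = [], []
    for j in range(k):
        for i in range(1, n_lags + 1):
            blocks.append(shocks[n_lags - i : T - i, j])
            labels.append(f"u_{SHOCK_NAMES[j]},t-{i}")
    return np.column_stack(blocks), labels

rng = np.random.default_rng(4)
T = 120
shocks = rng.standard_normal((T, 9))   # hypothetical stand-in for the innovations

# Hypothetical dependent series (assumption): own shock plus lagged GDP shock.
deflator = np.zeros(T)
deflator[1:] = shocks[1:, 2] + 0.5 * shocks[:-1, 0] + 0.1 * rng.standard_normal(T - 1)

X, labels = build_design(shocks, n_lags=2)
y_lag1 = deflator[1 : T - 1]           # Y_{t-1} for t = 2..T-1

X1 = np.column_stack([np.ones(len(y_lag1)), X])
coef, *_ = np.linalg.lstsq(X1, y_lag1, rcond=None)
resid = y_lag1 - X1 @ coef
r2 = float(1.0 - (resid @ resid) / np.sum((y_lag1 - y_lag1.mean()) ** 2))

own = coef[1 + labels.index("u_GDPDEF,t-1")]
print(round(float(own), 3), round(r2, 3))
```

An $R^2$ near zero in such a regression would indicate that the lagged shocks carry no information about the lagged variable, which is the benchmark the text compares against.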

To sum up, the evidence above suggests that the shocks may always be correlated with the variables in the SVAR model through different horizons. In other words, the error term is likely to be correlated with both the dependent variables and the explanatory variables in the VAR model. Therefore, the impulse response results may be inferred from biased coefficients of the SVAR model, and we should be cautious in interpreting these results.

Table 5. Estimation of the regression of the first lag of M2 on the lags of all innovations.

4. Conclusions

This paper has sought to answer a question ignored in the literature: does the biased coefficient problem plague the VAR model? I investigate the relationships between the variables and the identified shocks in the SVAR model and find that they are always correlated through different horizons, whereas conventional models postulate, rarely with fully persuasive justification, that they are uncorrelated. This implies that the error term is correlated with the explanatory variables on the R.H.S. of the VAR model, which means that its estimated coefficients may be biased. Hence, the biased coefficient problem may limit the credibility of the conclusions drawn from the VAR model. In addition, under the assumption that variables may affect the economy through different horizons, different types of shocks may affect the variables in the structural VAR model over different horizons.

The academic significance of this paper is that the analysis sheds light on the potential problems of impulse response functions and enhances our understanding of how we should use the VAR model. The distortions of the coefficients can be substantial in practice, which invalidates the traditional causal interpretation of the responses of variables to a unit shock and overturns the standard view of how variables affect each other. Under the ceteris paribus interpretation, in which outside variables do not change, the estimated coefficients of the variables are actually overestimated. The impulse response results and other conclusions inferred from the biased coefficients of traditional SVAR models are by no means settled issues and need further study.

The social significance of this paper is that the evidence from SVAR models may be employed by many central banks to analyze the transmission of volatility from a shock to fluctuations in variables. For example, the oil price may not be the deep cause of recessions, given the instability of its estimated coefficient. It may be just the last straw that breaks the camel's back. Hence, oil price decreases may help the economy in the short run, but the real problems that cause recessions still need to be solved.

The limitation of this research is that I only show the possibility that the estimates of the SVAR model are biased; this paper cannot say how much the bias affects the results of the existing literature. For some SVAR models, the biased coefficients may not be important at all, because these parsimonious models may capture the main variables that affect all other variables in the economy. These concerns are beyond the scope of this paper and need further study.

Acknowledgements

I thank the Editor and the referee for their comments.

NOTES

1. The data are provided by Martin Eichenbaum.

2. I estimate all regressions in this paper using R.

Cite this paper
Lv, Y. (2017) Does the Biased Coefficient Problem Plague the VAR Model? Theoretical Economics Letters, 7, 454-463. doi: 10.4236/tel.2017.73034.
References
[1]   Sims, C.A. (1980) Macroeconomics and Reality. Econometrica, 48, 1-48.
https://doi.org/10.2307/1912017

[2]   Bernanke, B.S., Gertler, M. and Watson, M. (1997) Systematic Monetary Policy and the Effects of Oil Price Shocks. Brookings Papers on Economic Activity, 1997, 91-157.
https://doi.org/10.2307/2534702

[3]   Hamilton, J.D. and Herrera, A. (2004) Oil Shocks and Aggregate Macroeconomic Behavior: The Role of Monetary Policy. Journal of Money, Credit, and Banking, 36, 265-286.
https://doi.org/10.1353/mcb.2004.0012

[4]   Hamilton, J.D. (1983) Oil and the Macroeconomy Since World War II. The Journal of Political Economy, 91, 228-248.
https://doi.org/10.1086/261140

[5]   Hooker, M.A. (1996) What Happened to the Oil Price-Macroeconomy Relationship? Journal of Monetary Economics, 38, 195-213.

[6]   Friedman, M. (1961) The Lag in Effect of Monetary Policy. The Journal of Political Economy, 69, 447-466.
https://doi.org/10.1086/258537

[7]   Blanchard, O.J. and Quah, D. (1988) The Dynamic Effects of Aggregate Demand and Supply Disturbances. National Bureau of Economic Research, Working Paper No. 2737, National Bureau of Economic Research, Cambridge, MA.
https://doi.org/10.3386/w2737

[8]   Lv, Y. (2017) Selection of Macroeconomic Forecasting Models: One Size Fits All? Theoretical Economics Letters, forthcoming.

[9]   Lv, Y. (2017) How Can the Error Term Be Correlated with the Explanatory Variables on the R.H.S. of a Model? Theoretical Economics Letters, 7.

[10]   Christiano, L.J., Eichenbaum, M. and Evans, C.L. (2005) Nominal Rigidities and the Dynamic Effects of a Shock to Monetary Policy. Journal of Political Economy, 113, 1-45.
https://doi.org/10.1086/426038

 
 