An intervention model or interrupted time series model  is an Autoregressive Integrated Moving Average (ARIMA) model of a time series in which at least one of the predictors is a dummy variable for an event, which is thought of as an interruption in a pure ARIMA process. An ARIMA model in which differencing is not used is also called an ARMA model.
The use of intervention analysis or interrupted time series analysis is very common in hospitality and tourism literature. Bonham and Gangnes  used intervention analysis to show that the 5% Hawaii hotel room tax started in 1987 did not significantly impact Hawaii hotel room revenues. Min’s  intervention analysis of inbound tourism data showed that both the earthquake of September 21, 1999 and the Severe Acute Respiratory Syndrome (SARS) of 2003 had significant negative impacts on Taiwan’s inbound tourism. A thorough review of time series forecasting literature is provided by De Gooijer and Hyndman . Ahlgren et al.  used an ARIMA model to assess the impact of a higher gaming tax rate in the state of Illinois on gaming volume, and concluded that the gaming volume experienced a significant decrease when the tax increase took effect. Ahlgren et al.  used an ARIMA model to assess the impact of a higher gaming tax rate in the state of Illinois on marketing expenditure by a major Illinois riverboat operator. Toma et al.  used a seasonal ARIMA model to show that the book “Midnight in the Garden of Good and Evil” set in Savannah, Georgia had a significant positive effect on hotel tax receipts, while both 9/11 and hurricane Floyd had a significant negative effect. Goh and Law  used intervention analysis to show that relaxation of issuing out-bound visitor visas, the Asian financial crisis, the handover, and the bird flu epidemic had significant and expected impacts on Hong Kong tourism demand. Eisendrath et al.  used intervention analysis on Las Vegas Strip gaming volume to show that 9/11 had a significant negative impact lasting five months. Suh et al.  used this approach to investigate the effects of cash revenue generated from non-comped diners (CASHREV), amount spent by the casino in comped-meals (COMPREV) on gaming volume, using major holidays and Motorcycle Rally as intervention predictors; their ARIMA model showed that both CASHREV and COMPREV were significant predictors of slot coin-in, and Motorcycle Rally had a significant and negative impact on slot coin-in. Zheng et al.  used ARIMA intervention analysis to study the impact of recession on restaurant stock performance. D’Amuri and Marcucci  used ARIMA modeling to assess the impact of an index of Google job-search intensity on the monthly US unemployment rate. Intervention analysis has been used in other disciplines as well. Su and Deng  used time series regression with intervention term to predict the yield of Yu Ebao. Huang  has used intervention analysis to show that government intervention improved a firm’s investment efficiency. The purpose of the present article is to introduce a method from econometric modeling that is simpler to use than the ARIMA method for intervention analysis.
2. Problem Statement and Methodology
The general form of an autoregressive moving average (ARMA) model is
= the response variable of interest.
= the intercept.
= the autoregressive (AR) term coefficients.
= the moving average (MA) term coefficients.
= the random shocks.
The above model is referred to as ARIMA(p,d = 0,q) model or ARMA(p,q) model .
Following steps are used in fitting an intervention model to a time series Yt of the response variable as a function of predictor(s) Xt and intervention variable(s) It:
(1) The time series Yt is plotted to assess the presence of trend with time. A polynomial function of time t is typically used to model the trend.
(2) A multiple linear regression (MLR) model is fitted to the data, such that the variance inflation factor (VIF) values of all predictors are not too high; values of VIF above 5 suggest the presence of multicollinearity , and values of VIF above 10 indicate that the MLR model suffers from serious multicollinearities . One typically drops predictors with highest VIF value, one by one, in order to get a reasonable MLR model.
(3) The MLR model assumes that the errors are independent and normally distributed with 0 means and a common error variance, and the standard t-test is used to conduct significance tests for the coefficients of the predictors. In a time series data, however, the residuals are typically auto-correlated, and hence the use of the t-test is not valid. Plots of the auto-correlation function (ACF) and partial auto-correlation function (PACF) function are examined to determine the order of the ARIMA(p,d,q) × (P, D, Q) model; here p is the order of the non-seasonal auto-regressive term, d the non-seasonal differencing used, q the order of the non-seasonal moving average term, and P, D, Q are the corresponding seasonal terms.
(4) Once the ARIMA model has been identified, a time series regression model is fitted to the data with all predictors and the ARIMA terms in the model, and the residuals from this model are tested for zero auto-correlations up to h lags; the Ljung-Box test  is commonly used for this purpose. Hyndman and Athanasopoulos  recommend using h = 10 lags in the Ljung-Box test After a time-series regression model with uncorrelated residuals is found, the significance of all predictors is tested.
The last or 4th step at times proves to be quite challenging, and an alternative approach for testing the significance of all predictors and intervention variables is proposed and investigated in this study.
One of the assumptions of multiple linear regression (MLR) is that the variance of the response variable is same across the range of predictor values; when this is not the case, we say that heteroscedasticity (or heteroskedasticity) is present . When the error terms (residuals) from an MLR model are autocorrelated (i.e., not independent) the standard estimate of the correlation matrix needs to be corrected for both heteroscedasticity and the presence of autocorrelation. These estimators are referred to as HAC-Consistent estimators .
This approach uses three different heteroskedasticity and autocorrelation consistent (HAC) estimators of the covariance matrix, used in econometric modeling , namely HAC, Kern-HAC, and Newey-West in place of Steps (3) and (4) above. Examples from hospitality literature as well as a synthetic time series data are used to demonstrate the effectiveness of this alternative approach, and parametric bootstrap of time series will be used to compare the results from standard ARIMA-based intervention model and the results of significance testing using HAC estimators. In Step (2) above, one typically keeps only the significant predictors in the MLR model; in this paper, to keep things simple, all predictors in the model are kept as long as the VIF values are less than 5.
3. Comparison of P-Values from ARIMA Model and HAC Estimators
Three different time series data sets are used to compare the ARIMA method for intervention analysis and significance tests using HAC estimators of the covariance matrix of the estimated coefficients of the MLR. The time series in the first two examples are real data sets from gaming literature, the first one modeled by an ARIMA(3,0,0) process with a cubic trend, and the second by an ARIMA(0,0,2) process with a cubic trend. For the third example, a synthetic monthly time series of length 84 was used.
Following steps are used for this comparison:
(a) The time series Yt is plotted to assess the presence of a trend.
(b) An MLR model is fitted to Yt as a function of the predictors including a dummy column for the intervention term, and a polynomial trend function; predictors with high VIF values are removed.
(c) ACF and PACF of the residuals from MLR are plotted for identification of ARIMA terms p and q.
4An ARIMA(p,0,q) time series regression model is fitted, with p and q determined in Step (c) above. The P-values for each predictor term in the ARIMA model are calculated.
(d) ACF of the residuals from the ARIMA model of Step (d) is plotted to show that these residuals are not auto-correlated, which is followed by running the L-B test for 10 lags. If the P-values from the L-B test exceed 0.05 at each lag, these residuals are deemed not auto-correlated, which validates the correctness of the P-values computed in Step (d).
(e) P-values for each term in the MLR model of Step (b) are computed using the HAC estimators of the covariance matrix of the estimated coefficients.
(f) P-values computed in Steps (d) and (f) are compared.
The package Sandwich of the statistical software environment R  was used to compute the P-Values using the HAC estimators.
In addition, for each time series data set, 1000 bootstrap samples are generated, and Steps (a)-(f) are repeated for each bootstrap sample. Histograms of the 1000 P-values computed in Steps (d) and (f) are plotted to compare the ARIMA method to the HAC-method of intervention analysis.
4. Examples from Tourism and Hospitality Literature
Example 1: Impact of 9/11 on Las Vegas Strip Gaming Revenue
Eisendrath et al.  used Las Vegas Strip slot coin-in data, complied from the Nevada Gaming Control Reports for the period January 1990 through November 2004. In this study, data for the period July 1997 to November 2005 are used. The potential predictors of Las Vegas Strip coin-in are dummy variables for months February, March, …, December, a cubic trend, and a dummy column D9_11 for 9/11 that was set to 1 for 6 months starting from September 2011, and 0 for all other months. A time-series plot of Las Vegas Strip coin-in for the sampling period, with all six points labeled as 1 corresponding to the interrupted time-series (Figure 3, top left plot) shows two bends, suggesting the presence of a cubic term in the time series. In order to keep VIF values small, the variable t (indicating month number) was standardized first, and then squared and cubed: (Table 1)
Figure 1 is a plot of the ACF and PACF values for lags 1 to 20; this graph suggests using ARIMA(3,0,0) model for the residuals .
Table 1. MLR model fitted to Las Vegas Strip Slot Coin-in Data.
Table 2. ARIMA(3,0,0) regression model fitted to Las Vegas Strip Slot Coin-in Data.
Figure 1. ACF and PACF plots of residuals from the MLR model for Strip Coin-in.
1) the months February, June and December are statistically significant, each with lower average slot coin-in than the other ten months, 2) the terrorist attack of September 2011 (predictor D9_11) had a significant and negative impact on slot coin-in, and 3) the ARIMA terms AR2 and AR3 are statistically significant.
Figure 2 shows the ACF of the residuals from the ARIMA(3,0,0) Time Series Regression Model of Table 2, and also the P-values of the Ljung-Box (LB) test for 10 lags. The ACF plot shows that the residuals from the ARIMA(3,0,0) Time Series Regression Model are not auto-correlated, which is confirmed by the LB test since P-values at lags 1 through 10 are all above 0.05.
Figure 3 shows plots of the Las Vegas Strip Coin-in and five bootstrap samples generated from it using the estimated ARIMA(3,0,0) model given in Table 2. The bootstrap samples are seen to have the same general pattern as the Las Vegas Strip Coin-in time series. Figure 4 shows the histogram of 1000 P-values for D9_11 obtained from 1000 bootstrap runs. Since D911 term is highly significant (see Table 2 and Table 3), the P-values from bootstrap samples are expected to be small. The term D11 (dummy column for November) is not significant (see Table 2 and Table 3), and hence the 1000 P-values for D11 are expected to exceed 0.05. Figure 4 and Figure 5 clearly show that all of the intervention analysis methods yield similar results.
Table 3. Results of significance tests using the three HAC estimators for Las Vegas Strip Slot Coin-in Data.
Figure 2. Plots of ACF and P-values of L-B test for residuals from the ARIMA(3,0,0) model for Strip Coin-in.
Figure 3. Time series plots of Strip Coin-in (top left) and five bootstrap samples.
Figure 4. Histograms of P-values of the intervention term D9_11 for the four methods considered in this paper from 1000 parametric bootstrap samples generated from the Strip Coin-in data.
Figure 5. Histograms of P-values of the dummy variable for November for the four methods considered in this paper from 1000 parametric bootstrap samples generated from the Strip Coin-in data.
Example 2: Impact of tax rate increase on Marketing Expenditure
Ahlgren et al.  used secondary data for the period January 2000 to December 2006, provided by a major Illinois riverboat operator, to assess the impact of a gaming tax rate increase in the state of Illinois on marketing expenditure by the riverboat operator. A time series regression model was fitted to marketing expenditure; the predictors were a cubic trend, eleven dummy columns for the months of February through December (see Example 1), a dummy column DTax for Illinois tax increase which was 1 for months 43 through 67 and 0 for all other months, and an interaction term between the linear term and the DTax column.
Table 4 shows the MLR model fitted to the Marketing Expenditure data. The cubic trend is significant, along with the month of July, the intervention event DTax and the interaction term.
Figure 6 shows plots of ACF and PACF of the residuals from the MLR model of Table 4. The behavior of the autocorrelation functions suggests an ARIMA(0,0,1) process but the residuals turned out to be auto-correlated. An ARIMA(0,0,2) process provided good fit to the Marketing Expenditure data, as can be seen from Figure 7.
Table 5 shows the fitted ARIMA model. The intervention term DTax is highly significant, and the quadratic trend component Zt2 is not significant (P-value = 0.91). The t-tests using HAC estimators yield similar results (see Table 6).
Table 4. MLR model fitted to Marketing Expenditure Data.
Table 5. ARIMA(0,0,2) regression model fitted to Marketing Expenditure Data.
Table 6. t-test results from the three HAC estimators for Marketing Expenditure Data.
Figure 6. ACF and PACF plots of residuals from the MLR model for Marketing Expenditure.
Figure 7. Plots of ACF and P-values of L-B test for residuals from the ARIMA(3,0,0) model for Marketing Expenditure.
Figure 9 shows the histograms of 1000 bootstrap P-values for ARIMA and HAC methods for the statistically significant term DTax; Figure 10 shows the same for the insignificant term Zt2. Both of these figures show that the ARIMA and HAC methods provide similar results.
Example 3: Synthetic time series
The advantages of working with a simulated (synthetic) time series are that the truth is known and hence the estimates can be compared to the corresponding true parameter values.
The synthetic time series was generated from the following model:
with 1 representing January month of Year 1 and 84 representing December of Year 7.
DE = dummy variable for the intervention event E.
DE = 1 for ; 0 otherwise.
e = ARIMA(2,0,2) error process with , , , and .
Figure 8. Time series plots of Marketing Expenditure (top left) and five bootstrap samples.
Figure 9. Histograms of P-values of the intervention term DTax for the four methods considered in this paper from 1000 parametric bootstrap samples generated from the Marketing Expenditure data.
Figure 10. Histograms of P-values of the dummy variable for the quadtratic term Zt2 for the four methods considered in this paper from 1000 parametric bootstrap samples generated from the Marketing Expenditure data.
Figure 11, plots of the ACF and PACF functions, suggest an ARIMA(2,0,2) model, and Figure 12 shows that ARIMA(2,0,2) model yields uncorrelated residuals. Note that the synthetic time series was generated from an ARIMA(2,0,2) process.
Tables 7-9 show the fitted MLR model, the ARIMA(2,0,2) model, and the significance test results from the HAC estimators, respectively. The true values used to generate the synthetic time series are also shown in these three tables. It can be seen from this table that all of the estimated coefficients are close to their corresponding true values. The intervention term DE is seen to be highly significant, and DNov is not significant.
Figure 13 shows the synthetic time series and five bootstrap samples generated from the synthetic series and the estimated ARIMA(2,0,2) model of Table 8. The histograms of 1000 bootstrap P-values for the intervention term DE and the dummy variable for November (DNov) are shown in Figure 14 and Figure 15, respectively. Both of these figures again show that the ARIMA and HAC methods provide similar results.
For each of the three examples presented in this paper, the ARIMA method of intervention analysis and HAC methods yielded similar results. The four methods (ARIMA, HAC, Kern-HAC, and Newey-West) also yielded similar results for 1000 bootstrap samples from the original time series in each case. These results demonstrate that the simpler HAC method can be used for intervention
Table 7. MLR model fitted to the synthetic data.
Table 8. ARIMA(0,0,2) regression model fitted to the synthetic data.
Table 9. Results of significance tests using the three HAC estimators for the synthetic data.
Figure 11. ACF and PACF plots of residuals from the MLR model for the synthetic data.
Figure 12. Plots of ACF and P-values of L-B test for residuals from the ARIMA(3,0,0) model for the synthetic data.
Figure 13. Time series plots of the synthetic data Y0 (top left) and five bootstrap samples.
Figure 14. Histograms of P-values of the intervention term DE for the four methods considered in this paper from 1000 parametric bootstrap samples generated from the synthetic data.
Figure 15. Histograms of P-values of the dummy variable for DDec for the four methods considered in this paper from 1000 parametric bootstrap samples generated from the synthetic data.
analysis instead of the ARIMA model, especially in situations where finding the right ARIMA model turns out to be challenging.