The Statistical Arbitrage Study of CSI 500 Stock Index Futures Based on Intraday Effect


1. Introduction

The intraday effect is a kind of calendar effect. A calendar effect is a market anomaly related to date and time. Its widespread existence has been confirmed by numerous scholars, and it has a certain impact on financial market investment and academic research. According to the research ideas and methods in the literature, there are roughly three types of tests for the intraday effect. The first category is descriptive statistics: some scholars analyze whether an intraday effect exists based on the time series of financial variables and on statistics such as skewness, kurtosis, normality, fluctuation range and mean. For example, Wu Gang-hui (2014) divided the trading time into 18 periods and calculated the average value of six indicators, such as volume, depth, absolute spread and relative spread, in each period to discuss the characteristics of the intraday effect [1] . Wang Weiguo et al. (2015) argue that time series data with intraday effects must be adjusted before a regression model can be estimated and analyzed accurately [2] . The second category is non-parametric statistical methods. Wang Shun (2014) used the modified LM non-parametric test to identify the calendar effect in the volatility of stock index futures and the spot index, and found that their volatility showed double “W” and oblique “L” patterns [3] . Lin Xiang-you et al. (2015) tested the possible intra-week effect, maturity effect and double calendar effect of the securities market using the Wilcoxon-Mann-Whitney non-parametric test, an AR model with dummy variables, and a difference-in-differences model [4] . Zhao Xiujuan et al. (2015) analyzed 1-minute data and found that the absolute yield of the Shanghai and Shenzhen 300 stock index futures followed an “LM” pattern and the trading volume a “WV” pattern [5] . The third category is the dummy variable regression method, which is the method most widely used by scholars.
In the empirical process, a least squares regression model, ARMA model or GARCH model is selected according to the distribution characteristics of the variable series, and time-dependent dummy variables are added to the model. The difference of the research target across time periods is then determined from the significance of the dummy variable parameters. For example, Guo Yan-feng (2013) used both OLS and GARCH models to test the intraday and overnight effects of the Shanghai and Shenzhen 300 stock index futures yield series [6] . Yang Hua-qing (2015) used the AR-GARCH model with dummy variables to test the calendar effect in the closing price, trading volume and open interest of the Shanghai and Shenzhen 300 stock index futures, and found that the intraday effect mainly appeared in the market opening and closing stages [7] . Zhou Shijun (2016) used high-frequency realized volatility and, considering the impact of overnight returns, constructed a Copula-Realized GARCH model to estimate the optimal hedge ratio of stock index futures [8] ; Yang Yang (2016) analyzed theoretically the pathways through which stock index futures affect the spot market, and established a GARCH model to study empirically the impact of SSE 50 and CSI 500 stock index futures on the volatility of the spot market [9] .

Statistical arbitrage refers to the use of quantitative analysis methods to build a portfolio based on statistical relationships rather than economic theory, in order to hedge part of the risk and obtain stable excess returns. The key to statistical arbitrage is the choice of arbitrage strategy. Many strategies exist, mainly including strategies based on co-integration, on neural network models and on principal component analysis. The co-integration strategy is the most widely used and studied. For example, Hao Ning (2016) used statistical arbitrage to establish a long-term equilibrium co-integration equation and analyze the arbitrage space of CSI 500 stock index futures [10] . Yan Liang-wen et al. (2016), based on co-integration and conditional heteroscedasticity model theory, searched for the best arbitrage and risk measure using the standardized residual series, thus establishing an optimal arbitrage scheme that achieved better results than the threshold method based on confidence levels [11] . Zhang Zhen (2017) proposed a dynamic prediction interval and carried out statistical arbitrage through a co-integration model with dynamic thresholds [12] .

The CSI 500 stock index futures contract is a futures contract with the CSI 500 Index as its underlying. It was listed on April 16, 2015, less than three years ago. At present, domestic research on the CSI 500 stock index futures mainly focuses on price discovery, risk transfer and other functions, on the impact on the stock spot market, and on statistical arbitrage. For example, Xu Jinjian (2016) introduced a GARCH model with dummy variables to analyze empirically the impact of stock index futures on index volatility [13] . Bai Hao (2017) studied the impact of CSI 500 stock index futures trading on the price fluctuations of its underlying index [14] . Only a few scholars have studied the calendar effect of CSI 500 stock index futures. For example, Wu Xiaohua (2016) established a conditional heteroscedasticity model and found that the Shanghai and Shenzhen 300 stock index futures have a week effect, while the CSI 500 and SSE 50 stock index futures do not [15] .

According to the sample selection method of the CSI 500 Index, the CSI 500 stock index futures, which take the CSI 500 Index as their underlying, reflect the development of small and medium-sized companies. In addition, since the contract has been listed for less than three years, there is relatively little research on it. Therefore, it is of great significance to study the intraday effect and statistical arbitrage using intraday trading data for the CSI 500 stock index futures. It is hoped that the empirical research in this article will provide reference information for financial investors and expand the scope of research on the CSI 500 stock index futures.

2. Theory and Modeling Steps

2.1. Principle of Calendar Effect

Calendar effects are a kind of market anomaly. In financial markets, the cyclical behavior of financial indicators such as volatility, trading volume and trading frequency is related to the date; examples include seasonal effects, monthly effects, intra-week effects and maturity effects [16] . The intraday effect is one of them: financial variables fluctuate significantly during certain periods of the trading day.

Specifically, the seasonal effect refers to a phenomenon in which abnormal changes in a series are related to seasonal factors. The week effect, also known as the intra-week effect, means that the change in an economic indicator on a certain day of the week is significantly different from that on the other days. Similar to the week effect, there is also a workday effect, in which the series differs between workdays and non-workdays. The maturity effect, also known as the delivery date effect, refers to the abnormal fluctuation of the yield, volatility or volume of the underlying index on the settlement date of the financial product [17] . The intraday effect refers to significant fluctuations in financial variables during certain periods of the trading day.

2.2. Test Model of Intraday Effect

This paper uses the dummy variable regression method to test the intraday effect of the CSI 500 stock index futures trading indicators. A period dummy variable is set for every 5 minutes. The 4 hours of intraday trading contain 48 5-minute periods in total; to prevent multicollinearity, the number of dummy variables is reduced by one by excluding the last 5 minutes. That is, 47 dummy variables and a constant term are introduced in the regression model.

To prevent spurious regression and ensure the reliability of the parameter tests and estimates, a stationarity test, an autocorrelation test and a heteroscedasticity test should be performed on each test variable. The appropriate model is then selected based on the test results. When the series is stationary with no autocorrelation and no heteroscedasticity, the OLS model with dummy variables is selected, i.e.:

${X}_{t}={\alpha}_{0}+{\alpha}_{1}{h}_{1}+{\alpha}_{2}{h}_{2}+\cdots +{\alpha}_{46}{h}_{46}+{\alpha}_{47}{h}_{47}+{\epsilon}_{t}$ (1)

where ${X}_{t}$ represents the variable being tested for the intraday effect, ${\alpha}_{i},i=0,1,2,\cdots ,47$ are the constant term and coefficients, ${h}_{i}$ is a dummy variable, and ${\epsilon}_{t}$ is a random error term. To prevent multicollinearity, the number of dummy variables is one less than the number of periods; that is, the last 5-minute period of each trading day has no dummy variable. For 9:30 - 9:35, ${h}_{1}=1$ and ${h}_{2}={h}_{3}=\cdots ={h}_{47}=0$ . For 9:35 - 9:40, ${h}_{2}=1$ and ${h}_{1}={h}_{3}=\cdots ={h}_{47}=0$ . The other periods are defined analogously. When ${h}_{1}={h}_{2}={h}_{3}=\cdots ={h}_{47}=0$ , the observation falls in the last 5-minute period, so the significance of the constant term reflects the significance of that period.
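The dummy coding described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the session times (9:30 - 11:30 and 13:00 - 15:00) are inferred from the periods quoted in the text, and the function names `period_index`, `dummy_row` and `dummy_ols_estimates` are hypothetical. The last helper relies on a standard property of OLS with a full set of period dummies: the constant equals the mean of the baseline (last) period, and each ${\alpha}_{i}$ equals the mean of period i minus that baseline mean.

```python
from datetime import time
from statistics import mean

# Assumed sessions: 9:30-11:30 and 13:00-15:00, i.e. 48 five-minute periods.
MORNING_OPEN = 9 * 60 + 30
AFTERNOON_OPEN = 13 * 60
SESSION_MINUTES = 120
PERIODS_PER_SESSION = 24
N_DUMMIES = 47  # the last period (14:55-15:00) is the baseline


def period_index(bar_start: time) -> int:
    """Return the 1-based index (1..48) of the 5-minute period starting at bar_start."""
    minutes = bar_start.hour * 60 + bar_start.minute
    if MORNING_OPEN <= minutes < MORNING_OPEN + SESSION_MINUTES:
        return (minutes - MORNING_OPEN) // 5 + 1
    if AFTERNOON_OPEN <= minutes < AFTERNOON_OPEN + SESSION_MINUTES:
        return PERIODS_PER_SESSION + (minutes - AFTERNOON_OPEN) // 5 + 1
    raise ValueError(f"{bar_start} is outside the assumed trading sessions")


def dummy_row(bar_start: time) -> list[int]:
    """Return [h_1, ..., h_47]; the baseline period yields all zeros."""
    idx = period_index(bar_start)
    return [1 if idx == i else 0 for i in range(1, N_DUMMIES + 1)]


def dummy_ols_estimates(observations):
    """OLS estimates for equation (1) computed from period means.

    observations: iterable of (period_index, value) pairs in which every
    period that appears, including the baseline period 48, has data.
    Returns (alpha_0, {i: alpha_i}).
    """
    by_period = {}
    for idx, value in observations:
        by_period.setdefault(idx, []).append(value)
    alpha_0 = mean(by_period[48])
    alphas = {i: mean(v) - alpha_0 for i, v in by_period.items() if i != 48}
    return alpha_0, alphas
```

For instance, `period_index(time(13, 20))` returns 29, so the period 13:20 - 13:25 corresponds to ${h}_{29}$ under this coding.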

When the series is stationary with no autocorrelation but with heteroscedasticity, the GARCH model with dummy variables is chosen:

$\left\{\begin{array}{l}{X}_{t}={\alpha}_{0}+{\alpha}_{1}{h}_{1}+{\alpha}_{2}{h}_{2}+\cdots +{\alpha}_{47}{h}_{47}+{\epsilon}_{t},\\ {\sigma}_{t}^{2}={\beta}_{0}+\sum_{i=1}^{q}{\beta}_{i}{\epsilon}_{t-i}^{2}+\sum_{i=1}^{p}{\beta}_{i}{\sigma}_{t-i}^{2},\end{array}\right.$ (2)

where the two equations are the mean equation with dummy variables and the conditional variance equation, respectively. ${X}_{t}$ is the variable being tested for the intraday effect, ${\alpha}_{i},{\beta}_{i}$ are the constant terms and coefficients, and ${\epsilon}_{t}$ is a random error term. ${h}_{i},i=1,2,\cdots ,47$ is the dummy variable corresponding to the i-th 5-minute period of the day.
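To make the conditional variance equation in (2) concrete, the sketch below evaluates it for the simplest case p = q = 1. It is an illustration under stated assumptions rather than the estimation procedure used in the paper: the parameter names `beta0`, `arch_coef` and `garch_coef` are hypothetical labels for ${\beta}_{0}$ and the two first-order coefficients, the coefficients are taken as given instead of being estimated, and the recursion is started from the sample variance of the residuals, which is one common but arbitrary choice.

```python
def garch11_variance(residuals, beta0, arch_coef, garch_coef):
    """Conditional variances for a GARCH(1,1) variance equation.

    sigma2[t] = beta0 + arch_coef * residuals[t-1]**2 + garch_coef * sigma2[t-1]
    """
    if not residuals:
        return []
    # Assumed starting value: the sample variance of the residuals.
    start = sum(e * e for e in residuals) / len(residuals)
    sigma2 = [start]
    for eps in residuals[:-1]:
        sigma2.append(beta0 + arch_coef * eps * eps + garch_coef * sigma2[-1])
    return sigma2
```

A large squared residual in one period raises the conditional variance of the next period, which is the volatility clustering that the ARCH effect test looks for.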

When the series is stationary with autocorrelation but without heteroscedasticity, the ARMA model with dummy variables is chosen, namely:

$\left\{\begin{array}{l}f\left(t\right)={\varphi}_{0}+{\varphi}_{1}{X}_{t-1}+\cdots +{\varphi}_{p}{X}_{t-p}+{\epsilon}_{t}-{\theta}_{1}{\epsilon}_{t-1}-\cdots -{\theta}_{q}{\epsilon}_{t-q}\\ {X}_{t}=f\left(t\right)+{\alpha}_{0}+{\alpha}_{1}{h}_{1}+{\alpha}_{2}{h}_{2}+\cdots +{\alpha}_{47}{h}_{47}+{e}_{t}\end{array}\right.$ (3)

where $f\left(t\right)$ represents the ARMA(p,q) model, p is the autoregressive order, q is the moving average order, and ${\varphi}_{i}\left(i=1,2,\cdots ,p\right)$ and ${\theta}_{i}\left(i=1,2,\cdots ,q\right)$ are the coefficients of the autoregressive and moving average terms, respectively. Other symbols have the same meaning as above.

When the series is stationary with both autocorrelation and heteroscedasticity, the ARMA-GARCH model with dummy variables is selected, that is:

$\left\{\begin{array}{l}{X}_{t}=f\left(t\right)+{\alpha}_{0}+{\alpha}_{1}{h}_{1}+{\alpha}_{2}{h}_{2}+\cdots +{\alpha}_{47}{h}_{47}+{\epsilon}_{t}\\ {\sigma}_{t}^{2}={\beta}_{0}+\sum_{i=1}^{q}{\beta}_{i}{\epsilon}_{t-i}^{2}+\sum_{i=1}^{p}{\beta}_{i}{\sigma}_{t-i}^{2}\end{array}\right.$ (4)

where $f\left(t\right)$ represents the ARMA model, and the two equations are the mean equation with dummy variables and the ARMA terms, and the conditional variance equation, respectively. Other symbols have the same meaning as above. When the series is not stationary, it is differenced first; once stationarity is satisfied, the autocorrelation and heteroscedasticity tests are performed, and the appropriate model category is then selected.
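The selection rule that links the three diagnostic tests to models (1) to (4) can be summarized as a small decision function. This is a sketch for illustration only: the function names and the returned labels are hypothetical, and the 0.05 significance level is an assumption taken from the level the paper uses when reading its p values.

```python
def choose_model(stationary: bool, autocorrelated: bool, arch_effect: bool) -> str:
    """Map the diagnostic outcomes to the model category of Section 2.2."""
    if not stationary:
        return "difference the series and re-test"
    if autocorrelated and arch_effect:
        return "ARMA-GARCH with dummy variables"  # equation (4)
    if autocorrelated:
        return "ARMA with dummy variables"        # equation (3)
    if arch_effect:
        return "GARCH with dummy variables"       # equation (2)
    return "OLS with dummy variables"             # equation (1)


def choose_model_from_p_values(adf_p, autocorr_p, arch_p, level=0.05):
    """Same rule, starting from test p values.

    Null hypotheses: unit root (ADF), no autocorrelation, no ARCH effect.
    A p value below `level` rejects the corresponding null.
    """
    return choose_model(
        stationary=adf_p < level,
        autocorrelated=autocorr_p < level,
        arch_effect=arch_p < level,
    )
```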

2.3. Arbitrage Mechanism Based on Intraday Effect of Spread

In the existing literature, statistical arbitrage mainly relies on setting thresholds for opening and closing positions. Because such arbitrage is based on the predicted standardized residual series, the accuracy of the model prediction and the execution speed of the actual transaction affect the possibility and size of the arbitrage profit. According to the intraday effect test of the spread, the spread fluctuates significantly during the intraday effect periods. If positions can be opened and closed at time points known in advance, the arbitrage success rate and profitability can be increased. Combining this idea with practice, an arbitrage mechanism is established; the main trading strategy is as follows.

Arbitrage is carried out according to the intraday periods in which the in-sample spread shows a significant effect and the coefficients of the corresponding dummy variables. When the coefficient of a significant dummy variable in the model is positive, forward arbitrage is carried out over the intraday period corresponding to that variable: open a position at the beginning of the period, that is, buy the near-month contract and sell the far-month contract, and close the position at the end, that is, sell the near-month contract and buy the far-month contract. Conversely, when the coefficient is negative, reverse arbitrage is carried out. For example, assume that the spread between the near- and far-month contracts IC1709 and IC1712 shows an intraday effect at 9:55 - 10:00 am with a positive dummy variable coefficient. Then, according to the intraday effect arbitrage strategy, a forward transaction is carried out: the position is opened at 9:55 and closed at 10:00. In actual operation, the arbitrage effect may be poor, and the arbitrage plan needs to be corrected. The specific measures are as follows:

1) Select the best in-sample intraday effect test model. According to the periods and coefficients of the significant dummy variables in the model, simulate arbitrage on the in-sample closing prices of the near- and far-month contracts, and record the success rate of each arbitrage interval, the total success rate and the profitability;

2) When there are many significant variables, that is, many arbitrage opportunities within the day, and adjacent arbitrage intervals share the same arbitrage direction, try merging the adjacent intervals to reduce handling fees, and observe whether the arbitrage effect improves;

3) When the success rate of an arbitrage interval is not high and its profit is low, delete that arbitrage opportunity.
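Correction measure 2) can be illustrated with a short sketch that merges adjacent intervals sharing the same direction. The representation is an assumption made for the example: each interval is a tuple of zero-padded "HH:MM" start and end strings plus a direction, with +1 for forward and -1 for reverse arbitrage, and the function name `merge_adjacent` is hypothetical.

```python
def merge_adjacent(intervals):
    """Merge intervals that touch end-to-start and share the same direction.

    intervals: list of (start, end, direction) with zero-padded "HH:MM"
    strings, so that plain string sorting gives chronological order.
    """
    merged = []
    for start, end, direction in sorted(intervals):
        if merged and merged[-1][2] == direction and merged[-1][1] == start:
            # Extend the previous interval instead of opening a new trade.
            merged[-1] = (merged[-1][0], end, direction)
        else:
            merged.append((start, end, direction))
    return merged


example = [
    ("09:30", "09:35", +1),
    ("09:35", "09:40", +1),
    ("13:20", "13:25", +1),
    ("14:00", "14:05", -1),
]
print(merge_adjacent(example))
```

Each merge removes one round trip, which is why merging lowers total handling fees; intervals separated by a gap or with opposite directions are left untouched.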

2.4. Modeling Step

This paper establishes a dummy variable model to explore whether the intraday trading of CSI 500 stock index futures shows an intraday effect. If it does, a statistical arbitrage strategy is built on this basis. The modeling steps are as follows:

1) Data preprocessing. Use the intraday 5-minute data to generate the logarithmic rate of return (high-frequency yield), volume change rate, position change rate, price fluctuation range, the spread between the near- and far-month contracts, and 47 period dummy variables;

2) Stationarity, autocorrelation and heteroscedasticity tests. Perform stationarity tests on the five test variables of the near- and far-month contracts in the different samples, and difference the non-stationary series; then carry out the autocorrelation test and the ARCH effect test;

3) Test model selection. Select the appropriate test model based on the results of the stationarity, autocorrelation and ARCH effect tests;

4) Model fitting and analysis of test results. Construct a model for each test variable separately, and judge the form of the intraday effect from the significance of the dummy variable parameters in the fitted optimal model;

5) Construction and correction of the statistical arbitrage strategy. Construct the statistical arbitrage strategy according to the intraday effect test of the spread, verify it on the in-sample data, and modify the arbitrage intervals with unsatisfactory results.

3. Empirical Research

3.1. Data Sources

To understand the intraday effect of CSI 500 stock index futures more comprehensively and enhance the reliability of the results, the study compares different sampling periods and near- and far-month contracts. IC1705 (near-month contract) and IC1706 (far-month contract) from April 7 to April 20, 2017 (10 trading days), and IC1709 (near-month contract) and IC1712 (far-month contract) from July 7 to July 20, 2017 (10 trading days), are selected as two comparison samples. A total of 3840 high-frequency 5-minute trading data were used for the intraday effect study. In the statistical arbitrage strategy study, the closing prices of IC1709 (near-month contract) and IC1712 (far-month contract) from July 7 to July 20, 2017 (10 trading days) are used as in-sample data, and the closing prices of the 5 trading days starting on July 21, 2017 are used as out-of-sample forecast data. The data in this article come from futures trading software. Example data are shown in Table 1.

3.2. Test of Intraday Effect

3.2.1. Data Preprocessing

This paper takes the high-frequency yield, trading volume, position, spread and price fluctuation of stock index futures as the research objects. The price fluctuation range is defined by the highest and lowest prices. Taking logarithms makes the data more stable without changing the correlation between the data, and the logarithmic rate of return is additive, which gives it better statistical properties than the simple rate of return; the rate of return is therefore taken in logarithmic form [18] . Because changes within 1 minute are small, the 5-minute trading data are used to create the return series, volume change rate, position change rate, spread series and price fluctuation range, in which the rates of change are multiplied by 100 and expressed as percentages. The specific formulas are as follows.

$\left\{\begin{array}{l}l{r}_{t}=\left(\mathrm{ln}{p}_{t}-\mathrm{ln}{p}_{t-1}\right)\times 100\\ dvo{l}_{t}=\frac{vo{l}_{t}-vo{l}_{t-1}}{vo{l}_{t-1}}\times 100\\ dop{e}_{t}=\frac{op{e}_{t}-op{e}_{t-1}}{op{e}_{t-1}}\times 100\\ j{c}_{t}={p}_{1t}-{p}_{2t}\\ v{p}_{t}=\mathrm{ln}\left(h{p}_{t}\right)-\mathrm{ln}\left(l{p}_{t}\right)\end{array}\right.$ (5)

where $l{r}_{t}$ is the logarithmic yield, $dvo{l}_{t}$ is the volume change rate, $dop{e}_{t}$ is the position change rate, $j{c}_{t}$ is the spread series, and $v{p}_{t}$ is the price fluctuation range; ${p}_{t}$ and ${p}_{t-1}$ are the closing prices of the stock index futures at t and $t-1$ , $vo{l}_{t}$ and $vo{l}_{t-1}$ are the volumes at t and $t-1$ , $op{e}_{t}$ and $op{e}_{t-1}$ are the positions at t and $t-1$ , ${p}_{1t}$ is the closing price at time t of the near-month contract, ${p}_{2t}$ is the closing price at time t of the far-month contract, and $h{p}_{t}$ and $l{p}_{t}$ are the highest and lowest prices at time t.
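The transformations in equation (5) can be sketched as follows. The bar layout (dictionaries with `close`, `high`, `low`, `volume` and `open_interest` keys) and the function names are assumptions made for the example, not the authors' data format. One practical detail is made explicit: a change rate is undefined when the previous value is zero, so the sketch returns `None` in that case.

```python
import math


def change_rate(current, previous):
    """Percentage change; None when the previous value is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


def contract_variables(bars):
    """Compute lr, dvol, dope and vp of equation (5) for one contract.

    bars: chronological list of 5-minute bars, each a dict with keys
    close, high, low, volume, open_interest (assumed layout).
    Returns one dict per bar, starting from the second bar.
    """
    rows = []
    for prev, curr in zip(bars, bars[1:]):
        rows.append({
            "lr": (math.log(curr["close"]) - math.log(prev["close"])) * 100,
            "dvol": change_rate(curr["volume"], prev["volume"]),
            "dope": change_rate(curr["open_interest"], prev["open_interest"]),
            "vp": math.log(curr["high"]) - math.log(curr["low"]),
        })
    return rows


def spread_series(near_bars, far_bars):
    """jc_t = p_1t - p_2t: near-month close minus far-month close."""
    return [n["close"] - f["close"] for n, f in zip(near_bars, far_bars)]
```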

3.2.2. Variable Verification and Model Construction

In this paper, we first test the stationarity, autocorrelation and heteroscedasticity of the five intraday effect test variables, and select the appropriate model based on the test results. The test results are shown in Table 2.

As shown in Table 2, the ADF unit root test is used to test the stationarity of the logarithmic return rate, volume change rate, position change rate, price fluctuation range and spread series of the near- and far-month contracts of CSI 500 stock index futures. The results show that all variables reject the unit root null hypothesis and are stationary, except for the spread series, which does not reject it. Therefore, the spread must be differenced before regression analysis. For the logarithmic return of the IC1705, IC1706 and IC1712 contracts, the p value of the autocorrelation test is greater than 0.05, so the null hypothesis of no autocorrelation is not rejected; the other series show obvious autocorrelation. According to the ARCH effect test results in Table 2, only the change rate series of the IC1705, IC1706 and IC1709 contracts have p values greater than 0.05, so the null hypothesis is not rejected and there is no heteroscedasticity; the remaining series show an ARCH effect to varying degrees.

In the July sample, the far-month contract IC1712 had 27 5-minute periods with zero trading volume, so the autocorrelation and ARCH effect tests could not be performed on that series. The distribution of zero trading volume was therefore analyzed separately. The specific periods are shown in Table 3.

According to the statistical analysis of the intraday distribution of the 5-minute periods with zero trading volume for IC1712, the periods in which zero volume occurred twice among the 27 cases are 10:35 - 10:40, 11:20 - 11:25 and 11:25 - 11:30 in the morning, and 13:00 - 13:05, 13:15 - 13:20 and 14:20 - 14:25 in the afternoon. The periods in which zero volume occurred once are mainly distributed in the afternoon trading time, especially in the half hour from 13:25 to 13:55. To sum up, the probability of zero trading volume in IC1712 is relatively high within 10 minutes before the morning close and within 20 minutes after the afternoon opening.
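The tabulation behind this paragraph amounts to counting, for each 5-minute period, how many trading days had zero volume. A minimal sketch, assuming the bars are available as (period label, volume) pairs pooled over all sample days; the function name and the toy data are hypothetical and are not the paper's data:

```python
from collections import Counter


def zero_volume_counts(bars):
    """Count zero-volume occurrences per intraday period.

    bars: iterable of (period_label, volume) pairs pooled over all days.
    """
    return Counter(label for label, volume in bars if volume == 0)


# Toy data for illustration only.
toy_bars = [
    ("10:35-10:40", 0), ("10:35-10:40", 3), ("10:35-10:40", 0),
    ("13:00-13:05", 0), ("14:55-15:00", 12),
]
counts = zero_volume_counts(toy_bars)
print(counts.most_common())
```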

According to the test results in Table 2, an appropriate model is selected. The preliminary model categories are shown in Table 4.

Table 1. Data example.

Table 2. p values of the stationarity, autocorrelation and ARCH effect tests for each variable.

Table 3. Distribution of 5-minute periods with zero trading volume for IC1712 stock index futures.

Table 4. Categories of the initial model for each indicator variable.

3.2.3. Analysis of the Intraday Effect Situation

Following the preliminary model categories in Table 4, dummy variables were added to establish the models, and order identification, parameter estimation and model diagnosis were carried out for each indicator variable. The optimal model is selected by jointly considering the AIC criterion, the BIC criterion, parameter significance and model residual analysis. The time periods in which the intraday effect exists and the corresponding dummy variable coefficients are listed in Tables 5-9.
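For reference, the two information criteria used in this comparison follow the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximized log-likelihood; lower values are preferred. The sketch below only illustrates the comparison: the candidate names and numbers are made up, and the paper additionally weighs parameter significance and residual diagnostics, which are not modeled here.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def rank_candidates(candidates, n_obs):
    """Return candidate names sorted by BIC (lowest first).

    candidates: dict mapping a model name to (log_likelihood, n_params).
    """
    return sorted(candidates, key=lambda name: bic(*candidates[name], n_obs))


# Hypothetical fitted models, for illustration only.
hypothetical = {
    "ARMA(1,1)-GARCH(1,1)": (-512.4, 52),
    "ARMA(2,1)-GARCH(1,1)": (-511.9, 53),
}
print(rank_candidates(hypothetical, n_obs=480))
```

Because BIC penalizes each extra parameter by ln n rather than 2, it tends to favor the smaller model when the gain in log-likelihood is small.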

Table 5. Parameter estimation results of the intraday effect test for the logarithmic rate of return.

Table 6. Parameter estimation results of the intraday effect test for the volume change rate.

Table 7. Periods with significant intraday effect in the position change rate.

Table 8. Periods with significant intraday effect in the spread series under both samples.

Table 9. Summary of periods with significant intraday effect of price fluctuation amplitude series.

The dummy variable coefficients and the corresponding parameter estimates in the above table indicate that the CSI 500 stock index futures yield does have an intraday effect, and that the degree of the intraday effect differs across samples. In the July sample, the near-month IC1709 has a significant negative yield at 9:30 - 9:35, 10:40 - 10:45, 10:45 - 10:50 and 11:25 - 11:30, and the far-month IC1712 has a significant positive yield at 10:30 - 10:35 and 14:45 - 14:50. The intraday effect of the near-month contract is significant during 13:30 - 13:35, and the yield is negative.

According to the test results, the volume change rate of the near-month contract in the July sample changed significantly in different periods of the day. In the April sample, both the near- and far-month contracts show an intraday effect at 10:15 - 10:20 and 13:00 - 13:05. In general, the intraday effect of the near-month contract is significant in the two periods 13:00 - 13:05 and 14:25 - 14:30, and the coefficient of the corresponding dummy variable is negative, indicating a decrease in trading volume.

Table 7 shows the periods in which the position change rate changes significantly, overall and for the near- and far-month contracts. Overall, the position change rate fluctuates significantly during 9:30 - 9:55, that is, within 25 minutes of the opening. For the near-month contract, the intraday effect appears mainly during 9:30 - 10:10 and 13:00 - 13:05, that is, within 40 minutes after the morning opening and 5 minutes after the afternoon opening. For the far-month contract, apart from the 25 minutes after the morning opening, the significant period in the afternoon is 14:20 - 14:25.

According to the test results, the spread series has more periods with large intraday fluctuations than the other variables. From Table 8, the dummy variables that are significant under both samples are ${h}_{35}$ , ${h}_{32}$ , ${h}_{40}$ , ${h}_{33}$ , ${h}_{29}$ , ${h}_{41}$ and ${h}_{37}$ . That is, the intervals in which the intraday effect exists are mainly 13:20 - 13:25, 13:35 - 13:45, 13:50 - 13:55, 14:00 - 14:05 and 14:15 - 14:25. Comparing the dummy variable coefficients, the April sample has a negative spread effect during the significant periods, while the July sample has a positive spread effect and the spread is more volatile.

As can be seen from Table 9, the price fluctuation range of the CSI 500 stock index futures shows obvious intraday fluctuations during 9:30 - 9:45, the first 15 minutes after the opening. The near-month contract is more significant than the far-month contract, and the dummy variable coefficients are significant in every 5-minute period during the first 25 minutes after the opening. In the July sample, the periods with an intraday effect are mainly concentrated at 9:30 - 10:05, which is within 45 minutes of the opening. The April sample was relatively less affected by the intraday effect.

3.3. Statistical Arbitrage Based on the Intraday Effect of Spreads

As shown above, the spread series has significant intraday fluctuations, so the direction and timing of arbitrage are set according to the coefficients of the significant variables in the intraday effect test model of the spread series (see Table 10 for details). There are 20 intraday effect periods, that is, 20 arbitrage opportunities every day. In the five periods 10:25 - 10:30, 10:45 - 10:50, 10:55 - 11:00, 14:00 - 14:05 and 14:55 - 15:00, the variable coefficient is negative, and reverse arbitrage is adopted. Similarly, forward arbitrage is adopted for the other 15 periods, such as 9:30 - 9:35, 9:35 - 9:40 and 13:20 - 13:25.

According to this arbitrage design, simulated arbitrage is carried out on the in-sample data. The success rate of arbitrage in each period is shown in Table 11.

As shown in Table 11, simulated arbitrage was carried out on the 10 days of in-sample data: a total of 200 positions were opened, and 123 of them were closed at a profit. Overall, the arbitrage success rate is 61.5%; a success rate of 80% is obtained in the three intervals 14:00 - 14:05, 14:30 - 14:35 and 14:55 - 15:00. The arbitrage effect is slightly worse in the 13:30 - 13:35 and 14:35 - 14:40 intervals, with a success rate of 40%. Excluding fees, the total profit is 361.8 points, with a maximum single profit of 22 points and a maximum single loss of −14.2 points. To reduce handling fees and increase the possibility of profit, adjacent periods with the same arbitrage direction among the 20 intraday effect periods were merged, and the in-sample and out-of-sample arbitrage was simulated again to observe whether the arbitrage effect improved. The results are shown in Table 12.
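The bookkeeping behind these figures can be sketched as follows. The sketch assumes the spread is defined as near-month close minus far-month close, as in equation (5), so that a forward trade (long near, short far) earns the change in the spread over the interval and a reverse trade earns its negative. Fees and the contract multiplier are ignored, and the function names are hypothetical.

```python
def trade_pnl(spread_open, spread_close, direction):
    """Profit in index points for one arbitrage round trip.

    direction: +1 for forward arbitrage (buy near, sell far),
               -1 for reverse arbitrage (sell near, buy far).
    """
    return direction * (spread_close - spread_open)


def success_rate(pnls):
    """Share of trades closed at a strict profit."""
    wins = sum(1 for p in pnls if p > 0)
    return wins / len(pnls)


# 123 profitable trades out of 200, as reported for the in-sample test.
print(success_rate([1.0] * 123 + [-1.0] * 77))
```

With 20 intervals per day over 10 trading days there are 200 round trips, and 123 profitable ones give the reported 61.5%.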

After merging, the 20 significant periods are reduced to 12 arbitrage intervals. Under the modified arbitrage design, the arbitrage effect in the 10-day sample improves: the total success rate increases to 69.2%, and the profit after deducting handling fees is 41951.54 yuan, with a maximum single profit of 4284.81 yuan and a maximum single loss of −2877.28 yuan. When the same arbitrage strategy is simulated on 2 days of out-of-sample data, the total out-of-sample success rate reaches 83.3%, the profit is 7629.28 yuan, the maximum profit for a single arbitrage is 1521.907 yuan, and the maximum loss is 1799.34 yuan. In the 7 intervals 9:30 - 9:40, 9:55 - 10:00, 10:25 - 10:30, 13:20 - 13:45, 13:50 - 13:55, 14:00 - 14:05 and 14:55 - 15:00, the success rate is 100%, but the arbitrage fails in the 14:30 - 14:45 interval. When the out-of-sample forecast is extended, the success rate falls and the profit increase is small. The 5-day out-of-sample success rate is 66.7%, the total profit is 15424.53 yuan, and the maximum single profit is 3121.601 yuan. The maximum loss is the same as in the 2-day out-of-sample result. The success rate is 100% in the two intervals 13:50 - 13:55 and 14:00 - 14:05, and the success rate of the other arbitrage intervals generally falls.

Table 10. Intraday effect arbitrage design.

Note: Because the opening price at 9:30 am cannot be obtained, for the 9:30 - 9:35 arbitrage interval the position is opened at 9:31.

Table 11. Arbitrage success rate in the sample.

Table 12. Adjusted in-sample arbitrage results.

4. Conclusions

In this paper, we construct different virtual variable regression models for high-frequency yield, volume change rate, position change rate, price fluctuation range and price difference series by means of stationarity test, autocorrelation test and heteroscedasticity test. Then, from the sampling point and the distance of the contract, the intraday effect form of CSI 500 stock index futures is analyzed and summarized. And the statistical arbitrage strategy is constructed based on the manifestation of the intraday effect of the spread. When the coefficient of the significant variable in the intraday effect test model is positive, the intraday effect period corresponding to the variable is positively arbitrarily; the starting point is opened, and the ending time is closed. Secondly, the arbitrage strategy is revised according to the success rate of each arbitrage interval in the sample, and the modified arbitrage strategy is used to simulate the arbitrage of the sample closing price. The main conclusions are as follows:

Intraday effect of high frequency yield: The CSI 500 stock index futures yield does have an intraday effect, and the intraday effect is different under different samples. Overall, the high-frequency yields have changed significantly in at least four of the 48 5-minute periods of the day (20 minutes), and the intraday effect in the near-month contract is more pronounced than the far-month contract.

Intraday effect of volume change rate: The trading rate change rate of the recent month contract is significant in the two periods of 13:00 - 13:05 and 14:25 - 14:30, and the corresponding dummy variable coefficient is negative, indicating that the trading volume has decreased. There are 27 zero-volume cases in the 5-minute data of IC1712 stock index futures within 10 trading days, mainly distributed in the morning at 10:35 - 10:40, 11:20 - 11:25, 11:25 - 11:30, 13:00 - 13:05, 13:15 - 13:20 and 14:20 - 14:25.

Intraday effect of the rate of change in positions: The rate of change in positions has a significant intraday effect in both overall and near-term contracts. During the period of 9:30 - 9:55, the volume of positions changed significantly within 25 minutes of opening. The change rate of position holding in the near month contract mainly has intraday effect within 40 minutes after the opening of the morning and 5 minutes after the opening of the afternoon, while the intraday effect in the morning of the trading day in the far month contract is within 25 minutes after the opening of the morning, and the significant period in the afternoon is concentrated within 14:20-14:25.

Intraday effect of price volatility: CSI 500 stock index futures show a clear intraday effect in price fluctuation within the first 15 minutes after the open, and the effect is more significant in the near-month contract than in the far-month contract. The dummy variable coefficients are significant in each 5-minute period during the 25-minute opening period.

Intraday effect of the spread between near-month and far-month contracts: Among the five test variables, the spread is the one most affected by the intraday effect. Overall, the periods in which the intraday effect is concentrated are 13:20 - 13:25, 13:35 - 13:45, 13:50 - 13:55, 14:00 - 14:05 and 14:15 - 14:25.
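The findings above all rest on dummy variable regressions over the 48 five-minute periods of the trading day. The following is a minimal sketch of that idea, not the models estimated in this paper. It assumes sessions of 09:30 - 11:30 and 13:00 - 15:00 (inferred from the period times quoted above), and it omits the autocorrelation and heteroscedasticity corrections that the tests may call for. The function names `five_minute_periods` and `period_effects` are hypothetical. With a full set of period dummies and no intercept, the OLS coefficient of each dummy equals the sample mean of that period, which is what the code computes.

```python
from collections import defaultdict
from math import sqrt


def five_minute_periods():
    """Start times of the 48 five-minute periods.

    Assumed sessions: 09:30-11:30 and 13:00-15:00 (24 periods each).
    """
    labels = []
    for start, end in ((9 * 60 + 30, 11 * 60 + 30), (13 * 60, 15 * 60)):
        for minute in range(start, end, 5):
            labels.append(f"{minute // 60:02d}:{minute % 60:02d}")
    return labels


def period_effects(observations):
    """Estimate one coefficient per period from (period_label, value) pairs.

    Equivalent to OLS on a full set of period dummies without intercept:
    each coefficient is the per-period mean, and the t-statistic uses the
    pooled residual variance. Requires more observations than periods and
    a non-zero residual variance.
    Returns {label: (coefficient, t_statistic)}.
    """
    groups = defaultdict(list)
    for label, value in observations:
        groups[label].append(value)

    means = {g: sum(v) / len(v) for g, v in groups.items()}
    n_obs = sum(len(v) for v in groups.values())
    n_par = len(groups)
    rss = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    sigma2 = rss / (n_obs - n_par)  # pooled residual variance

    return {
        g: (means[g], means[g] / sqrt(sigma2 / len(groups[g])))
        for g in groups
    }
```

A period would then be flagged as showing an intraday effect when the absolute value of its t-statistic exceeds a chosen critical value; the choice of that value is left to the reader.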

Based on the intraday effect of the spread within the 10 trading days in the sample, and after revising the strategy by merging adjacent arbitrage intervals, the revised arbitrage strategy achieves a total in-sample success rate of 69.2% and a profit of 41951.54 yuan after deducting commission fees. Over the two trading days, the total arbitrage success rate is 83.3%, the profit is 7629.28 yuan, and the maximum arbitrage gain is 1521.907 yuan. The success rate is 100% in seven arbitrage periods: 9:30 - 9:40, 9:55 - 10:00, 10:25 - 10:30, 13:20 - 13:45, 13:50 - 13:55, 14:00 - 14:05 and 14:55 - 15:00. In sum, the statistical arbitrage strategy based on the intraday effect of the spread performs well. This paper also has shortcomings that call for further study and improvement; in particular, future studies could examine the intraday effect and statistical arbitrage further from the perspective of sample size.
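The trading rule, the merging of adjacent arbitrage intervals and the success rate summarized above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this paper: the function names are hypothetical, the critical value 1.96 is an assumed default, a negative coefficient is assumed to map to the reverse arbitrage direction, only intervals that share a direction are merged, and a trade is assumed to count as a success when its profit after fees is positive.

```python
def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def signals(effects, t_crit=1.96):
    """Turn significant period coefficients into trade intervals.

    effects: {(start, end): (coefficient, t_statistic)}
    Returns a list of (start, end, direction), where direction is +1 for a
    positive coefficient (positive arbitrage, as in the text) and -1 for a
    negative one (assumed here to mean the reverse position).
    """
    out = []
    for (start, end), (coef, t_stat) in effects.items():
        if abs(t_stat) > t_crit and coef != 0:
            out.append((start, end, 1 if coef > 0 else -1))
    return out


def merge_adjacent(intervals):
    """Merge intervals that touch end-to-start and share a direction."""
    merged = []
    for start, end, direction in sorted(intervals, key=lambda iv: to_minutes(iv[0])):
        if (
            merged
            and to_minutes(merged[-1][1]) == to_minutes(start)
            and merged[-1][2] == direction
        ):
            # Extend the previous interval instead of opening a new trade.
            merged[-1] = (merged[-1][0], end, direction)
        else:
            merged.append((start, end, direction))
    return merged


def success_rate(pnls):
    """Share of trades whose profit after fees is strictly positive."""
    return sum(1 for p in pnls if p > 0) / len(pnls)
```

For instance, two significant positive periods 13:20 - 13:25 and 13:25 - 13:30 would be merged into a single interval 13:20 - 13:30, so that one position is opened at 13:20 and closed at 13:30 rather than two positions being traded back to back.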

References

[1] Wu, G.-H. (2014) Empirical Study on the Liquidity of Domestic Stock Index Futures. Fudan University, Shanghai.

[2] Wang, W. and Zhai, H. (2015) Research on the Adjustment Method of Intra-Day Effect of UHF Data. Chinese Management Science, 23, 49-56.

[3] Wang, S. (2014) Modeling of Jump Behavior and Volatility. Xiamen University, Xiamen.

[4] Lin, X.-Y., Dai, H.-X. and Gan, Y.-X. (2015) Study on the Calendar Effect of Securities Market Friday and Stock Index Futures Due Date. Journal of Guizhou University of Finance and Economics, No. 5, 48-57.

[5] Zhao, X.-J. (2015) Analysis of the Arbitrage of Shanghai and Shenzhen 300 Stock Index Futures Based on the Intraday Effect. Journal of Management Sciences, 18, 73-86.

[6] Feng, G.-Y. (2013) An Empirical Study on the Function of Stock Index Futures in China. Southwest Jiaotong University, Chengdu.

[7] Yang, H.-Q. (2015) Research on Calendar Effect of Stock Index Futures in China. Hunan University, Changsha.

[8] Zhou, S.-J. (2016) Research on Dynamic Optimal Hedging Ratio of Stock Index Futures Based on Copula-Realized GARCH Model. Nanjing University, Nanjing.

[9] Yang, Y. (2016) An Empirical Study on the Impact of China’s Stock Index Futures Trading on the Volatility of Related Spot Markets. Xinjiang University of Finance and Economics, Urumqi.

[10] Hao, N. (2016) Research on Arbitrage Strategy Based on CSI 500 Index. Modern Economic Information, No. 15, 291.

[11] Qin, L.-W., Tang, G.-Q. and Lin, J. (2016) Study on Optimal Threshold Statistical Arbitrage Based on Co-Integration-GARCH Model. Journal of Guilin University of Technology, 36, 625-631.

[12] Zhang, Z. and Xu, W. (2017) Study on the Influence of Data Frequency of Stock Index Futures on Statistical Arbitrage Performance—Time-Based Trading Mechanism Based on Dynamic Prediction Interval. Scientific Decision, No. 2, 61-75.

[13] Xu, B.-J. (2016) Analysis of the Impact of Stock Index Futures on the Index Volatility—Based on the CSI 500 Stock Index Futures. Financial Economy, 10, 87-89.

[14] Bai, W. (2017) Analysis of the Impact of CSI 500 Stock Index Futures on Stock Market Volatility. Inner Mongolia University, Hohhot.

[15] Wu, X.-H. (2016) Study on the Weekday Effect of China’s Stock Index Futures Market. Friends of Accounting, No. 10, 86-89.

[16] Zhao, Y. (2012) Study on Risk Management of Soybean Futures Price Fluctuation. Beijing Institute of Technology Press, Beijing, 62-63.

[17] Liang, W. (2016) Empirical Analysis of the Calendar Effect of SSE 50 ETF. Hebei Finance Institute, Baoding.

[18] Lu, X.-G. (2011) An Empirical Study on the Weekly Calendar Effect of China’s Shanghai and Shenzhen Index. Qingdao University, Qingdao.