Modeling and forecasting volatility underpins portfolio allocation, capital asset pricing and risk management, and has long been a central topic in finance. As high-frequency data have become more accessible and research on them has deepened, using high-frequency data to estimate and model volatility has become a major trend in financial research. Volatility measures that are based on nonparametric methods and use high-frequency data are called Realized Volatility (RV); RV is constructed from high-frequency intra-day returns. Andersen and Bollerslev, Andersen et al., and Barndorff-Nielsen and Shephard pointed out that realized volatility can be estimated by sampling intra-day returns at high frequency, and proved theoretically that it is a consistent estimator of the integrated volatility of the asset price process.
As research has deepened, many scholars have found that information arriving during the non-trading (overnight) period of stock markets has an important influence on volatility (Hansen and Lunde; Taylor; Ahoniemi and Lanne; Oldfield; Tsiakas). The existing literature on realized volatility in stock markets has adopted several approaches to overnight volatility. Hansen and Lunde first proposed combining the squared overnight return with the realized variance of the trading period using optimized weights. Ahoniemi and Lanne then verified the effectiveness of this method empirically. Christoffersen proposed adding the squared overnight return to the daily volatility measure of the trading period. Maderitsch considered the impact of structural changes in asset prices and improved the volatility measure of Hansen and Lunde. In studying the overnight effect on volatility in the Chinese stock market, Sun decomposed the realized volatility of the trading period into a continuous path variation component and a jump component, and combined them with overnight volatility to form a daily volatility measure through the HAR-CJN model. Ma et al. added the overnight return as an explanatory variable to a high-frequency volatility model to study its impact on prediction accuracy.
In general, existing methods neither analyze how asset volatility differs across periods of the day nor study the correlation among, and the combination of, the volatilities of different intraday periods. As pointed out by Ahoniemi and Lanne, how to make optimal use of overnight information in the context of realized volatility remains an issue that needs further study. As an emerging market, the Chinese stock market is in a stage of rapid development and continuous improvement, and it displays its own pattern of changes. From 2014 to 2016 in particular, the large swings of the Chinese stock market highlighted its changeability and vulnerability. Studying how to integrate overnight variance and how it affects stock price volatility enables us to measure volatility more accurately, deepens our understanding of stock market volatility and improves risk management in the Chinese stock market, thereby promoting its stable and healthy development.
2. Realized Volatility and Overnight Effect
2.1. Realized Volatility
Let $p^*(t)$ represent the logarithmic price process of a financial asset, whose generating mechanism can be expressed as the stochastic differential equation
$$dp^*(t) = \mu(t)\,dt + \sigma(t)\,dw(t),$$
where μ(t) denotes the drift rate, σ(t) denotes the volatility, and w(t) denotes a standard Brownian motion.
The true volatility of $p^*$ on the t-th day can be defined as $IV_t = \int_{t-1}^{t} \sigma^2(s)\,ds$. As the integral of volatility, $IV_t$ is also called the Integrated Volatility of the t-th day, and $r_t = p^*(t) - p^*(t-1)$ is defined as the intra-day return of the t-th day.
Andersen and Bollerslev proposed the definition of Realized Volatility (RV). Let M + 1 denote the number of prices observed at equal intervals during the day, that is, $p^*_{t,0}, p^*_{t,1}, \ldots, p^*_{t,M}$. Then there are M returns in a day, and the intra-day return of the j-th observation period is defined as $r_{t,j} = p^*_{t,j} - p^*_{t,j-1}$, $j = 1, \ldots, M$. The realized volatility is defined as
$$RV_t = \sum_{j=1}^{M} r_{t,j}^2.$$
Barndorff-Nielsen and Shephard proved that, in the absence of jumps in asset prices and by the theory of quadratic variation, $RV_t$ converges in probability to the integrated volatility as $M \to \infty$, that is, $RV_t \xrightarrow{p} IV_t$. In other words, if the sampling frequency of intra-day returns is high enough, $RV_t$ can be regarded as a consistent estimator of the real volatility.
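The definition above can be illustrated with a minimal sketch. The function names and the sample prices are hypothetical and chosen only for illustration; the code simply sums the squared log returns between M + 1 equally spaced prices.

```python
import math

def log_returns(prices):
    """Return the M log returns implied by M + 1 equally spaced prices."""
    logs = [math.log(p) for p in prices]
    return [logs[j] - logs[j - 1] for j in range(1, len(logs))]

def realized_volatility(prices):
    """Realized volatility of one day: the sum of squared log returns."""
    return sum(r * r for r in log_returns(prices))

# Hypothetical prices observed at equal intervals within one day.
day_prices = [100.0, 100.5, 100.2, 100.9, 100.4]
rv = realized_volatility(day_prices)
```

A flat price path gives a realized volatility of zero, and the value grows with the size of the price moves between consecutive observations.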
2.2. The Overnight Effect of China’s Stock Market Volatility
Trading in the Chinese stock market is concentrated on the Shanghai Stock Exchange (SSE) and the Shenzhen Stock Exchange (SZSE). Trading hours are 9:30 to 11:30 and 13:00 to 15:00, that is, 4 hours of trading each day. However, since asset prices change all the time, describing the price changes of the whole day with information observed in only four hours is inaccurate. It is therefore necessary to consider price changes during non-trading hours.
We define the intra-day return as the difference between the logarithm of the day's closing price and the logarithm of the previous day's closing price, so that a day runs from the previous close to the current close. Hence, we can divide the intra-day return into four periods:
Phase I. Overnight period from 15:00 on the previous day to 9:30 on the current day. The overnight return of a stock is defined as the difference between the logarithm of the day's opening price at 9:30 and the logarithm of the previous day's closing price at 15:00, represented by $r_{n,t}$;
Phase II. Morning session from 9:30 to 11:30. The morning return of a stock is obtained in the same way as the intra-day return, represented by $r_{m,t}$;
Phase III. Lunch break from 11:30 to 13:00. The midday return of a stock is defined as the difference between the logarithm of the afternoon opening price at 13:00 and the logarithm of the morning closing price at 11:30, represented by $r_{l,t}$;
Phase IV. Afternoon session from 13:00 to 15:00. The afternoon return of a stock is obtained in the same way as the intra-day return, represented by $r_{a,t}$.
The volatilities of the overnight and midday returns are expressed as the squared returns, that is, $r_{n,t}^2$ and $r_{l,t}^2$. The volatilities of the morning and afternoon returns are represented by realized volatility, that is, $RV_{m,t}$ and $RV_{a,t}$. The partition into four time periods is shown in Figure 1.
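The four-period split can be sketched as follows. The function names, the dictionary keys and the sample prices are assumptions made for illustration: the overnight and midday components are squared log returns, and the two session components are realized volatilities.

```python
import math

def sq_log_return(p_start, p_end):
    """Squared log return between two prices."""
    return (math.log(p_end) - math.log(p_start)) ** 2

def realized_volatility(prices):
    """Sum of squared log returns between consecutive prices."""
    logs = [math.log(p) for p in prices]
    return sum((logs[j] - logs[j - 1]) ** 2 for j in range(1, len(logs)))

def daily_components(prev_close, morning_prices, afternoon_prices):
    """Split one day into overnight, morning, midday and afternoon volatility.

    morning_prices run from 9:30 to 11:30 and afternoon_prices from 13:00
    to 15:00, both at equal intervals and including the session endpoints.
    """
    return {
        "overnight": sq_log_return(prev_close, morning_prices[0]),
        "morning": realized_volatility(morning_prices),
        "midday": sq_log_return(morning_prices[-1], afternoon_prices[0]),
        "afternoon": realized_volatility(afternoon_prices),
    }

# Hypothetical prices: the afternoon opens at the morning close,
# so the midday component is zero in this example.
parts = daily_components(100.0, [101.0, 101.5, 101.2], [101.2, 100.8, 101.0])
```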
To compare daily volatility across the different periods, we use 1-minute high-frequency data for the Shanghai Composite Index and the Shenzhen Component Index from January 2, 2014 to November 2, 2015, comprising 429 days of valid data for the Shanghai Composite Index and 437 days for the Shenzhen Component Index. Because some of the 1-minute observations of the Shanghai Composite Index are missing or abnormal, we first preprocess the data by interpolation so that the analysis reflects the actual situation. To reduce the influence of microstructure noise in high-frequency data, we conduct the analysis at sampling intervals of 5, 10, 15, 20 and 30 minutes. The results show that the statistical characteristics of the return volatilities display roughly the same pattern across these intervals. Table 1 and Table 2 show the statistical characteristics of realized volatility obtained at the 5-minute sampling interval for the two indexes, respectively.
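Moving from 1-minute data to coarser sampling intervals can be sketched as below. This is only an illustration under stated assumptions: the price path is made up, a session is taken to contain 121 one-minute prices with both endpoints included, and `subsample` is a hypothetical helper, not the preprocessing actually used in the paper.

```python
import math

def subsample(prices, step):
    """Keep every `step`-th price, starting from the first observation."""
    return prices[::step]

def realized_volatility(prices):
    """Sum of squared log returns between consecutive prices."""
    logs = [math.log(p) for p in prices]
    return sum((logs[j] - logs[j - 1]) ** 2 for j in range(1, len(logs)))

# Hypothetical 1-minute prices for one two-hour session (121 observations,
# both endpoints included); the values are made up for illustration.
minute_prices = [100.0 + 0.01 * ((7 * i) % 11) for i in range(121)]

# Realized volatility of the same session at each sampling interval.
rv_by_interval = {
    step: realized_volatility(subsample(minute_prices, step))
    for step in (5, 10, 15, 20, 30)
}
```

Because 120 minutes is divisible by each of the five intervals, every subsampled series starts at the session open and ends at the session close.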
It can be seen from Table 1 and Table 2 that, for both the Shanghai Composite Index and the Shenzhen Component Index, the volatilities of the morning returns are large. The volatilities of the afternoon returns are smaller than those of the morning, which is basically consistent with the reality of stock markets. The volatilities of the overnight period are slightly smaller than those of the afternoon, but are of the same order of magnitude as the average volatilities over the trading period. Hence, the overnight effect is evident and should not be ignored. Because the volatility of midday returns is extremely small compared with that of the other three periods, in the following analysis we take overnight volatility, morning realized volatility and afternoon realized volatility as the main components of daily volatility.
Figure 1. The decomposition of the daily volatility structure of Chinese stock market.
Table 1. Daily volatility statistical characteristics of Shanghai Composite Index at 5-minute sampling interval.
Source: Data from Shanghai Stock Exchange.
Table 2. Daily volatility statistical characteristics of Shenzhen Component Index at 5-minute sampling interval.
Source: Data from Shenzhen Stock Exchange.
3. Research Methods
The intra-day return of the stock market is defined as the difference between the logarithm of the day's closing price and the logarithm of the previous day's closing price. According to the division into four time periods in Figure 1, the intra-day return of the t-th trading day can be written as $r_t = r_{n,t} + r_{m,t} + r_{l,t} + r_{a,t}$, where $r_{n,t}$ and $r_{l,t}$ are the overnight and midday returns, and $r_{m,t}$ and $r_{a,t}$ are the morning and afternoon returns of the trading sessions. The volatility of the latter two is expressed by the realized volatility, the sum of squared returns over the trading time, that is, $RV_{m,t} = \sum_{j=1}^{m} r_{m,t,j}^2$ and $RV_{a,t} = \sum_{j=1}^{m} r_{a,t,j}^2$, where $r_{m,t,j}$ and $r_{a,t,j}$ respectively represent the morning and afternoon returns over m equal time intervals of the trading sessions. Corresponding to the four periods of the intra-day return, the implied integrated volatility can be divided into four parts, namely, $IV_t = IV_{n,t} + IV_{m,t} + IV_{l,t} + IV_{a,t}$.
To simplify notation, we define  to represent conditional expectations. The basic hypotheses discussed below are:
Assumption I. ;
Assumption II. ;
Assumption III. ;
Assumption IV. ;
where are constants, respectively.
Assumption I states that the expected value of the integrated volatility of each time period is a fixed proportion of the integrated volatility of the whole day. Assumption II requires that the conditional deviation rates of the volatilities of the morning and afternoon returns are proportional to the integrated volatilities of the corresponding period. Assumption III requires that the overnight and midday volatilities are also proportional to the integrated volatilities of the corresponding period. Assumption IV means that the integrated volatilities of different periods are uncorrelated. Assumption IV is imposed to simplify the problem and generally does not hold in practice.
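To describe stock market volatility more accurately, we build on the model of Hansen and Lunde: we subdivide the trading hours into a morning and an afternoon session and add the volatility of overnight returns. Hence, we define the daily volatility measure of the Chinese stock market as a linear combination of overnight volatility, morning realized volatility and afternoon realized volatility:
$$XRV_t(\omega) = \omega_1 r_{n,t}^2 + \omega_2 RV_{m,t} + \omega_3 RV_{a,t},$$
where $r_{n,t}^2$ is the squared overnight return, $RV_{m,t}$ and $RV_{a,t}$ are the morning and afternoon realized volatilities, and $\omega = (\omega_1, \omega_2, \omega_3)$ is the parameter to be estimated.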
To facilitate this discussion, we define unconditional expectations . Then we have the following conclusions.
Theorem 1. For all that satisfy , there is .
Proof of Theorem 1: see Appendix.
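The role of the restriction in Theorem 1 can be illustrated numerically. The sketch below assumes the restriction is linear in the weights, of the form w1·μ1 + w2·μ2 + w3·μ3 = μ0, in the spirit of Hansen and Lunde; the names `combine`, `w1`, `w2`, `w3`, `mu0` and all data values are hypothetical. Under that assumption, any weights satisfying the restriction give a combined measure whose sample mean equals the target mean.

```python
def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def combine(weights, overnight_sq, rv_morning, rv_afternoon):
    """Daily measure: weighted sum of the three components, day by day."""
    w1, w2, w3 = weights
    return [w1 * n + w2 * m + w3 * a
            for n, m, a in zip(overnight_sq, rv_morning, rv_afternoon)]

# Hypothetical daily series of the three volatility components.
overnight_sq = [1.0e-5, 4.0e-5, 2.5e-5, 0.5e-5]
rv_morning = [8.0e-5, 12.0e-5, 6.0e-5, 10.0e-5]
rv_afternoon = [5.0e-5, 7.0e-5, 4.0e-5, 6.0e-5]
mu0 = 2.0e-4  # hypothetical target mean of daily volatility

mu1, mu2, mu3 = mean(overnight_sq), mean(rv_morning), mean(rv_afternoon)

# Choose w1 and w2 freely, then solve the linear restriction for w3.
w1, w2 = 1.5, 1.2
w3 = (mu0 - w1 * mu1 - w2 * mu2) / mu3
xrv = combine((w1, w2, w3), overnight_sq, rv_morning, rv_afternoon)
```

The restriction leaves two free weights, which is what makes the later variance minimization a non-trivial problem.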
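The purpose of the volatility measure, written $XRV_t(\omega)$ for a weight vector ω, is to describe the integrated volatility $IV_t$ as closely as possible, and the difference between them can be measured by the mean square error. The problem thus turns into the following optimization problem:
$$\min_{\omega \in \Theta} E\left[\left(XRV_t(\omega) - IV_t\right)^2\right],$$
where Θ is the feasible set of the parameter ω.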
In fact, since the integrated volatility is the integral of instantaneous volatility, it is an unobservable latent variable. Moreover, asset prices are themselves contaminated by microstructure noise, so an accurate measurement of the actual integrated volatility is not available and other methods need to be considered. We therefore give the following theorem.
Theorem 2. For , if , then is equivalent to , where is the variance of .
Proof of Theorem 2: see Appendix.
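One way to see why a mean square error criterion can be replaced by a variance criterion is the law of total variance. The derivation below is a sketch that assumes the condition in Theorem 2 is conditional unbiasedness, $E[X_\omega \mid Y] = Y$; the symbols $X_\omega$ (the volatility measure) and $Y$ (the integrated volatility) are shorthand introduced here for illustration.

```latex
\begin{aligned}
\operatorname{Var}(X_\omega)
  &= E\left[\operatorname{Var}(X_\omega \mid Y)\right]
     + \operatorname{Var}\left(E[X_\omega \mid Y]\right) \\
  &= E\left[(X_\omega - Y)^2\right] + \operatorname{Var}(Y),
\end{aligned}
\qquad\text{so}\qquad
E\left[(X_\omega - Y)^2\right] = \operatorname{Var}(X_\omega) - \operatorname{Var}(Y).
```

Since $\operatorname{Var}(Y)$ does not depend on ω, minimizing the mean square error over ω and minimizing $\operatorname{Var}(X_\omega)$ over ω select the same weights under this assumption.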
From Theorem 2, the original problem can be transformed into an optimization problem:
Hence, we have the following conclusions.
Theorem 3. The solution of the optimization problem is
In particular, when Assumption IV is true, that is, , , then we have
Proof of Theorem 3: see Appendix.
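A numerical counterpart of the constrained variance minimization can be sketched as follows. This is not the closed form of Theorem 3: it is the generic Lagrangian solution of minimizing w'Σw subject to w'μ = μ0, namely w = μ0·Σ⁻¹μ / (μ'Σ⁻¹μ), where Σ is the sample covariance matrix of the three components and μ their sample means. The linear restriction, the function names and the data (in arbitrary scaled units) are assumptions made for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Sample variance with the n - 1 divisor."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def covariance_matrix(series):
    """Sample covariance matrix of a list of equally long series."""
    n = len(series[0])
    means = [mean(s) for s in series]
    return [[sum((a - ma) * (b - mb) for a, b in zip(sa, sb)) / (n - 1)
             for sb, mb in zip(series, means)]
            for sa, ma in zip(series, means)]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (aug[r][k] - tail) / aug[r][r]
    return x

def min_variance_weights(series, mu0):
    """Weights minimizing the variance of the combination subject to w'mu = mu0."""
    mu = [mean(s) for s in series]
    z = solve(covariance_matrix(series), mu)  # z = Sigma^{-1} mu
    scale = mu0 / sum(m * zi for m, zi in zip(mu, z))
    return [scale * zi for zi in z]

def combination(weights, series):
    """Day-by-day weighted sum of the component series."""
    return [sum(w * s[t] for w, s in zip(weights, series))
            for t in range(len(series[0]))]

# Hypothetical component series (overnight, morning, afternoon), scaled units.
series = [
    [0.2, 0.9, 0.4, 1.5, 0.3, 0.7],
    [1.1, 1.8, 0.9, 2.6, 1.2, 1.4],
    [0.8, 1.0, 0.7, 1.9, 0.6, 1.3],
]
mu0 = 3.0  # hypothetical target mean of the daily measure
weights = min_variance_weights(series, mu0)
```

When the components are uncorrelated, Σ is diagonal and the same formula gives weights proportional to each component's mean divided by its variance, which is the kind of simplification Assumption IV allows.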
4. Empirical Analysis
4.1. Selection of Sample Data
The data used in the empirical analysis are again the 1-minute closing prices of China's stock market indexes from January 2, 2014 to November 2, 2015. Most studies have confirmed that 5-minute data can be considered almost unaffected by the microstructure noise of high-frequency data, and our analysis of the 5-, 10-, 15-, 20- and 30-minute intervals shows roughly the same pattern. In the following empirical part we therefore mainly analyze the Shanghai Composite Index and the Shenzhen Component Index at a sampling interval of 5 minutes. At this interval, 50 observations are recorded every trading day. This gives a total of 21,450 observations covering 429 trading days for the Shanghai Composite Index, and 21,850 observations covering 437 trading days for the Shenzhen Component Index.
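The sample sizes quoted above can be checked with a short calculation. The sketch assumes that both endpoints of each trading session are recorded at the 5-minute interval, which is the counting that reproduces 50 observations per day; `session_stamps` is a hypothetical helper.

```python
from datetime import datetime, timedelta

def session_stamps(start, end, minutes):
    """Time stamps from start to end inclusive, spaced `minutes` apart."""
    t0 = datetime.strptime(start, "%H:%M")
    t1 = datetime.strptime(end, "%H:%M")
    stamps = []
    while t0 <= t1:
        stamps.append(t0.strftime("%H:%M"))
        t0 += timedelta(minutes=minutes)
    return stamps

morning = session_stamps("09:30", "11:30", 5)    # 25 stamps
afternoon = session_stamps("13:00", "15:00", 5)  # 25 stamps
per_day = len(morning) + len(afternoon)

shanghai_total = 429 * per_day
shenzhen_total = 437 * per_day
```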
4.2. Correlation Analysis of Returns and Volatilities
In Assumption IV, we assume that the volatilities of different time periods are uncorrelated. Since this is a very strong assumption, we first need to test the correlation between returns and volatilities in practical applications.
We conduct an autocorrelation analysis of the 5-, 10-, 15-, 20- and 30-minute return series of the two stock indexes. The results show no autocorrelation in the intraday return series at any of these intervals. We then conduct correlation tests on the series of overnight volatility, morning realized volatility and afternoon realized volatility. The results are shown in Table 3 and Table 4.
As can be seen from the sample correlation coefficients in Table 3 and Table 4, the overnight volatility is correlated with both the morning realized volatility and the afternoon realized volatility. In this case, the existence of correlation cannot be ignored. Hence, we use Equation (5) to calculate the daily volatility.
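The sample correlation coefficients referred to above can be computed as in the following sketch. It uses the standard Pearson formula; the function names and the three short series are hypothetical and are not the data behind Table 3 and Table 4.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(series):
    """Pairwise correlations of a dict mapping names to series."""
    names = list(series)
    return {(a, b): pearson(series[a], series[b]) for a in names for b in names}

# Hypothetical daily volatility components in scaled units.
components = {
    "overnight": [0.2, 0.9, 0.4, 1.5, 0.3, 0.7],
    "morning": [1.1, 1.8, 0.9, 2.6, 1.2, 1.4],
    "afternoon": [0.8, 1.0, 0.7, 1.9, 0.6, 1.3],
}
corr = correlation_matrix(components)
```

Off-diagonal entries that are clearly different from zero indicate that a no-correlation assumption is not appropriate for the data at hand.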
4.3. Parameter Estimation and Comparison of Volatility Measures
Using the result of Theorem 3, we calculate the parameters at the 5-, 10-, 15-, 20- and 30-minute sampling intervals. As the results obtained at different intervals do not differ much, we choose the 5-minute interval as the representative for the following analysis. Substituting the return series into the formula in Theorem 3, we obtain the parameters shown in Table 5.
Substituting the estimated parameters into the daily volatility measure defined in Section 3, we obtain the daily volatility measure that we construct (denoted XRV to distinguish it from the others). We take the square of the intra-day return as a proxy for the real volatility, and use RV to denote the daily volatility measure constructed only from high-frequency data of the trading hours, without taking the overnight effect into account. We calculate the mean and variance of these three volatility measures for the Shanghai Composite Index and the Shenzhen Component Index respectively, and show the statistical characteristics in Table 6.
Table 3. Sample correlation coefficients for overnight volatility, morning realized volatility and afternoon realized volatility of Shanghai Composite Index.
Source: Data from Shanghai Stock Exchange.
Table 4. Sample correlation coefficients for overnight volatility, morning realized volatility and afternoon realized volatility of Shenzhen Component Index.
Source: Data from Shenzhen Stock Exchange.
Table 5. Results of parameters at a 5-minute sampling interval.
It can be seen that the daily volatility measure XRV that we construct is closer to the real volatility value than the realized volatility measure RV, which does not consider the overnight effect.
Finally, we use the mean square error (MSE), the most commonly used loss function for this kind of comparison, to measure the error. The smaller the expected squared difference between the estimated value and the real value, the more accurate the measure. With $\hat{\sigma}_t^2$ denoting a volatility measure (RV or XRV), $\sigma_t^2$ the real volatility proxy and T the number of trading days, the specific definition is
$$MSE = \frac{1}{T}\sum_{t=1}^{T}\left(\hat{\sigma}_t^2 - \sigma_t^2\right)^2,$$
and the MSEs between the volatility measures and the real volatility are shown in Table 7.
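The comparison criterion can be written as a short function. The series below are hypothetical values in scaled units and do not reproduce Table 7; `proxy` stands for the squared intra-day return used as the real volatility, and the two candidate lists stand for any two measures being compared.

```python
def mse(estimates, targets):
    """Mean square error between a volatility measure and the target series."""
    if len(estimates) != len(targets):
        raise ValueError("series must have the same length")
    return sum((e - t) ** 2 for e, t in zip(estimates, targets)) / len(targets)

# Hypothetical daily values: a proxy for real volatility and two measures.
proxy = [2.0, 3.5, 1.0, 4.0]
candidate_a = [1.5, 2.5, 1.0, 2.5]
candidate_b = [2.0, 3.0, 1.5, 3.5]

mse_a = mse(candidate_a, proxy)
mse_b = mse(candidate_b, proxy)
```

The measure with the smaller value is the one that tracks the proxy more closely on the sample in question.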
As Table 7 shows, the mean square errors of the daily volatility measure we build (XRV) are smaller than the corresponding values of the measure that ignores overnight variance and time segments (RV).
Based on the above results, we conclude that the daily volatility measure that accounts for overnight variance and time segments (XRV) is superior to the realized volatility (RV) and reflects actual volatility more comprehensively.
Table 6. Statistical characteristics of volatility measures at a 5-minute sampling interval.
Table 7. The mean square error between the volatility measures and the real volatility at a 5-minute sampling interval.
5. Conclusions
Because it uses more intra-day data, the realized volatility measure based on high-frequency return series shows better statistical properties than parametric models in characterizing historical volatility. However, asset prices change continuously, whereas the trading hours of stock markets account for only a small part of the day and are split into a morning and an afternoon session. Hence, the realized volatility computed from returns during trading time alone cannot fully characterize daily volatility. Based on this observation, the main work of this paper is to establish an optimized realized volatility statistic by analyzing and processing high-frequency trading data from China's stock market, and to compare it with the original measure by mean square error. The main empirical results show that, for both the Shanghai Composite Index and the Shenzhen Component Index, the daily volatility measure that accounts for overnight variance and time segments is superior to the realized volatility measure that does not.
This paper proposes a daily volatility measure for the Chinese stock market that accounts for overnight variance and time segments. The approach helps us better understand the volatility structure of the Chinese stock market and measure volatility more accurately. High-frequency return series are affected by microstructure noise, and asset prices exhibit jumps in addition to continuous changes; both factors introduce deviations into the realized volatility measure. Improving the volatility measure to account for microstructure noise and price jumps is therefore a direction for future work.
Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.
Appendix
1) Proof of Theorem 1:
By hypothesis, there is
and by the law of total expectation, we have
then we have
2) Proof of Theorem 2:
We define , and by conditional expectation formula,
Therefore, by the law of total expectation, we can have
3) Proof of Theorem 3:
Consider the minimum variance of the linear combination of random variables , that is, .
To simplify the calculation, let , . Take the partial derivative with respect to a and b respectively, and set the partial derivative to 0, then we have
By the condition and normalizing it, we have . And let, then we have . Bring them into the above results and we have
and we can get α and β.
References
Andersen, T.G. and Bollerslev, T. (1998) Deutsche Mark-Dollar Volatility: Intraday Activity Patterns, Macroeconomic Announcements, and Longer Run Dependencies. Journal of Finance, 53, 219-265. https://doi.org/10.1111/0022-1082.85732
Andersen, T.G., Bollerslev, T. and Huang, X. (2011) A Reduced Form Framework for Modeling Volatility of Speculative Prices Based on Realized Variation Measures. Journal of Econometrics, 160, 176-189. https://doi.org/10.1016/j.jeconom.2010.03.029
Hansen, P.R. and Lunde, A. (2005) A Realized Variance for the Whole Day Based on Intermittent High-Frequency Data. Journal of Financial Econometrics, 3, 525-554.
Tsiakas, I. (2008) Overnight Information and Stochastic Volatility: A Study of European and US Stock Exchanges. Journal of Banking & Finance, 32, 251-268.