Uncertainty and periodic bouts of chaos have been an unfortunate part of financial markets since their beginnings. Financiers and bankers readily admit that in an industry so large, so complex and so global, it is naive to think that large price movements can ever be avoided. Over the last 30 years, a number of episodes of large price movements suggest a commonality: poor management and supervision of financial risks. The turbulence in financial markets, including the stock market crash of 1987, the Mexican crisis in 1995, the Asian financial crisis in 1997, the dotcom bubble in 1999 and the global financial crisis in 2008, has further extended the interest in risk management. Value-at-Risk (VaR) was developed in response to such events and has become the most widely used market-risk management approach. The VaR is the maximum loss that an institution can expect at a given level of confidence over a target horizon (holding period). It is the most popular approach because it provides a single number, in real currency amounts, that represents the overall market risk faced by an institution.
The term “Value at Risk” was not popularly used prior to the 1990s; however, the origins of the measure lie further back in time. Although the efforts were directed towards devising optimal portfolios for equity investors, the underlying mathematics of VaR was largely developed in the context of portfolio theory by  and others. Specifically, the concept of market risk and the comovements in this risk are central to the computation of VaR. The drive for the use of VaR, though, came from the crises that plague financial markets over time and the regulatory responses to such crises. The regulatory measures that invoke VaR were initiated in 1980, when the Securities and Exchange Commission (SEC) tied the capital requirements of financial service firms to the losses that would be incurred, over a thirty-day interval with 95% confidence, in different security classes; to compute these potential losses, historical returns were used.1 The measures were then described as haircuts and not as VaR; however, it was clear that the SEC was requiring financial service firms to begin estimating one-month 95% VaR and to hold enough capital to cover the potential losses. At about the same time, the trading portfolios of commercial and investment banks were becoming bigger and more volatile, creating a need for a timely and sophisticated risk control measure. By the early 1990s, many financial service firms had developed primitive measures of VaR, with wide variations in how it was measured. In 1995, J.P. Morgan provided access to data on the variances of, and covariances across, various security and asset classes. J.P. Morgan had used this data internally for almost a decade to manage risk. The data allowed programmers to develop software to measure and manage risk. The service was titled RiskMetrics™ and the term VaR was used to describe the risk measure that emerged from this data.
The approach found a wide audience among investment and commercial banks and the regulatory authorities overseeing them. Over the last two decades or so, VaR has become the most established risk measure for estimating the exposure of financial service firms and has even found acceptance in non-financial service firms.
The key elements of VaR are a confidence level, a specified level of loss in value, and a fixed time period over which risk is assessed. VaR can be estimated for an entire firm, a portfolio of assets or even an individual asset. Although the VaR measure is simple to understand and interpret, its calculation is not. The existing methodologies for VaR estimation may be classified into three approaches: non-parametric, fully parametric and semi-parametric. The underlying assumption of all non-parametric approaches is that the recent past sufficiently reflects the near future, so that data from the recent past can be used to forecast risk in the near future. The non-parametric models are: 1) historical simulation and 2) non-parametric density estimation methods. Parametric approaches are based on the statistical parameters of the risk factor distribution. The first parametric model used to estimate VaR was RiskMetrics™ from Longerstaey and Spencer. The major limitation of this approach is the assumption that financial returns follow a normal distribution, whereas the empirical distribution of financial returns has fat tails. Hence, there have been attempts to find more sophisticated volatility models that capture the characteristics observed in the volatility of financial returns. The three families of volatility models are: 1) the GARCH family, 2) stochastic volatility models and 3) realized volatility models. The semi-parametric approach combines the parametric approach with the non-parametric approach. The most important models in this category are Filtered Historical Simulation (FHS) and the approach based on Extreme Value Theory.
There is always a tension between the conservativeness and the riskiness of a risk measure. If the risk measure is too conservative, too much capital is set aside that could otherwise be used more profitably; if the risk measure is too lax, it results in a large number of violations, which can lead the financial institution to bankruptcy. Hence, the purpose of this paper is to identify a reliable and accurate risk measurement approach for major Asian economies. The distinctive features of Asian economies with respect to risk measurement are illuminated in this study. Specifically, the major economies of Asia (henceforth referred to as major Asian economies), namely Singapore, Malaysia, Hong Kong of China, Indonesia, South Korea, Philippines, Thailand, China, Taiwan of China and India, are taken into consideration. We focus on the computation of VaR for both long and short trading positions at three confidence levels (95%, 97.5% and 99%) in major Asian economies. In the first step, VaR is estimated using the historical simulation approach, the RiskMetrics™ approach, the GARCH family of models and the Extreme Value Theory (EVT) approach.
The historical simulation and RiskMetrics™ approaches did not perform well in the preliminary analysis and are therefore not included in the empirical analysis section. Our focus in this paper is on the GARCH family of models (GARCH, APARCH, GJR-GARCH, FIGARCH and FIAPARCH) and on the Generalized Pareto distribution (GPD) approach from Extreme Value Theory (EVT). All the above-mentioned models are used to forecast VaR. The performance of the models is then evaluated using a two-stage procedure. First, backtesting is implemented using unconditional and conditional coverage tests to evaluate the statistical accuracy of the candidate models. In the second stage, only those models that pass the first stage are compared via a loss function, in an attempt to select one model among the various candidates.
Considering that most financial return series are asymmetric, the EVT approach is expected to have an advantage over models that assume symmetric distributions. However, our results do not confirm any such advantage for the countries considered. The results are mixed, with the FIGARCH model achieving the highest success rate. Also, the appropriateness of the models changes across quantiles and between tails.
This paper is organized as follows: Section 2 presents a brief review of the methodology and backtesting framework. Section 3 talks about the data and the empirical results and Section 4 concludes the paper.
Let $r_t$ be the return of any security at time t and $P_t$ be the price of that security at time t. Then the $\mathrm{VaR}_t$ at the (1 − α) percentile is defined by
$$\Pr\left(r_t \le \mathrm{VaR}_t(\alpha)\right) = \alpha,$$
that is, the returns of that security at time t will be less than or equal to $\mathrm{VaR}_t(\alpha)$ α% of the time.2 VaR is defined as the maximum loss that can be incurred on the portfolio/security over a specific period with a given level of confidence.
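As a minimal illustration of this quantile definition (not of the models evaluated later), the sketch below reads VaR directly off the empirical distribution of past returns, in the spirit of historical simulation. The function names and the interpolation rule are assumptions made for the example.

```python
import math

def empirical_quantile(sorted_values, prob):
    """Linearly interpolated quantile of pre-sorted data, 0 <= prob <= 1."""
    pos = prob * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def historical_var(returns, alpha):
    """Return (long_var, short_var) for tail probability alpha.

    long_var is the alpha-quantile of returns (left tail, long position);
    short_var is the (1 - alpha)-quantile (right tail, short position).
    """
    ordered = sorted(returns)
    return empirical_quantile(ordered, alpha), empirical_quantile(ordered, 1.0 - alpha)

# Toy, evenly spaced "returns" from -0.50 to 0.50 (illustrative data only).
sample_returns = [x / 100.0 for x in range(-50, 51)]
long_var, short_var = historical_var(sample_returns, 0.05)
print(long_var, short_var)
```

A long position is hurt by returns below `long_var`, a short position by returns above `short_var`, which is why both tails are reported throughout the paper.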
Beltratti and Morana first compared the FIGARCH, GARCH and IGARCH models in the estimation of VaR and showed that the GARCH model provided adequate VaR estimates. Along the same lines, So and Philip found that the FIGARCH model did not outperform the GARCH model in a comparison that also included the EWMA model. Ñíguez compared the ability of different GARCH family models (GARCH, APARCH, AGARCH, FIAPARCH and FIGARCH) and EWMA to forecast VaR. This paper likewise evaluates and compares the ability of different GARCH family models (GARCH, APARCH, GJR-GARCH, FIGARCH and FIAPARCH) and the GPD approach to forecast VaR.
2.1. GARCH Modelling
There is an enormous financial econometrics literature on modelling returns in a way that captures time-varying volatility. The two most important techniques in this regard are stochastic volatility models and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) models. GARCH modelling has gained popularity and acceptance in the financial time series literature because it captures important features of financial time series, such as volatility clustering, and acknowledges that financial return volatilities are time-dependent. Moreover, most VaR estimation approaches are GARCH based and can therefore be described using a GARCH framework. In this paper, we use the familiar variations of the GARCH framework, and here we present only the basic structure of GARCH modelling. These models are designed to model the heteroscedasticity in the time series of returns:
$$r_t = c(\Omega_{t-1}; \theta) + \varepsilon_t, \qquad \varepsilon_t = z_t \sqrt{h(\Omega_{t-1}; \theta)},$$
where c and h are functions of the information set $\Omega_{t-1}$ and the vector of parameters $\theta$; $z_t$, independent of $\Omega_{t-1}$, is an iid process with $E(z_t) = 0$ and $\mathrm{Var}(z_t) = 1$. The volatility model above encompasses a family of methodologies used to predict VaR; we use several of these, some of which are symmetric while others account for asymmetry and long memory.
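To make the framework concrete, the following sketch takes the simplest member of the family, a GARCH(1,1) with constant mean, and turns its one-step-ahead variance into a VaR forecast. It is only a sketch under stated assumptions: the parameter values are hypothetical rather than estimated, and the innovation is taken to be standard normal for simplicity, whereas the models in this paper use skewed Student-t innovations.

```python
import math
from statistics import NormalDist

def garch11_variance_path(returns, omega, alpha1, beta1, mean=0.0):
    """Conditional variances from h_t = omega + alpha1 * eps_{t-1}^2 + beta1 * h_{t-1}.

    The last element is the one-step-ahead variance forecast.
    """
    h = omega / (1.0 - alpha1 - beta1)  # start at the unconditional variance
    path = [h]
    for r in returns:
        eps = r - mean
        h = omega + alpha1 * eps * eps + beta1 * h
        path.append(h)
    return path

def one_step_var(returns, omega, alpha1, beta1, tail_prob, mean=0.0):
    """Return (long_var, short_var) for the next period under normal innovations."""
    sigma = math.sqrt(garch11_variance_path(returns, omega, alpha1, beta1, mean)[-1])
    z = NormalDist().inv_cdf(tail_prob)  # negative for small tail probabilities
    return mean + z * sigma, mean - z * sigma

# Hypothetical parameters and a short toy return series (in percent).
toy_returns = [0.3, -1.2, 0.8, -2.5, 1.1]
print(one_step_var(toy_returns, omega=0.1, alpha1=0.1, beta1=0.8, tail_prob=0.05))
```

Replacing the normal quantile with the quantile of another standardized distribution, or the variance recursion with an asymmetric or long-memory one, gives the other members of the family in the same way.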
2.2. Extreme Value Theory
EVT may be particularly effective since VaR estimates depend only on the tails of a probability distribution, and this focus on the tails has made it an appealing tool for calculating VaR. There are two ways to model extremes: modelling the values that exceed some threshold, known as the “Peaks-Over-Threshold (POT)” model, and modelling the maximum of a collection of random variables. In this paper, we use the more modern of the two, the POT model. There are two types of analysis in POT models: the semi-parametric approach, built around the Hill estimator, and the fully parametric approach, based on the Generalized Pareto distribution (GPD). The GPD approach is used in this paper.
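For orientation, the standard POT tail-quantile estimator can be sketched as follows. Given a threshold u, fitted GPD shape ξ and scale β, n observations and N_u exceedances, the quantile at probability q is u + (β/ξ)[((n/N_u)(1 − q))^(−ξ) − 1]. The function name and the parameter values below are hypothetical, the fitting step is omitted, and the input is assumed to be losses (negated returns) when the left tail is of interest.

```python
import math

def pot_var(threshold, xi, beta, n_obs, n_exceed, q):
    """Tail quantile at probability q from a fitted GPD over a threshold.

    xi and beta are the GPD shape and scale; n_obs is the sample size and
    n_exceed the number of observations above the threshold.
    """
    ratio = (n_obs / n_exceed) * (1.0 - q)
    if abs(xi) < 1e-12:
        # Limiting case xi -> 0 (exponential tail).
        return threshold - beta * math.log(ratio)
    return threshold + (beta / xi) * (ratio ** (-xi) - 1.0)

# Hypothetical fit: threshold 2.0, shape 0.2, scale 0.5, 100 exceedances in 1000 losses.
print(pot_var(threshold=2.0, xi=0.2, beta=0.5, n_obs=1000, n_exceed=100, q=0.99))
```

When q equals 1 − N_u/n the estimator returns the threshold itself, which is a quick sanity check on any implementation.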
“VaR is only as good as its backtest. When someone shows me a VaR number, I don’t ask how it is computed, I ask to see the backtest.” (Brown, 2008, p.20)3
VaR models are useful only if they accurately predict future risks. Therefore, to check the quality of the estimates, the models should always be backtested with appropriate methods. Backtesting is a statistical procedure in which VaR estimates are systematically compared to the corresponding actual profits and losses. For example, if daily VaR is calculated at a confidence level of 95%, we expect a violation to occur on average five times in every 100 days. The backtesting process is therefore used to examine statistically whether the number of violations over a specified time period is in line with the selected level of confidence. Tests of this type are known as unconditional coverage tests. They do not consider when the violations occur and are therefore very straightforward to implement.
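An unconditional coverage test of this kind can be sketched as a likelihood-ratio test in the style of Kupiec: the binomial likelihood of the observed violations under the nominal rate α is compared with the likelihood under the observed rate, and the statistic is referred to a chi-square distribution with one degree of freedom. The function name is an assumption for illustration.

```python
import math

def kupiec_pvalue(n_obs, n_violations, alpha):
    """p-value of a likelihood-ratio test of unconditional coverage."""
    def log_lik(p):
        # Binomial log-likelihood, with 0 * log(0) treated as 0.
        ll = 0.0
        if n_violations > 0:
            ll += n_violations * math.log(p)
        if n_obs - n_violations > 0:
            ll += (n_obs - n_violations) * math.log(1.0 - p)
        return ll

    observed_rate = n_violations / n_obs
    lr = max(0.0, -2.0 * (log_lik(alpha) - log_lik(observed_rate)))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(lr / 2.0))

# 50 violations in 1000 forecasts matches a 5% nominal rate exactly.
print(kupiec_pvalue(1000, 50, 0.05))
# Twice as many violations as expected.
print(kupiec_pvalue(1000, 100, 0.05))
```

The test is two-sided: too few violations (an overly conservative model) lead to rejection just as too many do. For instance, roughly 86 violations in 3373 forecasts, about 2.55% at a 5% nominal rate, gives a p-value far below 0.05.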
A good VaR model not only produces the expected number of violations but also produces violations that are evenly spread over time, i.e. independent of each other. Clustering of violations implies that the underlying model does not appropriately capture changes in market correlations and volatility. Conditional coverage tests therefore also examine time variation, or conditioning, in the data.
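A conditional coverage test in the style of Christoffersen can be sketched by treating the violation indicator as a first-order Markov chain: the likelihood under independent violations at rate α is compared with the likelihood under freely estimated transition probabilities, and the statistic is referred to a chi-square distribution with two degrees of freedom. The function names are assumptions, and the likelihood is computed on transitions only (the first observation is conditioned on).

```python
import math

def _xlogy(count, prob):
    """count * log(prob), with the convention 0 * log(0) = 0."""
    return count * math.log(prob) if count > 0 else 0.0

def christoffersen_cc_pvalue(hits, alpha):
    """p-value of a conditional coverage test on 0/1 violation indicators."""
    n = [[0, 0], [0, 0]]  # n[i][j]: number of transitions from state i to state j
    for prev, curr in zip(hits[:-1], hits[1:]):
        n[prev][curr] += 1
    from_zero = n[0][0] + n[0][1]
    from_one = n[1][0] + n[1][1]
    pi01 = n[0][1] / from_zero if from_zero else 0.0
    pi11 = n[1][1] / from_one if from_one else 0.0
    # Null: violations are independent with probability alpha.
    ll_null = _xlogy(n[0][0] + n[1][0], 1.0 - alpha) + _xlogy(n[0][1] + n[1][1], alpha)
    # Alternative: first-order Markov chain with estimated transition probabilities.
    ll_markov = (_xlogy(n[0][0], 1.0 - pi01) + _xlogy(n[0][1], pi01)
                 + _xlogy(n[1][0], 1.0 - pi11) + _xlogy(n[1][1], pi11))
    lr_cc = max(0.0, -2.0 * (ll_null - ll_markov))
    return math.exp(-lr_cc / 2.0)  # chi-square(2) survival function

# Same number of violations (10 in 100), spread out versus clustered.
spread = ([0] * 9 + [1]) * 10
clustered = [0] * 90 + [1] * 10
print(christoffersen_cc_pvalue(spread, 0.10))
print(christoffersen_cc_pvalue(clustered, 0.10))
```

Both toy sequences have the correct violation frequency, so an unconditional test cannot tell them apart; only the clustered sequence is rejected here.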
To backtest our models, in the first stage we implement two backtesting approaches: the unconditional coverage test of Kupiec and the conditional coverage test of Christoffersen. In the second stage, we define a loss function for the models that pass the first stage. Lopez formalized the use of loss functions as a means of evaluating VaR estimates.
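The exact loss function is not spelled out at this point, so the sketch below assumes one commonly cited magnitude-based form associated with Lopez: each violation contributes 1 plus the squared distance between the realized return and the VaR forecast, and non-violations contribute nothing. The function name and the `position` argument are illustrative.

```python
def lopez_quadratic_loss(returns, var_forecasts, position="long"):
    """Total loss: 1 + (r - VaR)^2 for each violation, 0 otherwise (assumed form)."""
    total = 0.0
    for r, v in zip(returns, var_forecasts):
        violated = r < v if position == "long" else r > v
        if violated:
            total += 1.0 + (r - v) ** 2
    return total

# Toy data: one violation of size 1.0 for the long position.
print(lopez_quadratic_loss([-3.0, 0.5, -1.0], [-2.0, -2.0, -2.0], position="long"))
```

Among models that pass the coverage tests, the one with the smallest total loss is preferred, so large exceedances are penalized more heavily than marginal ones.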
3. Empirical Analysis
3.1. Data Description
In order to estimate the Value at Risk in major Asian economies, we collected daily stock market data for Singapore, Malaysia, Hong Kong of China (hereafter Hong Kong), Indonesia, South Korea, Philippines, Thailand, China, Taiwan of China (hereafter Taiwan) and India. The data set for our empirical analysis comes from the Bloomberg database, which contains historical and real-time economic and financial market data covering all sectors worldwide. The dataset (closing prices) for all the countries covers the period September 1, 1999 to January 31, 2017, for a total of 4374 observations. Daily prices are converted into daily returns using the logarithmic difference of the closing prices of two consecutive days. Figure 1(a) and Figure 1(b) plot the return series of all ten economies under study. The plots suggest that the return series are stationary; however, a deeper understanding of the properties of the time series under investigation requires a study of their descriptive statistics, which are presented in Table 1. The daily returns are defined as
$$r_t = \ln(P_t) - \ln(P_{t-1}),$$
where $P_t$ is the daily closing price of the stock market index on day t.
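The conversion from closing prices to log returns can be sketched in a few lines; the function name and the toy prices are illustrative.

```python
import math

def log_returns(prices):
    """Log difference of consecutive closing prices: ln(P_t) - ln(P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices[:-1], prices[1:])]

toy_prices = [100.0, 101.0, 99.0]
print(log_returns(toy_prices))
```

A series of N prices yields N − 1 returns, which is why 4374 daily closing prices give 4373 return observations.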
The highest averages of the daily returns are in Indonesia (0.051%), Taiwan of China (0.051%) and Philippines (0.028%). The lowest average of the daily log returns is in Singapore (0.008%). China stands out with the highest standard deviation, 1.6%. The sample skewness shows that the daily returns of all the countries are asymmetrically distributed. In all the countries, the returns have negative skewness, which indicates that the asymmetric tail extends more towards negative values than positive ones. According to the sample kurtosis estimates, the daily returns are far from normally distributed. The lowest kurtosis estimates are 6.5 (Taiwan of China) and 7.98 (China), while the highest are 18.49 (Philippines) and 12.98 (Malaysia). Based on the sample kurtosis estimates, it may be argued that the return distributions in all the markets are fat-tailed; this is supported by the Jarque-Bera test statistic, for which the null hypothesis of normality is rejected at the 1% level of significance. Given the ADF test statistic, the null hypothesis that a unit root is present in the return series is rejected for all the countries. The Ljung-Box test statistic for serial correlation shows that the null hypothesis of no autocorrelation up to the 20th order is rejected at the 1% level of significance for all the countries. Also, the Lagrange multiplier test statistic confirms the presence of heteroscedasticity for up to 10 lags for all the countries.
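The skewness, kurtosis and Jarque-Bera figures discussed above can be computed as in the sketch below. It assumes population (biased) moment estimators and raw kurtosis, for which the normal benchmark is 3; the statistical package used for Table 1 may apply small-sample corrections, so values need not match exactly.

```python
import math

def jarque_bera(returns):
    """Return (skewness, kurtosis, JB statistic, p-value) for a sample."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # raw kurtosis; equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) survival function
    return skew, kurt, jb, p_value

# Small symmetric toy sample (illustrative only).
print(jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0]))
```

With several thousand observations, kurtosis values of 6.5 and above produce very large JB statistics, which is consistent with the rejection of normality reported here.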
Figure 1. (a) Return plots of Singapore, Malaysia, Hong Kong of China, Indonesia and South Korea. The figure plots the daily returns, $r_t$, for the first 5 stock market indices under study. The sample covers the period from September 1, 1999 through January 31, 2017 for a total of 4374 observations for each country under study; (b) Return plots of Philippines, Thailand, China, Taiwan of China and India. The figure plots the daily returns, $r_t$, for the other 5 stock market indices under study. The sample covers the period from September 1, 1999 through January 31, 2017 for a total of 4374 observations for each country under study.
Table 1. Descriptive Statistics of the daily returns from ten stock markets.
Statistics marked with an asterisk (⁎) are significantly different from zero at the 1% significance level. Mean = Sample mean (%); Median = Sample median (%); Std. = Standard deviation (%); Sk = Skewness; Ku = Kurtosis; JB = Jarque-Bera test statistic for normality; ADF = Augmented Dickey-Fuller test statistic for stationarity; LB = Ljung-Box test statistic for autocorrelation up to 20 lags; LM = Lagrange multiplier test statistic for heteroskedasticity up to 10 lags.
3.2. Modelling Stock Price Volatility
The Akaike information criterion (AIC) is used to choose the best ARMA-GARCH specification for each stock market index. The selected specifications are reported in Table 2. The best model for each country is fitted to the data series to obtain parameter estimates. The full-sample parameter estimates are not reported in this paper; however, the best model specification for each country is used in the subsequent analysis.
3.3. Backtesting Procedure
For all the models, we use a rolling sample window of 1000 observations, in order to forecast the VaR.95, VaR.975 and VaR.99 for all the stock price series (both left and right). The main advantage of this rolling window technique is that it allows us to capture dynamic time-varying characteristics of the data in different time periods. We work with 4373 observations of stock market returns for all the countries and generate 3373 out-of-sample VaR forecasts.
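The rolling-window scheme can be sketched generically as below. The function name is illustrative and `forecast_fn` is a hypothetical stand-in for any of the VaR models, re-estimated on each window; the placeholder series only reproduces the sample length.

```python
def rolling_forecasts(returns, window, forecast_fn):
    """One-step-ahead forecasts, each based on the preceding `window` observations."""
    return [forecast_fn(returns[end - window:end])
            for end in range(window, len(returns))]

# Placeholder series with the same length as the return sample in the paper.
returns = [0.0] * 4373
forecasts = rolling_forecasts(returns, 1000, lambda window: min(window))
print(len(forecasts))  # 3373
```

Each forecast targets the observation immediately after its window, so the first 1000 returns are used only for estimation and the remaining 3373 are forecast out of sample.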
For all the countries, each model is evaluated by comparing the expected and actual violation percentages. Table 3 documents the out-of-sample violation percentages (both left and right tails) at the various confidence levels for all the countries. For a long position, a violation occurs when the realized return is less than the predicted VaR; for a short position, a violation occurs when the realized return is greater than the predicted VaR. The violation percentage is defined as the total number of violations divided by the total number of one-period forecasts, multiplied by 100. If the violation percentage at any quantile is greater than the level of significance (α percent), the model underestimates the risk. If the violation percentage is less than α percent, the model overestimates the risk. For example, the violation percentage for Singapore is 2.55% when the VaR at 5% is estimated from the GARCH model. The estimate provided by the GARCH model is too conservative in this case and would therefore require too much idle capital to be held. The violation percentages under the other scenarios for all the countries under study are likewise provided in the table. To arrive at an appropriate VaR estimate, one that is neither too conservative nor too risky, we follow the backtesting procedure.
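The violation percentage defined above can be computed as in this short sketch; the function name and the toy inputs are illustrative.

```python
def violation_percentage(returns, var_forecasts, position="long"):
    """Share of forecasts violated, in percent.

    Long position: violation when the return is below the VaR forecast.
    Short position: violation when the return is above the VaR forecast.
    """
    if position == "long":
        hits = [r < v for r, v in zip(returns, var_forecasts)]
    else:
        hits = [r > v for r, v in zip(returns, var_forecasts)]
    return 100.0 * sum(hits) / len(hits)

toy_returns = [-3.0, 0.0, 1.0, -1.0]
print(violation_percentage(toy_returns, [-2.0] * 4, position="long"))
print(violation_percentage(toy_returns, [0.5] * 4, position="short"))
```

The result is compared with the nominal tail size: a value well below α percent points to an overly conservative model, a value well above it to underestimated risk.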
To assess the statistical accuracy of the various risk management models, we use the two backtesting approaches (unconditional and conditional coverage tests) explained above. Table 4 and Table 5 report the p-values of the corresponding backtesting tests. A p-value less than 0.05 when α is 5% is interpreted as evidence for rejecting the null hypothesis. Similarly, a p-value less than 0.025 or 0.01 when α is 2.5% or 1%, respectively, is interpreted as evidence for rejecting the null hypothesis. The results can be summarized as follows:
Table 2. ARMA-GARCH estimation results for stock market returns.
This table reports the best ARMA-GARCH order for all the countries. The best model for each country was chosen based on the AIC.
Table 3. VaR violation percentages of daily returns.
This table reports the out-of-sample VaR violations for all competing models. SIN = SINGAPORE, MAL = MALAYSIA, HK = HONG KONG, INDO = INDONESIA, SK = SOUTH KOREA, PH = PHILIPPINES, TH = THAILAND, CH = CHINA, TW = TAIWAN, IN = INDIA. The models are, in order, a GARCH model, an APARCH model, a GJR-GARCH model, a FIGARCH model, a FIAPARCH model and a GPD model. All GARCH-family models have skewed Student-t innovations. A violation occurs if the realized return on a particular day exceeds the predicted VaR. The expected value of the VaR violation percentage is the corresponding tail size. For example, the expected VaR violation percentage at the 2.5% tail is 2.5%. A calculated value less than the expected value indicates overestimation of the risk, while a value greater than the expected value indicates underestimation. Note that the sign “-” implies that the model did not converge for the respective country.
Table 4. Unconditional coverage test.
This table reports the p-values of the unconditional coverage test. SIN = SINGAPORE, MAL = MALAYSIA, HK = HONG KONG, INDO = INDONESIA, SK = SOUTH KOREA, PH = PHILIPPINES, TH = THAILAND, CH = CHINA, TW = TAIWAN, IN = INDIA. The models are, in order, a GARCH model, an APARCH model, a GJR-GARCH model, a FIGARCH model, a FIAPARCH model and a GPD model. All GARCH-family models have skewed Student-t innovations. Note that a p-value greater than 5% in the case of VaR.95, 2.5% in the case of VaR.975, and 1% in the case of VaR.99 indicates that the forecasting ability of the corresponding VaR model is adequate. Also, the sign “-” implies that the model did not converge for the respective country.
Table 5. Conditional coverage test.
This table reports the p-values of the conditional coverage test. SIN = SINGAPORE, MAL = MALAYSIA, HK = HONG KONG, INDO = INDONESIA, SK = SOUTH KOREA, PH = PHILIPPINES, TH = THAILAND, CH = CHINA, TW = TAIWAN, IN = INDIA. The models are, in order, a GARCH model, an APARCH model, a GJR-GARCH model, a FIGARCH model, a FIAPARCH model and a GPD model. All GARCH-family models have skewed Student-t innovations. Note that a p-value greater than 5% in the case of VaR.95, 2.5% in the case of VaR.975, and 1% in the case of VaR.99 indicates that the forecasting ability of the corresponding VaR model is adequate. Also, the sign “-” implies that the model did not converge for the respective country.
1) Looking at the violation percentages, the GARCH model mostly overestimates the risk exposure; however, it performs moderately well according to the two backtesting measures. A model is rejected when the violation percentages are statistically different from the theoretical values (α). The GARCH model is rejected by the two backtesting measures except in the case of Hong Kong of China, China and India (95% VaR, 97.5% VaR; left tail); Singapore, Hong Kong of China, South Korea, Philippines and Thailand (99% VaR; left tail); and Hong Kong of China (97.5% VaR; right tail). The model performs very poorly in the right tail compared with the left tail. For Hong Kong of China, the model performs consistently well. Its total success rate, i.e. across the 10 countries and the 3 theoretical α for both tails, is 12 out of 60, or 20%.
2) The APARCH model, with a few exceptions, fails to meet the criteria of both “unconditional coverage” and “conditional coverage”. This model also overestimates the risk exposure. Its total success rate is 8 out of 60, or 13.3%.
3) The GJR-GARCH model is mostly rejected at the 95% and 97.5% confidence levels, but its performance improves slightly at the higher quantile. Like the earlier models, it also overestimates the true risk. Its total success rate is 14 out of 60, or 23.3%.
4) The FIGARCH model performs comparatively better than all the other models considered. Our analysis shows that this approach is well suited for VaR estimation. However, it is rejected in all the cases except Indonesia at the 99% level of confidence for the right tail. The violation percentages, in this case, are very close to the theoretical ones. Its total success rate is 19 out of 60, or 31.6%.
5) The FIAPARCH model produces better VaR measures for the left tail than for the right tail at the different confidence levels. The model is comparatively successful for the left tail but fails to produce adequate VaR measures for the right tail. Its total success rate is 12 out of 60, or 20%.
6) The GPD model satisfies the criterion of “unconditional coverage” at almost all the confidence levels for all the countries, but it fails to meet the criterion of “conditional coverage”. Its total success rate is 11 out of 60, or 18.3%.
7) The success rates of all the models for both “unconditional coverage” and “conditional coverage” are given in Table 6.
Finally, in the second stage of the best model selection process, a loss function is calculated in order to choose the model that minimizes the total loss. The total loss was calculated only for those models which are found to be acceptable in the first stage of model selection process. Table 7 presents the results of the loss function approach applied to those models. For all confidence levels, the results are mixed as per the loss function approach.
Table 6. Success rate (in %).
This table reports the success rate of all the models. UC = unconditional coverage, CC = conditional coverage.
Table 7. Loss function approach applied only to those models which pass the two backtest measures.
This table compares the best performing models according to the loss function. SIN = SINGAPORE, MAL = MALAYSIA, HK = HONG KONG, INDO = INDONESIA, SK = SOUTH KOREA, PH = PHILIPPINES, TH = THAILAND, CH = CHINA, TW = TAIWAN, IN = INDIA. The models are, in order, a GARCH model, an APARCH model, a GJR-GARCH model, a FIGARCH model, a FIAPARCH model and a GPD model. All GARCH-family models have skewed Student-t innovations. The models with the lowest loss values are boldfaced.
In this article, we compare various methodologies developed to predict VaR. VaR has become one of the most popular methods for measuring and managing market risk. All the VaR models use historical stock market data to forecast future stock market performance. More importantly, the models rely on assumptions and approximations that do not necessarily hold in every situation. Since the methods are far from perfect, there is good reason to question the accuracy of estimated VaR levels, which motivates this article. The common backtesting approaches used to measure the accuracy of VaR models have been applied. The backtesting produced mixed results, giving some indication of potential problems within the approaches.
As Dowd  argues, the state of the art in backtesting VaR models is improving all the time, and the current backtests should already be relatively powerful in identifying bad VaR models.
This study tested only equities. However, VaR is a relevant measure in other markets of an economy, such as the bond, commodity and derivatives markets. Therefore, future research should consider testing other types of instruments as well, such as government bonds, interest rate and commodity derivatives, options and perhaps credit bonds, if possible. Whatever the framework for future backtesting will be, the lesson of this article is to understand the limitations of VaR calculation. Other limitations of this research include the lack of an analysis of the economic significance of the models and of risk spillover among the Asian markets or between Asian and other markets. Further research can be conducted to explore these issues. As the research shows, estimated VaR figures should never be taken to be 100 percent accurate, no matter how sophisticated the approaches are. However, if the stakeholders know the shortcomings of VaR, the approach can be very useful in risk management, especially because no serious contenders are available as alternatives to VaR.
1For more details on the commission, please visit: https://www.sec.gov/about/annual_report/1980.pdf.
2The values of α considered in this article are 5%, 2.5% and 1%.
3Brown, A. (2008), Private Profits and Socialized Risk―Counterpoint: Capital Inadequacy, Global Association of Risk Professionals, June/July 08 issue.