A class of risk measures, commonly referred to as “tail-related risk measures” in the economic literature, is based on fixing a risk tolerance level ex ante. Value-at-Risk (VaR) is a common example of this class. Risk tolerance is the level of risk that an investor is willing to take, but gauging risk appetite accurately can be a tricky task. In practice, the risk tolerance level is generally decided by the judgement or perception of a risk manager, a risk management committee or, in certain cases, an external regulatory body. For this purpose, it has been common practice to follow the recommendations of the Basel Committee on Banking Supervision. At present, the Basel guidelines specify 99% and 99.9% confidence levels for VaR and a 97.5% confidence level for Expected Shortfall (ES). Most probably, those recommendations are drawn from country-wise experience of analysing large sets of historical data. Alternatively, in certain cases, the risk modeller adopts commonly used percentages, viz. 99%, 95% and 90%. Majumder, however, documented evidence from various developed and emerging equity markets of incidents where a minor change in the risk tolerance level translated into a large difference in VaR. Such instances are not uncommon in financial markets. Similar observations were documented by Degennaro, who constructed examples to establish that non-cooperative choices of the risk tolerance level by two investors resulted in substantial variation in their VaR estimates. Therefore, on many occasions, the risk modeller’s preferences on the risk tolerance level could have a large impact on the tail measure. When those preferences are biased, whether through over-concern with a highly volatile or stressed period or for any other reason, the bias is transmitted to the tail measure.
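The sensitivity of VaR to the chosen level can be illustrated with a small sketch. The two loss models below (a standard normal and a generalised Pareto distribution with shape 0.5 and scale 1) are hypothetical choices made only for illustration; they are not taken from the studies cited above.

```python
from statistics import NormalDist


def gpd_quantile(q, xi, sigma):
    """Quantile of order q of a two-parameter GPD (shape xi != 0, scale sigma)."""
    return sigma / xi * ((1.0 - q) ** (-xi) - 1.0)


# Hypothetical loss models: a standard normal and a heavy-tailed GPD.
normal = NormalDist(mu=0.0, sigma=1.0)
XI, SIGMA = 0.5, 1.0  # illustrative values, not estimates from any data set

levels = (0.99, 0.999)
var_normal = {q: normal.inv_cdf(q) for q in levels}
var_gpd = {q: gpd_quantile(q, XI, SIGMA) for q in levels}

for q in levels:
    print(f"level {q}: normal VaR = {var_normal[q]:.3f}, GPD VaR = {var_gpd[q]:.3f}")

# Ratio of VaR at 99.9% to VaR at 99% under each model.
ratio_normal = var_normal[0.999] / var_normal[0.99]
ratio_gpd = var_gpd[0.999] / var_gpd[0.99]
print(f"ratio (normal) = {ratio_normal:.2f}, ratio (GPD) = {ratio_gpd:.2f}")
```

Under these assumptions, moving from 99% to 99.9% changes VaR far more for the heavy-tailed model than for the normal one, which is the kind of sensitivity the paragraph above describes.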
In this approach, a risk tolerance level that was decided ex ante during turbulence may be appropriate for the turbulent period but suboptimal for quiet periods. Logically, it is extremely difficult to find a risk tolerance level that suits all scenarios uniformly, and this is perhaps one source of model risk in the conventional approach.
As an alternative, the present paper proposes that the risk tolerance level should not be pre-assigned but may be determined by the model itself. In this framework, the parameter may vary with the shape of the loss distribution. One way to determine it is to use the Pickands-Balkema-de Haan theorem, which essentially says that, for a wide class of distributions, losses exceeding a sufficiently high threshold follow the generalised Pareto distribution (GPD). It follows from this theorem that the extreme right tail of a distribution converges asymptotically to the tail of a GPD. In other words, we can always find a region in the extreme right tail of the loss distribution for which an equivalent region of a suitable GPD is available. Therefore, there exists a threshold above which the data show generalised Pareto behaviour. The threshold needs to be large enough that the events above it are all “extreme” in nature; events belonging to the rest of the distribution are “normal” or “non-extreme”. The procedure gives us the opportunity to estimate the tail size and the starting point of the tail simultaneously. In other words, it allows simultaneous estimation of VaR and the risk tolerance level. The rest of the paper is organised as follows: Section 2 describes the model, Section 3 provides empirical findings and Section 4 concludes.
2. The Model
2.1. Behaviour of Losses Exceeding a High Threshold
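Suppose $x_1, x_2, \ldots, x_n$ are $n$ independent realisations of a random variable $X$ representing the loss, with distribution function $F_X(x) = P(X \le x)$ and a finite or infinite right endpoint $x_0$. We are interested in the behaviour of this distribution above a high threshold $u$. Along the lines of Hogg and Klugman, the distribution function of the truncated loss $X \mid X > u$ (truncated at the point $u$) can be defined as:

$$F_{X \mid X>u}(x) = P(X \le x \mid X > u) = \begin{cases} 0, & x \le u, \\[4pt] \dfrac{F_X(x) - F_X(u)}{1 - F_X(u)}, & x > u. \end{cases}$$

Based on $F_{X \mid X>u}$, we can define the distribution function of the excess over a high threshold $u$:

$$F_u(x) = P(X - u \le x \mid X > u) = \frac{F_X(x+u) - F_X(u)}{1 - F_X(u)}, \qquad 0 \le x < x_0 - u. \tag{1}$$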
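As a minimal sketch of the excess distribution, the function below evaluates $P(X - u \le x \mid X > u) = [F_X(x+u) - F_X(u)] / [1 - F_X(u)]$ for any distribution function passed to it. The exponential loss is an illustrative assumption; it is convenient because it is memoryless, so its excess distribution coincides with the original distribution for every threshold.

```python
import math


def excess_cdf(cdf, u, x):
    """P(X - u <= x | X > u) for a loss with distribution function cdf."""
    if x < 0:
        return 0.0
    tail_mass = 1.0 - cdf(u)
    if tail_mass <= 0.0:
        raise ValueError("threshold u must lie below the right endpoint")
    return (cdf(x + u) - cdf(u)) / tail_mass


def exponential_cdf(x, rate=1.0):
    """Distribution function of an exponential loss (illustrative choice)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


# The exponential distribution is memoryless, so the two printed columns agree.
u = 2.0
for x in (0.5, 1.0, 3.0):
    print(x, excess_cdf(exponential_cdf, u, x), exponential_cdf(x))
```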
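Balkema and de Haan and Pickands showed that, for a large class of distributions, the generalised Pareto distribution (GPD) is the limiting distribution of the excess as the threshold $u$ tends to the right endpoint. According to this theorem, we can find a positive measurable function $\sigma(u)$ such that

$$\lim_{u \to x_0} \; \sup_{0 \le x < x_0 - u} \left| F_u(x) - G_{\xi,\sigma(u)}(x) \right| = 0, \tag{2}$$

where the distribution function of a two-parameter GPD with shape parameter $\xi$ and scale parameter $\sigma$ has the following representation:

$$G_{\xi,\sigma}(x) = \begin{cases} 1 - \left(1 + \dfrac{\xi x}{\sigma}\right)^{-1/\xi}, & \xi \ne 0, \\[6pt] 1 - \exp\left(-\dfrac{x}{\sigma}\right), & \xi = 0, \end{cases}$$

where $\sigma > 0$, $x \ge 0$ when $\xi \ge 0$ and $0 \le x \le -\sigma/\xi$ when $\xi < 0$. Equation (2) holds if and only if $F_X$ belongs to the maximum domain of attraction of the generalised extreme value (GEV) distribution $H$. An equivalent representation of (2) can be given in terms of the three-parameter GPD: for $x - u \ge 0$, the limiting distribution function of the excess can be expressed as the distribution function of the three-parameter GPD, $G_{\xi,u,\sigma}(x)$, with shape parameter $\xi$, location parameter $u$ and scale parameter $\sigma$, which has the following representation:

$$G_{\xi,u,\sigma}(x) = \begin{cases} 1 - \left(1 + \dfrac{\xi (x-u)}{\sigma}\right)^{-1/\xi}, & \xi \ne 0, \\[6pt] 1 - \exp\left(-\dfrac{x-u}{\sigma}\right), & \xi = 0, \end{cases}$$

where $\sigma > 0$, $x \ge u$ when $\xi \ge 0$ and $u \le x \le u - \sigma/\xi$ when $\xi < 0$.

This representation provides a theoretical basis for the claim that there exists a threshold above which the data show generalised Pareto behaviour.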
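The convergence stated by the theorem can be checked numerically in a simple case. The sketch below assumes a standard normal loss, which lies in the domain of attraction with shape parameter zero, and assumes the scaling function $\sigma(u) = [1 - \Phi(u)]/\varphi(u)$ (the Mills ratio). Both choices are assumptions of this example rather than part of the paper. It reports the largest gap, over a grid, between the excess distribution and the exponential form of the GPD.

```python
import math


def gpd_cdf(x, xi, sigma):
    """Two-parameter GPD distribution function with shape xi and scale sigma."""
    if x <= 0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-x / sigma)
    base = 1.0 + xi * x / sigma
    if base <= 0.0:  # beyond the upper endpoint when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def normal_sf(z):
    """Survival function 1 - Phi(z) of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def normal_pdf(z):
    """Density of the standard normal."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def sup_distance(u, grid_step=0.01, grid_max=6.0):
    """Largest gap on a grid between the normal excess distribution and the GPD."""
    sigma_u = normal_sf(u) / normal_pdf(u)  # assumed scaling function sigma(u)
    worst = 0.0
    for k in range(1, int(grid_max / grid_step) + 1):
        x = k * grid_step
        excess = 1.0 - normal_sf(u + x) / normal_sf(u)
        worst = max(worst, abs(excess - gpd_cdf(x, 0.0, sigma_u)))
    return worst


distances = {u: sup_distance(u) for u in (1.0, 2.0, 3.0, 4.0)}
for u, d in distances.items():
    print(f"u = {u}: sup distance = {d:.4f}")
```

The gap shrinks as the threshold rises, although slowly, which is consistent with the asymptotic nature of the result.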
2.2. Identifying the Tail Region
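Equations (1) and (2) suggest that, for a sufficiently high threshold $u$, we can write:

$$\frac{F_X(x+u) - F_X(u)}{1 - F_X(u)} \approx G_{\xi,\sigma}(x).$$

Setting $y = x + u$,

$$F_X(y) \approx \left[1 - F_X(u)\right] G_{\xi,\sigma}(y - u) + F_X(u). \tag{3}$$

The right-hand side of Equation (3) can be simplified into the distribution function of a GPD with shape $\xi$, location $\tilde{\mu}$ and scale $\tilde{\sigma}$:

$$F_X(y) \approx G_{\xi,\tilde{\mu},\tilde{\sigma}}(y), \tag{4}$$

where $\tilde{\sigma} = \sigma \left[1 - F_X(u)\right]^{\xi}$ and $\tilde{\mu} = u - \tilde{\sigma} \left( \left[1 - F_X(u)\right]^{-\xi} - 1 \right) / \xi$.

Hence, if the GPD can be fitted to the conditional distribution of the excess above a high threshold, it can also be fitted to the tail of the original distribution above a certain threshold.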
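The step from the excess distribution to the tail of the original distribution can be verified numerically. The sketch below assumes the standard reparametrisation for $\xi \ne 0$, namely $\tilde{\sigma} = \sigma [1 - F_X(u)]^{\xi}$ and $\tilde{\mu} = u - \tilde{\sigma}([1 - F_X(u)]^{-\xi} - 1)/\xi$, and uses a hypothetical loss that is itself GPD distributed, so that the approximation is exact and the rescaled parameters must reproduce the original ones.

```python
def gpd_cdf(x, xi, mu, sigma):
    """Three-parameter GPD distribution function (xi != 0 assumed)."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    base = 1.0 + xi * z
    if base <= 0.0:  # beyond the upper endpoint when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def tail_parameters(xi, sigma, u, f_u):
    """Location and scale of the GPD matching the tail of F_X above u."""
    tail_mass = 1.0 - f_u
    sigma_tilde = sigma * tail_mass ** xi
    mu_tilde = u - sigma_tilde * (tail_mass ** (-xi) - 1.0) / xi
    return mu_tilde, sigma_tilde


# Hypothetical loss distribution: a GPD with shape XI, location 0, scale SIGMA0.
XI, SIGMA0 = 0.25, 1.5


def loss_cdf(x):
    return gpd_cdf(x, XI, 0.0, SIGMA0)


u = 2.0
sigma_excess = SIGMA0 + XI * u  # scale of the excess over u for this loss model
mu_tilde, sigma_tilde = tail_parameters(XI, sigma_excess, u, loss_cdf(u))
print(f"mu_tilde = {mu_tilde:.6f}, sigma_tilde = {sigma_tilde:.6f}")

for y in (2.5, 4.0, 10.0):
    mixture = (1.0 - loss_cdf(u)) * gpd_cdf(y - u, XI, 0.0, sigma_excess) + loss_cdf(u)
    direct = gpd_cdf(y, XI, mu_tilde, sigma_tilde)
    print(y, mixture, direct, loss_cdf(y))
```

For this loss model the three printed values agree for every $y$ above $u$, and the recovered location and scale equal the original 0 and 1.5 up to rounding.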
When $u$ is fixed at $u^*$, $u^*$ is the minimum value of $y$ for which Equation (4) holds. The deviation of $F_X(y)$ from $G_{\xi,\tilde{\mu},\tilde{\sigma}}(y)$ is, therefore, non-zero for $y < u^*$ and is expected to be zero for all $y \ge u^*$. We may consider an indicator, viz. the cumulative square deviation up to $y$,

$$D(y) = \sum_{t \le y} \left[ F_X(t) - G_{\xi,\tilde{\mu},\tilde{\sigma}}(t) \right]^2,$$

which may be useful for identifying $u^*$. By its nature, $D(y)$ is an increasing function of $y$ for $y < u^*$ and is nearly flat for $y \ge u^*$. Therefore, the slope of $D(y)$ is positive for $y < u^*$ and almost zero for $y \ge u^*$. We can identify the cut-off point $u^*$ as the point after which the slope of $D(y)$ is statistically insignificant. To check this, we plotted $D(y)$ versus $y$ for the normal and t distributions (Figure 1). In both cases $D(y)$ is almost flat after a certain cut-off, which supports our postulate.

Therefore, we can divide the underlying distribution into two parts. The region $y \ge u^*$ is the risky region of the distribution, in the sense that it can be approximated by the tail of an equivalent GPD; all large unforeseen losses belong to this part. Conversely, $y < u^*$ is the region of the distribution that does not cause severe tail risk.
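The behaviour of the cumulative square deviation can be sketched for the normal case shown in panel (a) of Figure 1 (mean 0, standard deviation 1.76). Several ingredients are assumptions of this example and are not specified in the text: the threshold is set at the 90% quantile, the GPD is matched to the exact first two moments of the excess, and the deviation is accumulated over an evenly spaced grid.

```python
from statistics import NormalDist


def gpd_cdf(x, xi, mu, sigma):
    """Three-parameter GPD distribution function (xi != 0 assumed)."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    base = 1.0 + xi * z
    if base <= 0.0:  # beyond the upper endpoint when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


loss = NormalDist(mu=0.0, sigma=1.76)  # as in panel (a) of Figure 1
u = loss.inv_cdf(0.90)                 # hypothetical threshold choice

# Exact mean and variance of the excess X - u given X > u for a normal loss.
std = NormalDist()
z = (u - loss.mean) / loss.stdev
hazard = std.pdf(z) / (1.0 - std.cdf(z))
mean_excess = loss.stdev * (hazard - z)
var_excess = loss.variance * (1.0 + z * hazard - hazard ** 2)

# Moment-matched GPD for the excess (an assumed fitting rule).
ratio = mean_excess ** 2 / var_excess
xi = 0.5 * (1.0 - ratio)
sigma = 0.5 * mean_excess * (1.0 + ratio)

# Rescale to the tail of the original distribution.
tail_mass = 1.0 - loss.cdf(u)
sigma_tilde = sigma * tail_mass ** xi
mu_tilde = u - sigma_tilde * (tail_mass ** (-xi) - 1.0) / xi

# Cumulative square deviation D(y) on a grid from -6 to 8.
grid = [-6.0 + 0.05 * k for k in range(281)]
d_values = []
running = 0.0
for y in grid:
    running += (loss.cdf(y) - gpd_cdf(y, xi, mu_tilde, sigma_tilde)) ** 2
    d_values.append(running)

d_at_u = max(d for y, d in zip(grid, d_values) if y <= u)
d_final = d_values[-1]
print(f"u = {u:.3f}, D(u) = {d_at_u:.4f}, D(end) - D(u) = {d_final - d_at_u:.2e}")
```

In this setting almost all of the accumulated deviation arises below the threshold, so $D(y)$ rises and then flattens, in line with the description above.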
2.3. Measuring the Tail Risk
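For a small tail probability $p$, taken as the probability mass of the loss distribution above the cut-off $u^*$, we can write

$$p = 1 - F_X(u^*) \approx 1 - G_{\xi,\tilde{\mu},\tilde{\sigma}}(u^*). \tag{5}$$

VaR represents, in probabilistic terms, a quantile of the loss distribution function $F_X$. Therefore,

$$\mathrm{VaR}_p = F_X^{-1}(1 - p) = u^*. \tag{6}$$

Equations (5) and (6) lead to an interesting inference: when the form of the underlying distribution $F_X(\cdot)$ is known, $p$ and $\mathrm{VaR}_p$ can be estimated simultaneously. Majumder named the new risk measure non-subjective Value-at-Risk ($\mathrm{VaR}_{N\text{-}S}$).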
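The simultaneous determination of the probability level and VaR can be sketched as follows. The sketch reads the two relations as $p = 1 - F_X(u^*)$ and $\mathrm{VaR}_p = F_X^{-1}(1 - p)$, which is an interpretation adopted for this example; the normal loss and the cut-off value are hypothetical.

```python
from statistics import NormalDist


def non_subjective_var(loss, u_star):
    """Return (p, VaR_p) implied by a tail cut-off u_star for a known loss law."""
    p = 1.0 - loss.cdf(u_star)     # tail size: probability of exceeding u_star
    var_p = loss.inv_cdf(1.0 - p)  # quantile of order 1 - p, equal to u_star
    return p, var_p


loss = NormalDist(mu=0.0, sigma=1.76)  # hypothetical loss distribution
u_star = 2.5                           # hypothetical cut-off, not an estimate
p, var_p = non_subjective_var(loss, u_star)
print(f"p = {p:.4f}, VaR = {var_p:.4f}")
```

Once the cut-off is fixed by the shape of the distribution, neither the probability level nor the VaR requires a separate judgement-based input.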
Figure 1. Plot of D(y) versus y for the normal and t distributions. (a) Plot of D(y) based on the normal distribution (mean: 0, standard deviation: 1.76); (b) Plot of D(y) based on the t distribution (degrees of freedom: 2.18).
2.4. Simulation Study for Threshold Choice
When the form of the underlying loss distribution $F_X(\cdot)$ is known, we can develop a procedure for estimating the threshold $u^*$ by simulation. Recall from the preceding section that there is a sufficiently high threshold $u$ above which the distribution function of the excesses can be approximated by the distribution function of a GPD, $G_{\xi,\sigma}$. Initially, we fix $u$ at some $u'$ and generate 100 samples, each of size 4000, from the underlying distribution $F_X$. If $u'$ is the true threshold, then, for the $j$-th sample, the deviation of $F_X(y)$ from $G^{(j)}_{\xi,\tilde{\mu},\tilde{\sigma}}(y)$, the GPD of Equation (4) with parameters estimated from that sample, is expected to be zero for all $y \ge u'$. We may consider an indicator, viz. the cumulative square deviation for $y \ge u'$,

$$D_j(u') = \sum_{y \ge u'} \left[ F_X(y) - G^{(j)}_{\xi,\tilde{\mu},\tilde{\sigma}}(y) \right]^2,$$

where the sum runs over the observations $y$ of the $j$-th sample with $y \ge u'$, which may be useful for identifying the threshold. If $u'$ is the true threshold, $D_j(u')$ would be zero for each sample. Based on this indicator, we can form a mean squared error (MSE):

$$\mathrm{MSE}(u') = \frac{1}{100} \sum_{j=1}^{100} \frac{D_j(u')}{n_j},$$

where $n_j$ is the number of observations in the $j$-th sample exceeding $u'$. $\mathrm{MSE}(u)$ can be computed for various values of $u$ starting from 0. The best estimate of $u$ (say $\hat{u}$) is the one for which $\mathrm{MSE}(u)$ is minimum.
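The simulation procedure can be sketched as below. Several details are assumptions of this example because the text does not fix them: the loss is normal with standard deviation 1.76, the GPD is fitted to the excesses by the method of moments, the squared deviations are evaluated at the exceedances of each sample, and the function names are hypothetical. The demonstration uses fewer and smaller samples than the 100 samples of size 4000 mentioned above, purely to keep the run short.

```python
import math
import random
from statistics import NormalDist, mean, variance


def gpd_cdf(x, xi, sigma):
    """Two-parameter GPD distribution function."""
    if x <= 0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-x / sigma)
    base = 1.0 + xi * x / sigma
    if base <= 0.0:  # beyond the upper endpoint when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (an assumed estimator, chosen for brevity)."""
    m, v = mean(excesses), variance(excesses)
    ratio = m * m / v
    return 0.5 * (1.0 - ratio), 0.5 * m * (1.0 + ratio)


def threshold_mse(loss, u, n_samples, sample_size, rng, min_exceedances=10):
    """Average over samples of the per-exceedance squared deviation above u."""
    f_u = loss.cdf(u)
    scores = []
    for _ in range(n_samples):
        # Sampling via gauss() means this sketch only handles a normal loss.
        sample = [rng.gauss(loss.mean, loss.stdev) for _ in range(sample_size)]
        tail = [y for y in sample if y > u]
        if len(tail) < min_exceedances:
            continue  # too few exceedances to fit a GPD
        xi, sigma = fit_gpd_moments([y - u for y in tail])
        d_j = sum(
            (loss.cdf(y) - (f_u + (1.0 - f_u) * gpd_cdf(y - u, xi, sigma))) ** 2
            for y in tail
        )
        scores.append(d_j / len(tail))
    return mean(scores) if scores else float("inf")


def choose_threshold(loss, candidates, n_samples=100, sample_size=4000, seed=0):
    """Return the candidate threshold with the smallest MSE and all MSE values."""
    rng = random.Random(seed)
    mse = {u: threshold_mse(loss, u, n_samples, sample_size, rng) for u in candidates}
    return min(mse, key=mse.get), mse


loss = NormalDist(0.0, 1.76)              # hypothetical loss distribution
candidates = [0.5 * k for k in range(7)]  # thresholds 0.0, 0.5, ..., 3.0
best_u, mse = choose_threshold(loss, candidates, n_samples=20, sample_size=1000)
for u, value in mse.items():
    print(f"u = {u:.1f}: MSE = {value:.3e}")
print("selected threshold:", best_u)
```

The selected value depends on the candidate grid, the estimator and the sample sizes, so the output should be read as an illustration of the mechanics rather than as a reproduction of the paper's results.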
3. Empirical Findings
Table 1 reports VaR and $\mathrm{VaR}_{N\text{-}S}$ based on daily returns on the S&P 500 Composite Index for the 30-year period from 18 February 1985 to 17 February 2015, computed using five risk models separately for the full sample and for a simulated stress scenario. The stress scenario is simulated along the lines of Studer and Breuer and Krenn, who employed the Mahalanobis distance as a mathematical tool for choosing stress scenarios. Additionally, the conditional EVT framework proposed by McNeil and Frey was adopted to compute $\mathrm{VaR}_{N\text{-}S}$ for GARCH. For each risk model, in the normal as well as the turbulent period, the equilibrium probability level¹ in $\mathrm{VaR}_{N\text{-}S}$ lies between 0.05 and 0.1, and the estimate of $\mathrm{VaR}_{N\text{-}S}$ lies between $\mathrm{VaR}_{0.1}$ and $\mathrm{VaR}_{0.05}$. Furthermore, as with the conventional model, the estimate of $\mathrm{VaR}_{N\text{-}S}$ in the stress scenario is greater than that for the full sample, indicating that the new risk measure correctly captures the riskiness of markets. Hence, the estimates of $\mathrm{VaR}_{N\text{-}S}$ are not so arbitrary that they cannot be accepted as a risk measure. Interestingly, the standard error of the probability level is low (highest value: 0.024, for the unconditional normal model). This indicates that the additional volatility in VaR due to the introduction of time variation in the probability level would be limited.
Table 1. A comparison between VaR and $\mathrm{VaR}_{N\text{-}S}$ based on the S&P 500 Composite Index.

Note: VaR and $\mathrm{VaR}_{N\text{-}S}$ are averages based on 50 estimates. The standard error of each estimate is provided in parentheses. Data source: Datastream.

4. Conclusion

The recurring criticism of the existing framework of market risk management has taken two leading directions: 1) it is often not possible to find a risk model that accurately predicts the data generating process, and 2) input parameters are judgement-based, which makes the risk measure subjective. Precision in predicting the data generating process, however, depends on the skill and expertise of the risk modeller, and so it is more an art than a science. On the other hand, non-subjectivity in the selection of input parameters is attainable. It can be achieved if the risk tolerance level and the threshold are determined simultaneously by the risk model. Based on this insight, we have improved on the VaR model by allowing time variation in the risk tolerance level. Our empirical study based on the S&P 500 Composite Index reveals that the tail risk of the loss distribution is well captured by the new risk measure in the normal as well as the stress scenario. The significance of the research is twofold: a) reduction of bias by minimising the scope for human intervention in risk measurement, which is of practical as well as social significance, and b) methodical gauging of risk appetite, which is of academic significance. The approach may widen the applicability of tail-related risk models in institutional and regulatory policymaking. At this stage, however, it is not possible to provide a method for backtesting the new VaR model. This might be a topic for future research.
The author is grateful to Prof. Romar Correa, former Professor of Economics, University of Mumbai, and Prof. Raghuram Rajan, Katherine Dusak Miller Distinguished Professor of Finance at the University of Chicago Booth School of Business and former Governor of the Reserve Bank of India, for their insightful comments and suggestions. He is also thankful to Dr. Chitro Majumdar, Chief Science Officer, RsRL (R-square Risk Lab), for his contribution and inspiration at the initial stage of this study.
*The views expressed in this paper are those of the author and not of the organisation to which he belongs.
¹ One minus the probability level is the risk tolerance level.