1. Introduction and Motivation
When annuity payments are concerned, the calculation of expected present values (EPVs), needed in pricing and reserving, requires an appropriate mortality model in order to avoid biased valuations   . Frailty models are used in life insurance to represent heterogeneity in a population due to unreported risk factors  . Heterogeneity due to reported risk factors is addressed at policy issue during the underwriting process. There has been a growing literature on dependence mortality modeling in life insurance in recent years (see, e.g.,    ) where the focus has been on either the lower-tail dependence alone or upper-tail dependence alone. Frailty dependence modeling     is an approach that accounts for dependence in event times of related individuals. Clustered survival times are assumed to be conditional independent with respect to the shared risk. Our strategy is to use the conditional independence assumption to account for lower and upper-tail dependence for a given association measure. Dependence frailty models have been used by several authors with applications in medical field (see e.g.,  ). This article implements the dependence frailty approach in insurance risk management setting.
Our article contributes to the existing literature in several ways. First, we apply a shared frailty approach to life insurance risk problems where most applications have been within medical setting. Second, most dependence models in literature have focused on either the lower-tail dependence alone or upper-tail dependence alone. We apply the conditional independence assumption for an observed association measure in a positive stable frailty to account for both lower and upper-tail dependence. Third, we incorporate a stochastic dependence structure via a dynamically evolving positive stable process to model time-varying shared risk.
2. Notations, Assumptions and Data
To facilitate an easier discussion of frailty dependence modeling, the following notations are used. We consider joint-life annuity contracts where x is the age of the male annuitant whose future lifetime random variable is and y the age of the female annuitant whose future lifetime random variable is . is the present value factor, is the probability of survival until the last of dies and the EPV of the benefit. and are commutation functions.
For simplicity of notation of the commutation functions, we suppose that the limiting age of our mortality table is infinity.
The force of mortality for a life aged x during time t is assumed piece-wise constant across each whole year of age . i.e. . Similar assumptions are found in .
A deterministic financial structure is adopted for the present value factor for illustration purposes.
Model calibration with reference to standard mortality tables    is applied due to limited joint-life mortality data-set that is available in the Kenyan annuity market.
The association of Kenyan insurers (AKI) 2010 published mortality tables is based upon data collected by the association of Kenya insurers for an investigation into the mortality of assured lives in the Republic of Kenya. The AKI 2010 mortality rates are used as the baseline hazard rates in the study. Joint-life and last survivor annuitants member data for policies in-force between 2001-2013 will be used in the study to determine the average age difference for insured couples in Kenya. The annuitants data used in this article was obtained from a major Kenyan insurance company. To preserve confidentiality, we took a sub-sample of 178 joint-life policies.
3. Materials and Methods
3.1. Joint-Life Last Survivor Annuity
The proposed model can be applied to any type of joint-life annuity business. In this article, we discuss the case of joint-life last survivor annuities. This is a contract that provides level payments to two or more annuitants until the last of them dies. This can be an immediate annuity for a single lump-sum payment whose expected present value (EPV) of say amount b per annum, payable in arrears, until the last of dies is given by:
where the future lifetime random variable
Under the dependence (frailty) assumption:
Assuming independence we have:
The EPV in Equation (1) becomes:
respectively. Here again for simplicity of notation, we suppose that the limiting age of our mortality table is infinity. If the purchase price for the annuity is p then the future level payment stream b applying the traditional equivalence principle is:
The prevailing traditional insurance practice assumes independence in pricing joint-life annuity contracts thereby adopting the EPV shown in Equation (5) whose joint mortality is obtained by summing up the individual mortality rates. Frailty dependence models focus on modeling the EPV as shown in Equation (4) whose joint mortality accounts for heterogeneity and dependence.
3.2. Shared Frailty Model
Definition 1. The hazard function for a shared frailty model is given by;
where is the conditional hazard function for the jth individual in the ith group and is the shared random effect associated with the ith group. is the population’s base force of mortality.
Formally, the expression of the bivariate net survival function is summarized in the following proposition.
Proposition 1. Under the assumption of independent future life-times for a given shared frailty the bivariate net survival function is;
Proof. The bivariate conditional survival function for a given shared frailty at time is given by; and from we have that using expectation this simplifies to
3.3. Shared Frailty and Archimedean Copula Approach
Copulas have been studied in actuarial science to model joint-life survival functions . Similarity between the frailty and copula dependence approach is discussed with reference to the elliptical or Archimedean dependence copula. The family of Archimedean copula   in the bivariate case is described by reference to a generator function.
where the generator function is any non-negative decreasing function and non-negative second derivative with and its pseudo-inverse function. A special case showing similarity to shared frailty approach is when where is the frailty random variable and this leads to:
Comparing this with the marginal survival from shared frailty approach i.e. and therefore shows that
If is the Laplace transform of a gamma distribution with scale parameter 1, then the Clayton copula model is obtained. Similarly, the Gumbel copula is obtained if has a positive stable Laplace though the estimation strategies and association measures differ with the frailty approach. Whereas in the copula approach, the marginal survivor functions and the dependence structure have to be specified   for the joint survivor to be constructed. In frailty models, the dependence structure is introduced indirectly.
3.4. The Positive Stable Frailty Model
The probability density function (p.d.f) for the positive stable distribution in closed-form is given by;
For identifiability we assume (see  for proof), then we have the standard case with only one parameter r.
Proposition 2. The Laplace transform is a special case of the Power Variance Family Laplace given by:
As indicated earlier for identifiability reasons we let .
The proposed frailty distribution has many advantages. First, it is easy to implement due to the simplified Laplace transform shown in Equation (11). Second, the positive stable has an infinite mean and variance. This allows for a much higher degree of heterogeneity to be accounted for that would not be possible by using a frailty distribution with finite variance. Third, the positive stable distribution is infinitely divisible, allowing the splitting of the shared risk into cause specific risks which may be easier to interpret. The net bivariate survival, density and hazard functions at time are:
In dependence frailty models, the frailty distribution is identifiable through the   cross-ratio function, which describes how association of the bivariate hazards evolves over time. The cross ratio measure for the first life if the second life has experienced the event rather than being event free at a given time is given by;
Using the positive stable as frailty distribution, the cross ratio function from Equation (15) becomes:
From Equation (16), values of r close to zero indicate high association between and because takes values greater than 1, r close to one indicate low association between and since takes values near 1 while corresponds to independence i.e. .
We present below four examples with specific baseline distributions to find the frailty dependence hazard functions with explicit expressions.
Let follow a Weibull( ) distribution with p.d.f where is the shape parameter and a the scale parameter.
Then the survival, hazard and cumulative hazard functions are;
From Equation (14) the positive stable Weibull (PSW henceforth) frailty dependence hazard is described explicitly as:
The Weibull distribution is widely used in the analysis of lifetime data because it is flexible enough to account for an increasing ( ), decreasing ( ) or constant ( ) hazard rate. Further, the law of Weibull is useful in mortality models for annuitants see e.g. .
Let follow a Lognormal( ) distribution with parameters the p.d.f is given by;
Then the survival, hazard and cumulative hazard functions are:
From Equation (14) the positive stable Lognormal frailty dependence hazard is:
The Lognormal is also used in modeling failure time data because it can take various unimodal shapes i.e. bathtub-shaped or hump-shaped.
Let follow a Gamma( ) with p.d.f
Then the survival, hazard and cumulative hazard functions are:
The Gamma is widely used in survival analysis to generate mixtures in exponential and Poisson models. It has positive support and is also a good choice for the baseline hazard.
From Equation (14) the positive stable Gamma frailty dependence hazard is described explicitly as:
3.5. Assessment of Model Selection Criteria
The performance of the model selection criteria for the Bayesian estimation technique is validated in a comparative study with the traditional MLE method. In the Bayesian method the deviance information criteria (DIC), Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC) is applied whereas in the MLE method the Standard Error information is used.
1) where: is the posterior mean of measuring the quality of the goodness-of-fit of the considered model to the data.
is the posterior mean of stochastic nodes and is the effective number of parameter. Smaller values of DIC indicate better models and could give negative values.
2) where: p = number of parameters of the model.
3) where: p= number of parameters of the model and n= sample size. The advantage of the BIC is that it includes the BIC penalty for the number of parameters being estimated.
The Bayesian method treats all unknown parameters as random variables in a statistical model and derives their distribution conditional upon known information. This method has been applied in actuarial modeling e.g. by  in analysis of simultaneous equations for insurance rate making and by  to analyze time-varying dependent data with possible variance shifts. The Bayesian parameter estimation strategy is implemented in the following algorithm run in OpenBUGS: First, the proposal distributions for the likelihood are specified as Weibull( ), Lognormal( ) and Gamma( ) respectively. Since we do not have prior information about baseline parameters non-informative prior distribution is picked and assumed to be flat. i.e. gamma distributed random variables with mean 1 and variance 10,000 for positive parameters and normal distribution of mean 1 and variance 10,000 for parameter that can take on positive or negative values. Similar approach is found in .
The hyperparameters of initial values are chosen to be MLE estimates determined outside of OpenBUGS using standard techniques e.g. for the Weibull . The actual data to be estimated by the model is specified to be the males and females densities obtained from the AKI 2010 mortality data through standard numerical approximations. Parameters are estimated considering only the range of ages [55, 109]. Burn in period is set at 2000 as per the BGR plot to ensure sequence of draws from the posterior distribution have minimal autocorrelation and can be found by taking values from a single run of the Markov chain. This diminishes the effect of the starting distribution. We run 3 chains in parallel and after 10,000 iterations convergence will be monitored and if stationarity has been achieved (implying estimates are not dependent on the prior distributions) the mean posterior distribution will be picked as a point estimate. Models with smaller values of the DIC, BIC or AIC are preferred.
The WinBUGS codes used to analyze the dataset using Weibull are available upon request.
Brooks-Gelman-Rubin Diagnostic and Trace Plots
The BGR convergence diagnostic plots for the monitored nodes are presented in Figure 1. As the MCMC simulation progresses, the values of the total-sequence (green curve) and mean within-sequence interval width (blue curve) estimates are monitored. Their ratio (red curve) is seen to converge to one beyond 2000 iterations hence a probable choice for the burn-in period. The dynamic trace
Figure 1. BGR diagnostic plot consistent with convergence and dynamic trace plots.
plots also monitored in Figure 1 is shown to be mean-reverting and the chains appear to mix freely implying stationarity has been achieved.
Comparison with MLE
The MLE approach is concerned with obtaining parameter values say, that maximizes the probability of observing the dataD given those parameters, . The likelihood function gives the probability of the observed sample generated by the model. Generally, maximization of the likelihood function to find the ML estimates is done algebraically, but can be computational intensive. In this article, the MLE algorithm is implemented using MASS package run in R. The output is given in Table 2 using the same distribution parametrization as in the Bayesian approach Table 1.
On the basis of Bayesian the Weibull distribution is chosen since the DIC, BIC and AIC are smallest compared to the other distributions. Also using the MLE approach the Weibull is also chosen because the Standard Errors is smallest compared to the other distributions. It is interesting to note from Table 1 and Table 2 that the parameter estimates produced by the two competing approaches agree with minimal deviations. So it may be claimed that Bayesian methods are the potential candidates to be used in any inferential procedures allowing for
Table 1. Bayesian parameter estimates, DIC, BIC and AIC values.
Table 2. MLE parameter estimates and standard error values.
informative priors shared by field experts. The parameter estimates used in the study are as shown in Table 1, following the implementation of Bayesian described.
Goodness of Fit Test
A chi-square goodness-of-fit test of the data for Weibull baseline distribution is as shown in Table 3. As observed in Table 3, the chi-squared p-value is greater than 5%. Thus, we fail to reject the null hypothesis that the AKI data follow a Weibull distribution at 5% level of significance. Similarly, the Kolmogorov-Smirnov (KS) hypothesis test in Table 4 between the empirical distribution function and the fitted distribution function shows a p-value greater than 0.05. We therefore fail to reject the null hypothesis that the AKI data follow a Weibull distribution at 5% level of significance. The Weibull Q-Q Plot in Figure 2 further shows a straight line through a majority of the quantiles this also shows that the Weibull provides a good fit. We can therefore conclude that the proposed distribution is a good fit for the data.
4. Model Calibration to the AKI 2010 Male and Female Published Rates
The PSW dependence model given in Equation (17) is shown in Figure 3 where ; obtained from the Bayesian estimation procedure.
Inspired by  and for illustration purpose, we discuss the empirical results for the patterns of Equation (17) with different degrees of dependence r. We
Table 3. Goodness of fit of Weibull to the AKI data.
Table 4. Goodness of fit using K-S test.
Figure 2. Empirical quantile plot of Weibull to AKI data.
Figure 3. Dependence hazards at various empirical dependence measures.
consider two dependence measures i.e. Spearman’s correlation (black curve) and Pearson correlation (red curve) when the age difference of insured couples is greater than four. As shown in Figure 3, using the above dependence measure shows a significant impact on the hazard compared with the independence (blue curve) assumption. Indeed, one cannot neglect the dependence, even moderate.
Equation (17) is then applied to generate the dependence hazard rates using Pearson correlation (to be consistent with the parametric model chosen) shown in Table 5. Real-life data from a Kenyan insurer shows an average age difference of the insured couples to be 6 years. The joint survival probability is . The EPV for joint-life annuity
where and are commutation functions.
The independence hazard rates in Table 5 are obtained by summing the AKI 2010 male and female mortality rates. The survival probability and EPVs are computed as above. The density function is obtained through standard numerical approximations. Assuming a purchase price of 1000, the level annuity payments are obtained as indicated.
Table 5. Independence and dependence life-table construction.
Impact on Mortality Rates
As shown in Table 5 is lower than during the early annuitants ages. This can be attributed to negative effects of dependence accounted for in the frailty model, thereby accounting for lower-tail dependence that is present. Thereafter, there is an overestimation of mortality risk in the independence model compared to the dependence model due to longevity risk (positive effects of dependence). Here, the upper-tail dependence has been accounted for. Thus the independence assumption underestimates deceleration in the mortality increase at very old ages.
Impact on Annuity EPVs
Consequently, comparing the annuity EPVs and payment streams shows that the independence assumptions lead to an overestimation of the insurer’s liability at the initial stages of the contract thereafter there is an underestimation of liability due to deceleration in the mortality increase at very old ages (longevity risk).
5. Concluding Remarks
Although there is rich literature in frailty dependence modeling, most applications have been in medical field. Various dependence models have been considered in actuarial literature; however, the focus has been on either the lower-tail dependence or upper-tail dependence.
This article presents the frailty dependence approach calibrated on the AKI 2010 male and female published mortality rates due to limited joint-life mortality data-set available in the Kenyan market. This methodology offers greater flexibility than the lower-tail or upper-tail dependence models while preserving closed-form expressions for the net survival functions. Our strategy is to apply the conditional independence assumption in a positive stable frailty approach to account for lower and upper-tail dependence. A positive stable frailty approach is then applied to construct dependence life-tables. The frailty joint-life mortality rates are proposed to generate life annuity payment streams in the competitive Kenyan market.
The conclusion reached is that comparing the independence mortality assumption with the dependence frailty model shows a decrease in the insurer’s expected liability at the early annuitant’s ages followed by an increase at later ages when dependence is accounted for. This can be explained by the fact that the frail couples shall have died during the early stages of the contract, thereafter deceleration in the mortality increase at very old ages (longevity risk), underscoring the importance of dependence modeling in pricing insurance products. Thus, assuming the joint lives to be independent could lead to biased annuity valuations.
We would like to acknowledge the Association of Kenya Insurer for the study conducted on assured lives during 2010 whose mortality rates are currently being used by insurance companies in Kenya and have been used in the study.
All authors have contributed equally in the development of this article.
 Gildas, R., Francois, D., Enkelejd, H. and Youssouf, T. (2018) On Age Difference in Joint Lifetime Modeling with Life Assurance Annuity Applications. Annals of Actuarial Science, 12, 350-371.
 Meyricke, R. and Sherris, M. (2013) The Determinants of Mortality Heterogeneity and Implications for Pricing Annuities. Insurance: Mathematics and Economics, 53, 379-387.
 D’Amato, V., Haberman, S. and Piscopo, G. (2017) The Dependency Premium Based on a Multifactor Model for Dependent Mortality Data. Communications in Statistics—Theory and Methods, 48, 50-61.
 Clayton, D.G. (1978) A Model for Association in Bivariate Lifetables and Its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence. Biometrika, 65, 141-151.
 Brouhns, N., Denuit, M. and Vermunt, K.J. (2002) A Poisson Log-Bilinear Regression Approach to the Construction of Projected Lifetables. Insurance: Mathematics and Economics, 31, 373-393.
 Cossette, H., Marceau, E., Mtalai, I. and Veilleux, D. (2017) Dependent Risk Models with Archimedean Copulas: A Computational Strategy Based on Common Mixtures and Applications. Insurance: Mathematics and Economics, 78, 53-71.
 Hong, L. and Yang, L. (2018) Modeling Cause-of-Death Mortality Using Hierarchical Archimedean Copula. Scandinavian Actuarial Journal, No. 3, 247-272.
 Czado, C., Kastenmeier, R., Brechmann, E.C. and Min, A. (2012) A Mixed Copula Model for Insurance Claims and Claim Sizes. Scandinavian Actuarial Journal, 4, 278-305.
 Scollnik, D.P.M. (1993) A Bayesian Analysis of a Simultaneous Equations Model for Insurance Rate-Making. Insurance: Mathematics and Economics, 12, 265-286.