In actuarial science and biostatistics we often encounter bivariate data which are already grouped into the cells of a contingency table; see Partrat (p 225) and Gibbons and Chakraborti (p 511-512) for examples. The primary focus is then the study of dependence, and we only want to make inference on the association parameters of the parametric survival copula used to model the dependence between the two components of the bivariate observations.
For complete data, in actuarial science or biostatistics we usually assume a sample of nonnegative bivariate observations $(X_i, Y_i)$, $i = 1, \ldots, n$, which are independent and identically distributed (iid) as $(X, Y)$, with bivariate survival function expressible as

$$S(x, y) = P(X > x, Y > y) = C_{\theta}(S_1(x), S_2(y)),$$

where $C_{\theta}$ is the survival copula function and $S_1$ and $S_2$ are the marginal survival functions. The bivariate model with deductibles in actuarial science given by Klugman and Parsa can be considered as having complete data within this framework, as we still have a sample of iid bivariate observations.
In this paper, we emphasize nonnegative distributions. In general we therefore use survival functions and survival copula functions, but it is not difficult to see that the statistical procedures developed can be adjusted to handle the situation where distribution functions and a distribution copula are used instead. If we use distribution functions, then the bivariate distribution function is

$$F(x, y) = C_{\theta}(F_1(x), F_2(y)),$$

where the marginal distribution functions are given respectively by $F_1$ and $F_2$. In the paper by Dobric and Schmid, distribution functions are used, as the authors emphasize financial applications instead of actuarial science applications. The statistical procedures are similar.
For illustration, we shall discuss a few examples of parametric models for survival copulas. In general, a survival copula can be viewed as a bivariate survival function, but the bivariate sample given by the complete data is not drawn directly from this bivariate survival function. This should be taken into account when developing inference methods, even when the data are complete. It is natural to seek procedures which provide a unified approach for grouped data and for complete data; complete data must then be grouped, so a rule for grouping them needs to be specified. We shall see that a rule for grouping the data is equivalent to a rule for choosing points on the nonnegative quadrant. We propose inference procedures which are based on quadratic distance and which lead to chi-square test statistics for the composite hypothesis that the survival copula belongs to the parametric family $\{C_{\theta}\}$, with the vector of parameters given by $\theta = (\theta_1, \ldots, \theta_m)'$. In most applications we need just one or two parameters, and the true vector of parameters for the copula model is denoted by $\theta_0$. Also, by copula we mean in general survival copula.
In actuarial science we often encounter grouped data; see Klugman et al. for the univariate case. Inference procedures for bivariate censored data have been developed by Shih and Louis (see also the review paper by Genest et al.), but inference procedures for grouped data do not seem to have received attention. Furthermore, although the chi-square test statistics proposed by Dobric and Schmid make use of a contingency table, complete data must be available first; the data are then transformed by the marginal empirical distribution functions and subsequently put into the cells of a contingency table. Chi-square tests can then be proposed by making use of the multinomial distributions induced by the contingency table. In practice, if data are grouped into a contingency table without being transformed, these test procedures are no longer applicable. Dobric and Schmid also note that chi-square test statistics can have good power along some directions of the alternatives while being simple to apply, and might therefore be of interest to practitioners.
We also know that chi-square test statistics in one dimension might not be consistent in all directions of the alternatives. Yet they are simple to apply: since there is a unique asymptotic chi-square distribution across the composite hypothesis, one can control the size of the test. Depending on the alternatives, and by carefully choosing the intervals which partition the real line, chi-square tests can still have good power against some directions of the alternatives, and in practice we are often primarily concerned with some types of alternatives rather than all alternatives. Because of these advantages, chi-square tests are still used even though there are more powerful tests such as the Cramér-Von Mises tests; see Greenwood and Nikulin (p 124-126) for power under contaminated mixture distribution alternatives and Lehmann (p 326-329) for discussions on how the power of chi-square tests relates to the way intervals are created to group the data in one dimension.
Therefore, if in two dimensions we can retain the advantage of the chi-square tests of having a unique chi-square distribution across the composite null hypothesis, and reduce the arbitrariness of the grouping rule, the inference procedures might still be attractive for practitioners, as implementing other test procedures might need extensive simulations to approximate a null distribution which depends on the unknown true parameter vector $\theta_0$.
In this paper, we develop minimum quadratic distance (MQD) procedures for grouped data. The procedures can be extended to the situation where complete data are available; the data must then be grouped by a specified rule which makes use of the Halton sequence of Quasi-Monte Carlo (QMC) numbers and two empirical quantiles from the two marginal distributions or marginal survival functions. Tests for copula models can be performed using chi-square test statistics with data already grouped, and if complete data are available they can be grouped according to a more clearly defined rule. As mentioned earlier, a rule to select cells to group the data is a rule to select points on the nonnegative quadrant to construct quadratic distances. If complete data are available, the rule is established using QMC methods and is based on the idea of selecting points in the nonnegative quadrant so that Cramér-Von Mises distances can be approximated by quadratic distances. The methods can also be applied to copula models with a singular component, provided that the copula function is differentiable with respect to the parameters given by $\theta$. An example of such a copula is the one parameter Marshall-Olkin (MO) copula; for discussions on MO copulas, see Dobrowolski and Kumar and Marshall and Olkin.
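We briefly list some copula models often encountered in practice. Most of them have just one or two parameters. A subclass of Archimedean copulas has a representation using a generator $\psi$ which is the Laplace transform (LT) of a nonnegative random variable. The class can be represented as

$$C(u, v) = \psi\left(\psi^{-1}(u) + \psi^{-1}(v)\right).$$

If we specify a gamma LT, written in a standard parametrization as $\psi(s) = (1 + s)^{-1/\theta}$ with $\theta > 0$, then we have the Clayton or Cook-Johnson copula model

$$C_{\theta}(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}.$$

If we specify a positive stable LT, $\psi(s) = \exp(-s^{\alpha})$ with $0 < \alpha \le 1$, then we have the positive stable copula model, which is also called the positive stable frailty model, with

$$C_{\alpha}(u, v) = \exp\left\{-\left[(-\log u)^{1/\alpha} + (-\log v)^{1/\alpha}\right]^{\alpha}\right\};$$

see Shih and Louis for these families, and for simulations from these copulas see the algorithms given by Mai and Scherer (p 98-99).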
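The Laplace transform construction can be sketched in a few lines of Python. This is only an illustration under assumed parametrizations (gamma LT $(1+s)^{-1/\theta}$ and positive stable LT $\exp(-s^{\alpha})$, which may differ from the ones used by Shih and Louis); the function names are hypothetical.

```python
import math

def archimedean_copula(psi, psi_inv):
    """Build C(u, v) = psi(psi_inv(u) + psi_inv(v)) from a generator psi."""
    def copula(u, v):
        return psi(psi_inv(u) + psi_inv(v))
    return copula

def clayton(theta):
    # Assumed parametrization: gamma LT psi(s) = (1 + s)^(-1/theta), theta > 0.
    return archimedean_copula(
        lambda s: (1.0 + s) ** (-1.0 / theta),
        lambda u: u ** (-theta) - 1.0,
    )

def positive_stable(alpha):
    # Assumed parametrization: psi(s) = exp(-s^alpha), 0 < alpha <= 1.
    return archimedean_copula(
        lambda s: math.exp(-(s ** alpha)),
        lambda u: (-math.log(u)) ** (1.0 / alpha),
    )

if __name__ == "__main__":
    print(clayton(2.0)(0.5, 0.5))          # (4 + 4 - 1)^(-1/2)
    print(positive_stable(1.0)(0.3, 0.6))  # alpha = 1 gives independence, u * v
```

With $\alpha = 1$ the positive stable model reduces to the independence copula, which gives a quick sanity check of the construction.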
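Besides this subclass, the one and two parameter Marshall-Olkin copula models are also frequently used. The two parameter MO model can be expressed as

$$C_{\alpha, \beta}(u, v) = u^{1-\alpha} v \ \text{ if } u^{\alpha} \ge v^{\beta} \quad \text{and} \quad C_{\alpha, \beta}(u, v) = u v^{1-\beta} \ \text{ if } u^{\alpha} < v^{\beta},$$

with $0 \le \alpha, \beta \le 1$. The model has a singular component, and if $\alpha = \beta = \theta$ the MO copula model has just one parameter, with

$$C_{\theta}(u, v) = u^{1-\theta} v \ \text{ if } u \ge v \quad \text{and} \quad C_{\theta}(u, v) = u v^{1-\theta} \ \text{ if } u < v.$$

Note that $C_{\theta}$ is singular on $u = v$ but, as a function of $\theta$, $C_{\theta}(u, v)$ is differentiable. For further discussions on MO copulas see Dobrowolski and Kumar, and see Ross (p 103-108) for simulations from MO copulas and Gaussian copulas. The Gaussian copula model can be represented by

$$C_{\rho}(u, v) = \int_{-\infty}^{\Phi^{-1}(u)} \int_{-\infty}^{\Phi^{-1}(v)} \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{-\frac{s^2 - 2\rho s t + t^2}{2(1-\rho^2)}\right\} ds\, dt,$$

with the standard normal univariate quantile function denoted by $\Phi^{-1}$; the integrand of the above integral is a bivariate normal density function with standard normal marginals and correlation parameter ρ.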
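As a small illustration of the Marshall-Olkin family, the sketch below evaluates the copula through the standard form $\min(u^{1-\alpha}v, uv^{1-\beta})$ and the derivative of the one parameter version with respect to its parameter. The function names and the symbols `a`, `b`, `theta` are illustrative assumptions.

```python
import math

def mo_copula(u, v, a, b):
    """Two parameter Marshall-Olkin copula, min(u^(1-a) * v, u * v^(1-b))."""
    return min(u ** (1.0 - a) * v, u * v ** (1.0 - b))

def mo_copula_one(u, v, theta):
    """One parameter version obtained by setting a = b = theta."""
    return mo_copula(u, v, theta, theta)

def d_mo_copula_one(u, v, theta):
    """Derivative of the one parameter MO copula with respect to theta.

    The two branches agree on the diagonal u = v, so the derivative in theta
    exists even though the copula has a singular component there.
    """
    if u >= v:
        return -math.log(u) * u ** (1.0 - theta) * v
    return -math.log(v) * u * v ** (1.0 - theta)

if __name__ == "__main__":
    print(mo_copula_one(0.5, 0.25, 0.5))
    print(d_mo_copula_one(0.5, 0.25, 0.5))
```

Setting `theta = 0` gives the independence copula and `theta = 1` gives `min(u, v)`, the two extremes of the family.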
Copulas are often used to create bivariate distributions and for inference procedures for these distributions for actuarial science, see Klugman and Parsa  , Klugman et al.  , Frees and Valdez  for examples.
Before giving further details and properties of MQD methods, we shall give the logic behind the MQD procedures.
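Let the bivariate empirical survival function be defined as

$$S_n(x, y) = \frac{1}{n} \sum_{i=1}^{n} I[X_i > x, Y_i > y],$$

with $I[\cdot]$ being the usual indicator function, and define the two univariate empirical marginal survival functions as

$$S_{n1}(x) = S_n(x, 0), \quad S_{n2}(y) = S_n(0, y);$$

we then have the following convergence in probability properties,

$$S_n(x, y) \xrightarrow{p} S(x, y), \quad S_{n1}(x) \xrightarrow{p} S_1(x), \quad S_{n2}(y) \xrightarrow{p} S_2(y),$$

where the true survival function and the marginal survival functions are given respectively by $S$, $S_1$ and $S_2$. We shall assume that $S_1$ and $S_2$ are absolutely continuous, and that $S$ is either absolutely continuous or absolutely continuous everywhere except when $x = y$, where the survival distribution can be singular as in the case of the bivariate exponential model introduced by Marshall and Olkin.
Now if the parametric survival copula model is valid,

$$S(x, y) = C_{\theta_0}(S_1(x), S_2(y)).$$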
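A minimal sketch of the empirical bivariate survival function and its two marginals is given below. The helper names are hypothetical, and the marginals are computed directly from each coordinate rather than by evaluating the joint function on an axis.

```python
def empirical_survival(sample):
    """Return S_n(x, y) = (1/n) * #{i : X_i > x and Y_i > y}."""
    n = len(sample)
    def s_n(x, y):
        return sum(1 for xi, yi in sample if xi > x and yi > y) / n
    return s_n

def empirical_marginals(sample):
    """Return the empirical marginal survival functions of X and Y."""
    n = len(sample)
    def s_n1(x):
        return sum(1 for xi, _ in sample if xi > x) / n
    def s_n2(y):
        return sum(1 for _, yi in sample if yi > y) / n
    return s_n1, s_n2

if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (0.5, 4.0)]
    s_n = empirical_survival(data)
    s_n1, s_n2 = empirical_marginals(data)
    print(s_n(1.0, 1.0), s_n1(1.0), s_n2(1.0))
```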
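For the time being assume that the M points given by $(s_l, t_l)$, $l = 1, \ldots, M$, are already chosen. With $S_n$ the bivariate empirical survival function and $S_{n1}$, $S_{n2}$ the empirical marginal survival functions, we can define the vector of empirical components,

$$z_n = (S_n(s_1, t_1), \ldots, S_n(s_M, t_M))',$$

with the counterpart vector which makes use of the copula model,

$$z_n(\theta) = (C_{\theta}(S_{n1}(s_1), S_{n2}(t_1)), \ldots, C_{\theta}(S_{n1}(s_M), S_{n2}(t_M)))',$$

and form the vector of differences $G_n(\theta) = z_n - z_n(\theta)$. By choosing a symmetric positive definite matrix $W$ we can form a class of quadratic distances (QD) given by

$$Q_n(\theta) = G_n'(\theta)\, W\, G_n(\theta).$$

A positive definite matrix can be used to create a weighted Euclidean norm, so we can also let

$$Q_n(\theta) = \|G_n(\theta)\|_W^2, \quad \|G_n(\theta)\|_W = \sqrt{G_n'(\theta)\, W\, G_n(\theta)},$$

where $\|\cdot\|_W$ is the weighted Euclidean norm induced by $W$; if we let $W = I$ then we obtain the classical Euclidean norm. The QD inference procedures developed subsequently are based on $Q_n(\theta)$ and are similar to those of the univariate case. For MQD procedures with univariate observations, see Luong and Thompson.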
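To make the construction concrete, the following sketch computes a quadratic distance between a vector of empirical components and a vector of model components for a given weight matrix. The function name and the plain-list representation are assumptions made for illustration; the identity matrix is used when no weight matrix is supplied.

```python
def quadratic_distance(z_emp, z_model, weight=None):
    """Return g' W g with g = z_emp - z_model; W defaults to the identity."""
    g = [a - b for a, b in zip(z_emp, z_model)]
    m = len(g)
    if weight is None:
        weight = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    return sum(g[i] * weight[i][j] * g[j] for i in range(m) for j in range(m))

if __name__ == "__main__":
    z_emp = [0.5, 0.3]
    z_model = [0.4, 0.5]
    print(quadratic_distance(z_emp, z_model))                            # unweighted
    print(quadratic_distance(z_emp, z_model, [[2.0, 0.0], [0.0, 1.0]]))  # weighted
```

In an estimation routine, `z_model` would be recomputed for each trial parameter value and the distance minimized numerically.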
The paper is organized as follows.
In Section 3, MQD methods are developed using predetermined grouped data, such as data presented in a contingency table. The efficient quadratic distance is derived and can be used for estimation and model testing. Asymptotic theory is established for MQD estimators, and chi-square tests using quadratic distances are constructed for testing copula models. In Section 4, by viewing grouped data as defining a set of points on the nonnegative quadrant, a rule to select points is proposed based on Quasi-Monte Carlo numbers and two sample quantiles, so that the methods can be extended to the situation where complete data are available. The methods can be seen as similar to minimum chi-square methods with random cells, but with a rule to define these cells; the choice of random cells for minimum chi-square methods is less well defined. Section 5 illustrates the implementation of MQD methods with a limited simulation study for the one parameter Marshall-Olkin model. The study compares the method of moments (MM) estimator based on the sample Spearman rho, which requires complete data, with the MQD estimator, which uses grouped data. It also appears from the study that the chi-square tests have some power to detect alternatives which can be represented as a mixture or contaminated copula model, such as a mixture of the one parameter Marshall-Olkin copula model and the Gaussian copula model. The findings appear to be in line with chi-square tests in one dimension, which display similar properties if intervals are chosen properly.
2. MQD Methods Using Grouped Data
2.1. Contingency Tables
Contingency table data can be viewed as a special form of two-dimensional grouped data. We will give some more details about this form of grouped data.
Assume that we have a sample $(X_i, Y_i)$, $i = 1, \ldots, n$, of observations which are independent and identically distributed as $(X, Y)$, which follows a nonnegative continuous bivariate distribution with model survival function given by $S(x, y) = C_{\theta}(S_1(x), S_2(y))$. The marginal survival functions are given respectively by $S_1$ and $S_2$ and are assumed to be absolutely continuous, but no parametric model is assumed for the marginals.
The vector of parameters is $\theta$, and the true vector of parameters is denoted by $\theta_0$. We do not observe the original sample: the observations are grouped and put into a contingency table, and only the numbers which fall into each cell of the contingency table are recorded or, equivalently, the sample proportions which fall into these cells are recorded. Contingency tables are often encountered in actuarial science and biostatistics, see Partrat (p 225) and Gibbons and Chakraborti (p 511-512), and we give a brief description below.
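Let the nonnegative axis X be partitioned into disjoint intervals $[s_{i-1}, s_i)$, $i = 1, \ldots, I+1$, with $s_0 = 0$ and $s_{I+1} = \infty$, and similarly let the axis Y be partitioned into disjoint intervals $[t_{j-1}, t_j)$, $j = 1, \ldots, J+1$, with $t_0 = 0$ and $t_{J+1} = \infty$.
The nonnegative quadrant can then be partitioned into nonoverlapping cells of the form $C_{ij} = [s_{i-1}, s_i) \times [t_{j-1}, t_j)$.
The contingency table formed can be viewed as a matrix with elements given by $n_{ij}$, the number of observations which fall into the cell $C_{ij}$.
The empirical bivariate survival function $S_n$ is as defined earlier, with $S$ the underlying bivariate survival distribution. We assume that $S$ is either absolutely continuous or has a singular component when $x = y$, as in the case of the bivariate exponential distribution of Marshall and Olkin, but is absolutely continuous elsewhere. Implicitly, the marginal survival functions $S_1$ and $S_2$ are assumed to be absolutely continuous.
The sample proportion, or empirical probability that one observation falls into the cell $C_{ij}$, can be obtained using

$$p_n(C_{ij}) = S_n(s_{i-1}, t_{j-1}) - S_n(s_i, t_{j-1}) - S_n(s_{i-1}, t_j) + S_n(s_i, t_j),$$

and the corresponding probability using the copula model coupled with the empirical marginal survival functions $S_{n1}$ and $S_{n2}$ is given by the same expression with $S_n(x, y)$ replaced by $C_{\theta}(S_{n1}(x), S_{n2}(y))$.
It is not difficult to see that a contingency table displays redundant information. One way to see the duplication is to note that, since $C_{\theta}(u, 1) = u$,

$$S_n(s_i, 0) = S_{n1}(s_i) = C_{\theta}(S_{n1}(s_i), S_{n2}(0)),$$

and similarly, $S_n(0, t_j) = S_{n2}(t_j) = C_{\theta}(S_{n1}(0), S_{n2}(t_j))$.
Therefore, the set of points given by $(s_i, 0)$ and $(0, t_j)$ can be discarded without affecting the information provided by the contingency table. Consequently, we can view a contingency table as implicitly defining a grid on the nonnegative quadrant with only the $IJ$ points $(s_i, t_j)$, $i = 1, \ldots, I$, $j = 1, \ldots, J$. It is also clear that if we want a rule to choose cells, the same rule will allow us to choose points on the nonnegative quadrant.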
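The passage from cell counts to values of the empirical survival function at the grid points can be sketched as follows. The layout is an assumption made for illustration: `counts[i][j]` holds the number of observations in the cell whose lower left corner is the i-th cut point on X and the j-th cut point on Y, with index 0 standing for the origin. Because the distribution is continuous, observations falling exactly on a cell boundary are ignored.

```python
def survival_from_table(counts):
    """Return S[i][j], the empirical survival function at the lower left
    corner of cell (i, j), computed from the cell counts by cumulating
    from the upper right corner of the table."""
    n = sum(sum(row) for row in counts)
    rows, cols = len(counts), len(counts[0])
    surv = [[0.0] * cols for _ in range(rows)]
    for i in reversed(range(rows)):
        for j in reversed(range(cols)):
            up = surv[i + 1][j] if i + 1 < rows else 0.0
            right = surv[i][j + 1] if j + 1 < cols else 0.0
            diag = surv[i + 1][j + 1] if (i + 1 < rows and j + 1 < cols) else 0.0
            surv[i][j] = counts[i][j] / n + up + right - diag
    return surv

def interior_points(surv):
    """Drop the first row and column, which only carry marginal information."""
    return [row[1:] for row in surv[1:]]

if __name__ == "__main__":
    table = [[1, 2], [3, 4]]
    s = survival_from_table(table)
    print(s)                   # first row and column are the marginal survival values
    print(interior_points(s))  # values used to build the quadratic distance
```

The first row and first column of the result reproduce the empirical marginal survival functions, which is the redundancy noted in the text.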
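The objective function of the proposed quadratic form is given below. It is a natural extension of the objective function used in the univariate case. Write the grid points as $(s_i, t_j)$, $i = 1, \ldots, I$, $j = 1, \ldots, J$. We define a vector with empirical components, so that only one subscript is needed, by collapsing the matrix of values of the contingency table given by

$$\left(S_n(s_i, t_j)\right), \quad i = 1, \ldots, I, \ j = 1, \ldots, J,$$

into a vector, putting the first row of the matrix as the first batch of elements of the vector, the second row as the second batch of elements, and so on, i.e., let

$$z_n = (S_n(s_1, t_1), \ldots, S_n(s_1, t_J), S_n(s_2, t_1), \ldots, S_n(s_I, t_J))',$$

and its counterpart which makes use of the copula model is

$$z_n(\theta) = (C_{\theta}(S_{n1}(s_1), S_{n2}(t_1)), \ldots, C_{\theta}(S_{n1}(s_I), S_{n2}(t_J)))'.$$

The number of components of $z_n$ is $M = IJ$, with the assumption $M > m$, where $m$ is the number of parameters.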
A class of quadratic distances can be defined as

$$Q_n(\theta) = G_n'(\theta)\, W\, G_n(\theta),$$

where $G_n(\theta)$ is the vector of differences between the empirical components and their copula model counterparts, and $W$ is a symmetric and positive definite matrix. In this class, we focus on two choices of $W$.
Letting $W = I$, we obtain the unweighted quadratic distance. This choice is not optimum, but it produces consistent estimators which can be used as preliminary estimates for $\theta_0$ to start the numerical procedures for finding more efficient estimators. The matrix $W$ is defined up to a positive constant, as minimizing the objective function multiplied by a positive constant still gives the same estimators, and a consistent estimate of $W$ can be used to replace $W$ without affecting the asymptotic theory for estimation or the asymptotic distribution of the test statistics. Using quadratic distance theory or generalized method of moments (GMM) theory, it is not difficult to see that an optimum choice for $W$ is to let $W = W_0 = \Sigma_0^{-1}$, where $\Sigma_0$ is the asymptotic covariance matrix of $\sqrt{n}\, G_n(\theta_0)$;
see Remark 2.4.3 given by Luong and Thompson (p 245).
Clearly, $\Sigma_0$ depends on $\theta_0$. In the next section we shall obtain the expression for $\Sigma_0$ and show that $W_0$ can be estimated by $\hat{W}_0$, as we can obtain a preliminary consistent estimate for $\theta_0$ by using the unweighted quadratic distance or other quick methods; see the method of moments using Spearman rho in Section 5.2 for example. Consequently, by quadratic distance we mean the following efficient version, with the objective function defined as

$$Q_n(\theta) = G_n'(\theta)\, \hat{W}_0\, G_n(\theta), \quad (8)$$

with $\hat{W}_0 = \hat{\Sigma}_0^{-1}$.
The version with $W = I$ will be called the unweighted quadratic distance. In the next section we shall use the influence function representation for $G_n(\theta_0)$ to derive $\Sigma_0$, and we shall also propose a consistent estimate for $\Sigma_0$.
2.2. Optimum Matrix W0
The matrix which is the asymptotic covariance matrix of the vector plays an important role for MQD methods as we can obtain estimators with good efficiencies for estimators using or a consistent estimate of and we also have chi-square tests statistics. Despite that is unknown, its elements are not complicated and moreover, it can be replaced by a consistent estimate without affecting the asymptotic properties of the procedures. We shall give more details about this matrix and construct , a consistent estimate of .
Using influence representation for the vector of functions of which depend on three functions as discussed by Reid  , see technical appendix (TA1) in the Appendices for more details, it can be seen that is the covariance matrix of the vector under with
and is the usual indicator function,
are respectively the partial derivatives of with respect to u and v.
It is not difficult to see that the elements of are
with and since and are not identically distributed is not symmetric, the matrix has 9 elements, see technical Appendix (TA2) in the Appendices for more details. The elements can be expressed as
The elements can be estimated empirically by replacing in the expressions of by for . The estimates can be formed.
Therefore, we can form which estimates . Similarly, by replacing by a consistent preliminary estimate which can be obtained using the unweighted quadratic distance for example and replacing by we can estimate by .
an estimate for will have the elements given by
and define $\hat{W}_0 = \hat{\Sigma}_0^{-1}$. The matrix $\hat{W}_0$ will be used as the optimum matrix for constructing the quadratic distance, as the asymptotic properties remain unchanged: we can replace the unknown matrix $W_0$ by its consistent estimate $\hat{W}_0$ without affecting the asymptotic theory for estimation and tests.
3. MQD Methods Using Grouped Data
The MQD estimators are given by the vector $\hat{\theta}$ which minimizes

$$Q_n(\theta) = G_n'(\theta)\, \hat{W}_0\, G_n(\theta);$$

we can also use the weighted Euclidean norm induced by $\hat{W}_0$ and let

$$Q_n(\theta) = \|G_n(\theta)\|_{\hat{W}_0}^2.$$

Consistency of quadratic distance estimators, whether the data are predetermined grouped data or complete data which must be grouped according to a rule, can be treated in a unified way using the following Theorem 1, which is essentially Theorem 3.1 of Pakes and Pollard (p 1038), where the proof is given. In fact, their Theorems 3.1 and 3.3 are also useful for Section 4, where we have complete data and we have choices for grouping the data into cells or, equivalently, for forming the artificial sample points on the nonnegative quadrant used to construct the quadratic distances.
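Theorem 1 (Consistency)
Under the following conditions, which follow Theorem 3.1 of Pakes and Pollard, $\hat{\theta}$ converges in probability to $\theta_0$:
1) $\|G_n(\hat{\theta})\| \le o_p(1) + \inf_{\theta \in \Omega} \|G_n(\theta)\|$, the parameter space Ω is compact
2) $G_n(\theta_0) = o_p(1)$
3) $\sup_{\|\theta - \theta_0\| > \delta} \|G_n(\theta)\|^{-1} = O_p(1)$ for each $\delta > 0$.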
Theorem 3.1 states condition b) as but in the proof the authors just use so we state condition b) as .
An expression is $o_p(1)$ if it converges to 0 in probability, $O_p(1)$ if it is bounded in probability, and $o_p(n^{-1/2})$ if it converges to 0 in probability faster than $n^{-1/2}$. The infimum of $\|G_n(\theta)\|$ occurs at the values of the MQD estimators, so conditions 1) and 2) are satisfied for both versions. Implicitly, we make the assumption that the parameter space Ω is compact. Also, for both versions, the limit in probability of $G_n(\theta)$ vanishes only at $\theta = \theta_0$ in general if the number of components of $G_n(\theta)$ is greater than the number of parameters of the model, i.e., $M > m$.
For we have for some since survival functions evaluated at points are components of and these functions are bounded. This implies that there exist real numbers u and v with such that
Therefore, the minimum quadratic distance (MQD) estimators are consistent, i.e., $\hat{\theta} \xrightarrow{p} \theta_0$. Theorem 3.1 of Pakes and Pollard (p 1038-1039) is an elegant theorem which uses the norm concept of functional analysis. We now turn to the question of asymptotic normality for the quadratic distance estimators. A unified approach is again possible using their Theorem 3.3, see Pakes and Pollard (p 1040-1043), which gives asymptotic normality for estimators obtained as the extremum of a smooth or nonsmooth objective function. We restate their theorem as Theorem 2 and Corollary 1 below, after the following discussion of the ideas behind it.
Note that (16)
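The points $(s_l, t_l)$, $l = 1, \ldots, M$, are predetermined by the given contingency table, and we have no choice but to analyze the grouped data as they are presented.
Let $G(\theta)$ denote the limit in probability of $G_n(\theta)$. Note that $G(\theta)$ is non-random, and if we assume that $G(\theta)$ is differentiable with respect to $\theta$ at $\theta_0$ with derivative matrix $\Gamma$, then we can define the random function $L_n(\theta)$ to approximate $G_n(\theta)$ with

$$L_n(\theta) = G_n(\theta_0) + \Gamma\, (\theta - \theta_0).$$

By using $\partial C_{\theta}/\partial \theta_j$, which is the partial derivative of $C_{\theta}$ with respect to $\theta_j$, the matrix $\Gamma$ can be displayed explicitly as the $M \times m$ matrix with elements

$$\Gamma_{lj} = -\left.\frac{\partial C_{\theta}(S_1(s_l), S_2(t_l))}{\partial \theta_j}\right|_{\theta = \theta_0}, \quad l = 1, \ldots, M, \ j = 1, \ldots, m.$$

Note that $\|L_n(\theta)\|_W^2$ is differentiable and is a quadratic function of $\theta$; the vector $\theta^*$ which minimizes $\|L_n(\theta)\|_W^2$ can be obtained explicitly, with

$$\theta^* - \theta_0 = -(\Gamma' W \Gamma)^{-1} \Gamma' W\, G_n(\theta_0),$$

and since $W$ is assumed to be a positive definite matrix and $\Gamma$ is of full rank, we have

$$\sqrt{n}\,(\theta^* - \theta_0) = -(\Gamma' W \Gamma)^{-1} \Gamma' W \sqrt{n}\, G_n(\theta_0).$$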
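For a model with a single parameter, the explicit minimizer of a linearized quadratic objective reduces to a ratio of two quadratic forms. The sketch below assumes the linearization $g_0 + \gamma(\theta - \theta_0)$, where `g0` stands for the vector of differences evaluated at $\theta_0$ and `gamma` for its derivative vector; these names and the scalar-parameter restriction are assumptions made for illustration.

```python
def linearized_minimizer(theta0, g0, gamma, weight):
    """Minimize (g0 + gamma * (theta - theta0))' W (g0 + gamma * (theta - theta0))
    over a scalar theta; returns theta0 - (gamma' W g0) / (gamma' W gamma)."""
    m = len(g0)
    num = sum(gamma[i] * weight[i][j] * g0[j] for i in range(m) for j in range(m))
    den = sum(gamma[i] * weight[i][j] * gamma[j] for i in range(m) for j in range(m))
    return theta0 - num / den

if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # g0 = -gamma * 0.5, so the linearized differences vanish at theta0 + 0.5.
    print(linearized_minimizer(0.3, [-0.5, -1.0], [1.0, 2.0], identity))
```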
Clearly, this set up fits within the scope of their Theorem 3.3. We shall rearrange the results to make them more suitable for MQD methods and verify that the regularity conditions of Theorem 3.3 can be satisfied. Theorem 2 and Corollary 1, stated below, are essentially their Theorem 3.3, and the proofs have been given by Pakes and Pollard. Note that condition 4) below is slightly more stringent, but simpler to check, than condition 3) in their theorem.
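Theorem 2
Let $\hat{\theta}$ be a vector of consistent estimators for $\theta_0$, the unique vector which satisfies $G(\theta_0) = 0$, where $G(\theta)$ is the limit in probability of $G_n(\theta)$.
Under the following conditions:
1) The parameter space Ω is compact, $\theta_0$ is an interior point of Ω.
2) $\|G_n(\hat{\theta})\| \le o_p(n^{-1/2}) + \inf_{\theta \in \Omega} \|G_n(\theta)\|$.
3) $G(\cdot)$ is differentiable at $\theta_0$ with a derivative matrix $\Gamma$ of full rank.
4) $\sup_{\|\theta - \theta_0\| \le \delta_n} \|G_n(\theta) - G(\theta) - G_n(\theta_0)\| = o_p(n^{-1/2})$ for every sequence of positive numbers $\{\delta_n\}$ which converge to zero.
5) $G_n(\theta_0) = O_p(n^{-1/2})$.
6) $\theta_0$ is an interior point of Ω.
Then, with $\theta^*$ the minimizer of the linear approximation of the objective function, we have the following representation, which will give the asymptotic distribution of $\hat{\theta}$ in Corollary 1, i.e.,

$$\sqrt{n}\,(\hat{\theta} - \theta_0) = \sqrt{n}\,(\theta^* - \theta_0) + o_p(1), \quad (22)$$

or equivalently, using equality in distribution,

$$\sqrt{n}\,(\hat{\theta} - \theta_0) \overset{d}{=} -(\Gamma' W \Gamma)^{-1} \Gamma' W \sqrt{n}\, G_n(\theta_0). \quad (23)$$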
The proofs of these results follow the results used to prove Theorem 3.3 given by Pakes and Pollard  (p 1040-1043). For expression (22) or expression (23) to hold, in general only condition 5) of Theorem 2 is needed and there is no need to assume that has an asymptotic distribution. From the results of Theorem 2, it is easy to see that we can obtain the main result of the following Corollary 1 which gives the asymptotic covariance matrix for the quadratic distance estimators for both versions.
Let , if then with
The matrices and depend on , we also adopt the notations .
We observe that when applying condition 4) of Theorem 2 to MQD methods in general involves technicalities. Note that to verify the condition 4, it is equivalent to verify
a regularity condition for the approximation is of the right order which implies the condition 3 given by their Theorem 3.3, which might be the most difficult to check. The rest of the conditions for Theorem 2 are satisfied in general.
and define which can be expressed as
Consequently, can also be expressed as
Since the elements of are bounded in probability, it is not difficult to see that the sequence is bounded in probability and continuous in probability with as . Also note that . Therefore, results given in section of Luong et al.  (p 218) can be used to justify the sequence of functions. attains its maximum on the compact set in probability and hence has the property as and .
Using results of Corollary 1, we have asymptotic normality for the MQD estimators which is given by
as given by expression (19) can be estimated once the parameters are estimated.
3.2. Model Testing
3.2.1. Simple Hypothesis
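In this section, the quadratic distance $Q_n(\theta)$ is used to construct goodness-of-fit test statistics for the simple hypothesis
H0: the data come from a specified distribution with survival copula $C_{\theta_0}$, where $\theta_0$ is specified. The chi-square test statistic, with its asymptotic chi-square distribution and its degrees of freedom, is

$$n\, Q_n(\theta_0) \xrightarrow{d} \chi^2_M.$$

It is not difficult to see that we indeed have the above asymptotic chi-square distribution, as $\sqrt{n}\, G_n(\theta_0) \xrightarrow{d} N(0, \Sigma_0)$ and $\hat{W}_0 \xrightarrow{p} \Sigma_0^{-1}$, using standard results for the distribution of quadratic forms, where $\Sigma_0$ is the asymptotic covariance matrix of $\sqrt{n}\, G_n(\theta_0)$ and $\hat{W}_0$ the estimated optimum weight matrix.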
3.2.2. Composite Hypothesis
The quadratic distances can also be used to construct test statistics for the composite hypothesis
H0: the data come from a parametric model $\{C_{\theta}\}$. The chi-square test statistic and its asymptotic distribution are given similarly in this case, with $\hat{\theta}$ the MQD estimator, by

$$n\, Q_n(\hat{\theta}) \xrightarrow{d} \chi^2_{M-m},$$
with .To justify the asymptotic chi-square distribution given above, note that we have the equality in probability, . It suffices to consider the asymptotic distribution of as we also have the following equalities in distribution,
as given by expression. Therefore we also have the following equalities in distribution, which can be reexpressed as
or equivalently, with .
and note that and the trace of the matrix is ; the rank of the matrix is also equal to its trace using the techniques as given by Luong and Thompson  (p 248-249).
4. Estimation and Model Testing Using Complete Data
In Section 4.1 and Section 4.2, we define a rule for selecting the points if complete data are available. Selecting points is equivalent to defining the cells used to group the data. We shall see that random cells are used, since the points are constructed from Quasi-Monte Carlo (QMC) numbers on the unit square multiplied by two chosen sample quantiles from the two marginal distributions. These points are random and can be viewed as sample points on the nonnegative quadrant forming an artificial sample. For minimum chi-square methods it appears to be difficult to have a rule to choose cells to group the data; see the discussions by Greenwood and Nikulin (p 194-208). We first need a few preliminary notions and tools. We define sample quantiles and view statistics as functionals of the sample distribution; the notion of influence function is also introduced, and this tool will be used to find the asymptotic variance of a functional.
We define the pth sample quantile of a distribution, as we shall need two sample quantiles from the marginal distributions, together with QMC numbers, to construct an approximation of an integral. Our quadratic distance based on selected points can be viewed as an approximation of a continuous version given by an integral, as in expression (33).
From a bivariate distribution we have two marginal distributions $F_1$ and $F_2$. The univariate pth sample quantile of the distribution $F_1$, assumed to be continuous, is based on the sample distribution function

$$F_{n1}(x) = \frac{1}{n} \sum_{i=1}^{n} I[X_i \le x],$$

and it is defined to be $F_{n1}^{-1}(p) = \inf\{x : F_{n1}(x) \ge p\}$; its model counterpart is given by $F_1^{-1}(p) = \inf\{x : F_1(x) \ge p\}$. We define similarly the qth sample quantile for the distribution $F_2$ as $F_{n2}^{-1}(q)$ and its model counterpart $F_2^{-1}(q)$, with $0 < p, q < 1$.
The sample survival function is defined as

$$S_{n1}(x) = 1 - F_{n1}(x).$$
The sample quantiles of the two marginals can be viewed as statistical functionals of the form $T(H_n)$, with $T(H) = H^{-1}(p)$ and $H_n$ the corresponding marginal sample distribution function. The influence function of $T$ is a valuable tool for studying the asymptotic properties of the statistical functional and is introduced below. Let $H$ be the true distribution and $H_n$ the usual empirical distribution which estimates $H$; also let $\delta_x$ be the degenerate distribution at $x$, i.e., $\delta_x(u) = 1$ if $u \ge x$ and $\delta_x(u) = 0$ otherwise. The influence function of $T$, viewed as a function of $x$, is defined as a functional directional derivative at $H$ in the direction of $\delta_x - H$. Letting $H_{\varepsilon} = H + \varepsilon(\delta_x - H)$, it is defined as

$$IC_T(x) = \lim_{\varepsilon \to 0} \frac{T(H_{\varepsilon}) - T(H)}{\varepsilon},$$

and is a linear functional.
Alternatively, it is easy to see that $IC_T(x) = \left.\frac{\partial T(H_{\varepsilon})}{\partial \varepsilon}\right|_{\varepsilon = 0}$, and this gives a convenient way to compute the influence function. It can be shown that the influence function of the pth sample quantile is given by

$$IC_T(x) = \frac{p - I[x \le H^{-1}(p)]}{h(H^{-1}(p))},$$

with $h$ being the density function of the distribution $H$, which is assumed to be absolutely continuous; see Huber (p 56) and Hogg et al. (p 593). A statistical functional with bounded influence function is considered to be robust (B-robust); consequently the pth sample quantile is a robust statistic.
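The sample quantile and its influence function can be sketched as below. The sample quantile is taken as the smallest order statistic at which the empirical distribution function reaches the level p, and the influence function uses the standard form for quantiles; the function names, and the need to supply the true quantile and the density value, are assumptions made for illustration.

```python
import math

def sample_quantile(xs, p):
    """Smallest x with F_n(x) >= p, i.e. the ceil(n * p)-th order statistic."""
    ordered = sorted(xs)
    k = max(1, math.ceil(len(ordered) * p))
    return ordered[k - 1]

def quantile_influence(x, p, xi_p, density_at_xi):
    """Influence function of the p-th quantile at x: (p - I[x <= xi_p]) / h(xi_p)."""
    indicator = 1.0 if x <= xi_p else 0.0
    return (p - indicator) / density_at_xi

if __name__ == "__main__":
    data = [4.0, 1.0, 3.0, 2.0]
    print(sample_quantile(data, 0.5))
    print(quantile_influence(0.0, 0.5, 1.0, 0.5), quantile_influence(2.0, 0.5, 1.0, 0.5))
```

The influence function takes only two values, which is one way to see that the sample quantile is robust.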
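Furthermore, as the approximation is based on a linear functional, the asymptotic variance of $T(H_n)$ is simply $V/n$, with $V$ being the variance of the influence function $IC_T(X)$, since in general we have $E[IC_T(X)] = 0$ and we have the following representation when $IC_T(x)$ is bounded as a function of $x$,

$$\sqrt{n}\,(T(H_n) - T(H)) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} IC_T(X_i) + o_p(1);$$

see Hogg et al. (p 593). Consequently, for functionals with bounded influence function we have in general, by means of the central limit theorem (CLT), the following convergence in distribution

$$\sqrt{n}\,(T(H_n) - T(H)) \xrightarrow{d} N(0, V).$$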
The influence function representation of a functional which depends only on one function such as is the equivalence of a Taylor expansion of a univariate function and the influence function representation of a functional which depends on many functions is the equivalence of a Taylor expansion of a multivariate function with domain in an Euclidean space and having range being the real line. Since we work with marginal survival functions, we define the pth sample quantiles of the marginals survival functions as
The influence functions for and can be derived using the definitions of influence functions or obtained from the influence functions of and .
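Subsequently, we shall introduce the Halton sequences with the bases $b_1 = 2$ and $b_2 = 3$; the first M terms are denoted by $(u_l, v_l)$, $l = 1, \ldots, M$.
We also use $H_M$ to denote the set of points $\{(u_l, v_l), l = 1, \ldots, M\}$. The sequence of points, which belong to the unit square, can be obtained as follows.
For $b_1 = 2$, we divide the interval $(0, 1)$ in halves ($1/2$), then in fourths ($1/4, 3/4$), and so on, to obtain the sequence $u_l$: $1/2, 1/4, 3/4, 1/8, 5/8, \ldots$.
For $b_2 = 3$, we divide the interval $(0, 1)$ in thirds ($1/3, 2/3$), then in ninths ($1/9, 4/9, 7/9, \ldots$), and so on, to obtain the sequence $v_l$: $1/3, 2/3, 1/9, 4/9, 7/9, \ldots$.
Note that the Halton numbers are deterministic and useful for approximating an integral. If we would like to compute numerically an integral of the form $A = \int_0^1 \int_0^1 g(u, v)\, du\, dv$, with $g$ being a bivariate function, then using the M terms of the Halton sequence and QMC principles it can be approximated as

$$A \approx \frac{1}{M} \sum_{l=1}^{M} g(u_l, v_l),$$

but if we are used to integration by simulation, we might want to think of the M terms as a quasi random sample of size M from a bivariate uniform distribution which is useful for approximating A.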
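The Halton points and the QMC approximation of an integral over the unit square can be sketched as follows, using the radical inverse construction with the bases 2 and 3. The function names are illustrative.

```python
def radical_inverse(index, base):
    """Reflect the base-b digits of index about the radix point."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * scale
        index //= base
        scale /= base
    return result

def halton_2d(m):
    """First m points of the two dimensional Halton sequence (bases 2 and 3)."""
    return [(radical_inverse(l, 2), radical_inverse(l, 3)) for l in range(1, m + 1)]

def qmc_integral(g, m):
    """Approximate the integral of g over the unit square by an average."""
    return sum(g(u, v) for u, v in halton_2d(m)) / m

if __name__ == "__main__":
    print(halton_2d(3))
    # The exact value of the integral of u * v over the unit square is 1/4.
    print(qmc_integral(lambda u, v: u * v, 1000))
```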
From observations which are given by iid with common bivariate survival distribution . Let the two marginal survival functions be denoted by and and they are absolutely continuous by assumption; also define the bivariate empirical distribution function which is similar to the bivariate empirical survival function as
The two empirical marginal survival functions are defined respectively by
We might want to think that we would like to approximate the following Cramer-Von Mises distance expressed as an integral given by
which is similar to univariate Cramér-Von Mises (CVM) distance and minimizing the distance with respect to will give the CVM estimator for , see Luong and Blier-Wong  for CVM estimation for example.
In the next section we shall give details on how to form a type of quasi sample or artificial sample of size M using the
We can see the expression (34) is an unweighted quadratic distance using the identity matrix as weight matrix instead of . The unweighted quadratic distance still produces consistent estimators but possibly less efficient estimators than estimators using the quadratic distance with for large samples and for finite samples the estimators obtained using might still have reasonable performances and yet being simple to obtained.
The set of points is a set of points proposed to be used to form optimum quadratic distances in case that complete data is available. We shall see the set of points depend on two quantiles chosen from the two marginal distributions and they are random consequently. We might want to think that we end up working with random cells.
As for minimum chi-square methods, if random cells stabilize into fixed cells, the methods in general have the same efficiency as methods based on the stabilized fixed cells; see Pollard (p 324-326) and Moore and Spruill for the notion of random cells. Quadratic distance methods share the same properties. The chosen points are random, but it will be shown that they do stabilize; these random points can therefore be viewed as fixed at the stabilized points, and their randomness does not affect the efficiency of the estimators or the asymptotic distributions of the goodness-of-fit test statistics which make use of them. These properties are discussed and studied in more detail in the next section, along with the introduction of an artificial sample of size M given by points on the nonnegative quadrant, which gives a guideline on how to choose points if complete data are available.
4.2. Halton Sequences and an Artificial Sample
From the M terms of the Halton sequences, we have .
Let and , we can form the artificial sample with elements given by with with . Note that we have the following relationships between empirical quantile based on distributions and survival functions with and .
We can view as a form of quasi-random sample on the nonnegative quadrant; these are the points proposed for use when complete data are available. In general, we might want to choose if and if n is small we try to ensure . Consequently as , M remains bounded. If , it might be difficult to obtain the matrix as might be nearly singular. In practice, we tend to replace by a near-optimum matrix obtained from by regularizing the eigenvalues of , which might be unstable and cause the matrix to be nearly singular so that is not available; see Section 5.1 for more discussion of these issues.
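The construction of the artificial sample can be sketched as follows. The Halton terms are computed with the radical inverse in bases 2 and 3, the usual choice in two dimensions. The step that maps them to the nonnegative quadrant, scaling each coordinate by a marginal sample quantile, is an assumption made for illustration, as are all function names and the default level 0.99.

```python
import math


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(m, bases=(2, 3)):
    """First m terms of the two-dimensional Halton sequence, each in (0, 1)^2."""
    return [tuple(radical_inverse(i, b) for b in bases) for i in range(1, m + 1)]


def empirical_quantile(values, p):
    """Smallest order statistic x whose empirical distribution value is at least p."""
    ordered = sorted(values)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]


def artificial_sample(xs, ys, m, p=0.99):
    """Assumed rule: scale the Halton terms by the two marginal sample quantiles."""
    qx, qy = empirical_quantile(xs, p), empirical_quantile(ys, p)
    return [(u * qx, v * qy) for u, v in halton_points(m)]
```

Because the points depend on the data only through the two sample quantiles, they settle down as the quantiles converge, which is the stabilization property discussed in the text.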
Since and , with and for and the points are non-random or fixed.
It turns out that the quadratic distances for both versions constructed with the points are asymptotically equivalent to the quadratic distances using the points , so that the asymptotic theory developed with the points considered fixed continues to be valid; we shall show that this is indeed the case. Similar conclusions have been established for minimum chi-square methods with random cells, provided that these cells stabilize to fixed cells; see Theorem 2 given by Pollard  (p 324-326). We define some notation to make the arguments easier to follow.
Define and similarly let
We work with the quadratic distance defined using , which leads us to consider quadratic forms of the form . Now, to emphasize that and depend on , we also use respectively the notations and and define
It suffices to verify that the results of Theorem 1, Theorem 2 and its corollary in Section 3 continue to hold.
Observe that we have
This also means that we have the same limit in probability for and as we have and .
It remains to establish .
Using results on the influence function representations for functionals as discussed, it suffices to show that the vector has the same influence representation as the vector to conclude that all the asymptotic results are valid even though the points are random.
We shall derive the influence functions for elements of the vector of functional and show that they are the same for the corresponding elements of the vector of functional . Let be the true bivariate survival function and under the parametric model being considered, and we also use the notation .
Let be the degenerate bivariate survival function at the point , i.e., if and and , otherwise.
Let the degenerate survival function at x be defined as
, otherwise. Similarly, let the degenerate survival function at
which is a contaminated bivariate survival function and
Similarly for the marginals,
Now, we consider the jth element of , with each
Clearly, depend on and
but we can use the influence function representation of , a technique proposed by Reid  (p 80-81) but in this case it will need three influence functions which are given by
which is bounded with respect to ,
and the expression is reduced to
by noting that the first two terms of the RHS of the above expression cancel each other since we have which implies
If we compare with the corresponding jth term of given by the functional , we can verify the functional has the same influence functions as the functional . It is not difficult to see that we have the equalities
Therefore, all the asymptotic results of Section 3 remain valid, and all these influence functions are bounded, so that inference methods making use of these functionals are robust in general. Furthermore, we can treat the inference procedures based on quadratic distances as if the points were non-random, since they can be replaced by without affecting the asymptotic results already established in Section 3. For more discussion of random cells and influence function techniques for minimum chi-square methods and related quadratic distance methods, see Luong  .
5. Numerical Issues and a Limited Study
5.1. Numerical Issues
In this section we consider the numerical problem of not being able to obtain the matrix because might be nearly singular, so that we need to replace by a near-optimum matrix obtained from . Techniques for regularizing a matrix were introduced by Carrasco and Florens  (p 809-810) for GMM estimation with a continuum of moment conditions. MQD methods can be viewed as similar to GMM with a finite number of moment conditions, so the techniques can also be applied to MQD methods. We use the spectral decomposition of to obtain its eigenvalues and eigenvectors; see Hogg et al.  (p 179) for the spectral decomposition of a symmetric positive definite matrix, which allows us to express
where the are positive eigenvalues with corresponding eigenvectors given by the of the matrix . Now, observe that
is not obtainable numerically. This is due to eigenvalues which are not stable. Regularizing leads to the following matrix , which hopefully is obtainable and approximates . It consists of perturbing the by a small positive number a and defining the approximate optimum matrix as
Carrasco and Florens  (p 809-810), for GMM estimation with a continuum of moment conditions, have shown that the asymptotic theory remains unchanged if at a suitable rate as . This condition is difficult to verify in practice. However, we might want to continue to use the asymptotic theory in an approximate sense, i.e., we can replace by and take the view that such a replacement does not modify the asymptotic theory in practice.
A more rigorous approach to justify the chi-square distribution for goodness-of-fit tests is to proceed in two steps, first using to construct the distance for estimation and letting be the vector which minimizes
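The eigenvalue perturbation described above can be sketched with NumPy as follows. The sketch adds a small positive constant a to every eigenvalue before inverting, which is one reading of the description in the text; the exact regularization scheme of Carrasco and Florens may differ. The function name and the default value of a are hypothetical.

```python
import numpy as np


def regularized_inverse(sigma, a=1e-6):
    """Approximate inverse of a symmetric positive semidefinite matrix.

    Each eigenvalue is shifted by a > 0 before inversion, so the result
    exists even when sigma is numerically singular.
    """
    sigma = np.asarray(sigma, dtype=float)
    sym = (sigma + sigma.T) / 2.0            # remove asymmetry caused by rounding
    eigvals, eigvecs = np.linalg.eigh(sym)   # spectral decomposition
    shifted = np.clip(eigvals, 0.0, None) + a
    # V diag(1 / (lambda_i + a)) V'
    return (eigvecs / shifted) @ eigvecs.T
```

A larger a gives a more stable but less accurate approximation to the optimum weight matrix, so in practice one would try a few small values and check that the estimates are not sensitive to the choice.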
Using Equation (31) we have
also see expression (3.4.2) given by Luong and Thompson  (p 248). The matrices and are respectively consistent estimates of and .
It suffices to find the Moore-Penrose generalized inverse of and construct the test statistic as
The asymptotic distribution of the test statistic will again be chi-square with degrees of freedom, using distribution theory for quadratic forms; see Luong and Thompson  (p 247) for example, and for generalized inverses, see Harville  (p 493-514).
Note that if can be used for estimation then we can let , i.e. there is no need to use two quadratic distances separately.
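A quadratic form built with a Moore-Penrose generalized inverse can be computed as in the sketch below. It relies on the standard result that if z is asymptotically normal with mean zero and covariance matrix cov, then z' cov⁺ z is asymptotically chi-square with degrees of freedom equal to the rank of cov. The names z and cov are placeholders; how they are formed from the second-step quantities is not reproduced here.

```python
import numpy as np


def generalized_inverse_statistic(z, cov):
    """Return (z' cov^+ z, rank of cov) using the Moore-Penrose inverse of cov."""
    z = np.asarray(z, dtype=float)
    cov = np.asarray(cov, dtype=float)
    cov_pinv = np.linalg.pinv(cov)            # Moore-Penrose generalized inverse
    statistic = float(z @ cov_pinv @ z)
    rank = int(np.linalg.matrix_rank(cov))    # degrees of freedom of the limit law
    return statistic, rank
```

The statistic would then be compared with the appropriate chi-square percentile for the returned degrees of freedom.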
5.2. A Limited Simulation Study
For the study, we fix the number of points . The two sample quantiles are the 0.99 quantiles (the 0.01 quantiles in terms of survival functions, if marginal empirical survival functions are used instead of distribution functions) for estimation without construction of goodness-of-fit tests. The points used are constructed using the procedures given in Section 4.2. We consider the one-parameter MO copula model with
if and if . (37)
is differentiable with respect to and is singular if and see Dobrowolski and Kumar  (p 2). For this model, the model
Spearman rho , see Dobrowolski and Kumar  (p 5).
The sample Spearman rho is simply the Pearson correlation coefficient but computed using ranks of the observations from the two empirical marginal distributions, see Conover  (p 314-318).
If complete data are available, equating gives the moment estimator
and one might expect that the moment estimator has reasonable efficiency as we only have one parameter in this model and the estimate is based on ranks.
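The moment estimator can be illustrated with a short sketch. It assumes that the one-parameter MO copula is the Cuadras-Augé form, for which Spearman's rho equals 3θ/(4 − θ), so that inverting gives θ = 4ρ/(3 + ρ). This parameterization is an assumption here and should be checked against expression (37); the function names are hypothetical.

```python
def ranks(values):
    """Ranks 1..n of the values (continuous data assumed, so ties are not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman_rho(xs, ys):
    """Sample Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


def mo_moment_estimate(rho):
    """Invert rho = 3*theta / (4 - theta), the ASSUMED Spearman rho of the model."""
    return 4.0 * rho / (3.0 + rho)
```

In small samples the sample rho can be negative, in which case the estimate falls outside the parameter range and would need to be truncated.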
The moment estimate can be used to compute , which is needed for chi-square tests and for estimation using quadratic distances. We use and there is no problem in inverting the matrix . Clearly, if the data are already grouped, we can use the unweighted quadratic distance to provide a consistent preliminary estimate for . The efficient MQD estimator is denoted by . Since many marginal survival functions could be used, in the simulation study we draw observations directly from the copula models. This is not what happens in real-life situations, but we want to test the procedures. We do not have the computing resources for a large-scale study trying various marginal survival functions. More work needs to be done, but we want to illustrate the procedures.
We use sample size and the number of samples used is . For comparison of the MQD estimator versus the method of moments (MM) estimator, we use the ratio of relative efficiency
where the mean square error of an estimator for is defined as
which can be estimated using N samples each of size n.
The unweighted QD estimator is denoted by as the identity matrix I is used for the unweighted quadratic distance. The corresponding
can similarly be used for comparison and it can be estimated using simulated samples.
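The simulated comparison can be computed along the following lines. The direction of the ratio, with the mean square error of the MQD estimator in the numerator, is an assumption, as are the function names.

```python
def mse(estimates, theta0):
    """Average squared error of simulated estimates around the true value theta0."""
    return sum((e - theta0) ** 2 for e in estimates) / len(estimates)


def relative_efficiency(mqd_estimates, mm_estimates, theta0):
    """Estimated MSE ratio (assumed direction: MQD over MM)."""
    return mse(mqd_estimates, theta0) / mse(mm_estimates, theta0)
```

Each list holds one estimate per simulated sample, so both lists have the length of the number of simulated samples.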
The range of the parameter being considered is . The results are summarized in the first table of Table 1, where we find that the MM estimator and the two quadratic distance estimators have practically equal efficiency, up to 4 or 5 decimal places.
To study the size and the power of the chi-square tests, let H0: the MO copula model with as given by expression (37) and . With , . Observations are drawn from the model specified by , which is a contaminated model given by
is as defined earlier, is the Gaussian copula defined as
Procedures to simulate from Gaussian and MO copulas are given in Chapter 6 of Ross  (p 97-108). We use $M = 25$ and $M = 35$ points with sample size $n = 3000$, the values reported in the notes to Table 1. Dobric and Schmid  (p 1060-1061) in their study have used and their chi-square tests have around 70 degrees of freedom. With , only occasionally is nearly singular; if this happens we discard the sample. We do not have the resources for a larger-scale study; each run takes around three minutes to complete. Because most of the time observations are drawn from an alternative model while the parameter of the MO model must still be estimated for testing, the algorithm tends to take time to converge. The study is very limited, as the number of simulated samples is small and only a few copula models are considered, but it seems to point to the potential uses of MQD chi-square tests. The tests, especially with , seem to have power along some directions which can be represented as mixture-type models, as shown by the means and standard deviations of the chi-square statistics displayed in the second and third tables of Table 1. More simulation work is needed to assess the power of the MQD tests using various copula models. There are not many statistical procedures for copula models using data that have already been grouped. MQD methods might be useful in this type of situation.
Table 1. Asymptotic relative efficiency comparisons for MQD estimators versus the MM estimator using N = 1000 samples of size n = 1000 for the one-parameter MO copula model.
(a) Power study using M = 25 points, n = 3000 and the alternative hypothesis specified as the contaminated model $(1 - \lambda) C_{MO}(u, v) + \lambda C_{Gaussian}(u, v)$, $0 < \lambda < 1$. Critical point for the test using the 95th percentile of a chi-square distribution, $\chi^2_{0.95}(24) = 36.41$.
(b) Power study using M = 35 points, n = 3000 and the alternative hypothesis specified as the contaminated model $(1 - \lambda) C_{MO}(u, v) + \lambda C_{Gaussian}(u, v)$, $0 < \lambda < 1$. The critical point for the test is the 95th percentile of a chi-square distribution.
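Drawing from the contaminated alternative $(1 - \lambda) C_{MO} + \lambda C_{Gaussian}$ can be sketched as below. The MO draws use a common-shock construction and assume that the one-parameter MO copula is the Cuadras-Augé form; the Gaussian draws use a standard bivariate normal pair passed through the normal distribution function. These are common constructions and not necessarily the ones given by Ross. All names, including the Gaussian correlation parameter rho, are hypothetical.

```python
import math
import random


def sample_mo(theta, rng):
    """One draw from the assumed one-parameter MO copula, for 0 < theta < 1."""
    e1 = rng.expovariate(1.0 - theta)   # shock affecting the first component only
    e2 = rng.expovariate(1.0 - theta)   # shock affecting the second component only
    e12 = rng.expovariate(theta)        # common shock
    x, y = min(e1, e12), min(e2, e12)   # each is exponential with rate 1
    return math.exp(-x), math.exp(-y)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sample_gaussian(rho, rng):
    """One draw from the Gaussian copula with correlation rho, for -1 < rho < 1."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return normal_cdf(z1), normal_cdf(z2)


def sample_contaminated(n, theta, rho, lam, seed=0):
    """n draws from the mixture (1 - lam) * MO + lam * Gaussian."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < lam:
            draws.append(sample_gaussian(rho, rng))
        else:
            draws.append(sample_mo(theta, rng))
    return draws
```

Setting lam to 0 gives draws under the null model, which is what a study of the size of the test would use.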
Minimum Quadratic Distance (MQD) methods offer a unified approach for estimation and model testing using grouped data in the form of a contingency table for parametric copula models, without having to assume parametric models for the marginal distributions. The methods share with minimum chi-square methods the property of having a unique asymptotic distribution across the composite hypothesis for testing, which makes the implementation relatively simple without requiring extensive simulations for approximating the null asymptotic distribution. It is shown in this paper that if complete data are available, a rule to define points based on QMC numbers can be proposed to alleviate the arbitrariness in the choice of points used to construct quadratic distances. The rule will also make quadratic distances close to Cramér-von Mises distances. It is well known that in one dimension, chi-square tests cannot be consistent against all alternatives, but if the intervals are chosen properly the tests can still have good power against some forms of alternatives considered to be useful for applications.
MQD test statistics with the rule for choosing points might preserve the same properties and, being relatively simple to implement, they can be useful for applied work. More numerical and simulation work is needed to further study the power of the MQD tests.
The helpful and constructive comments of a referee, which led to an improvement of the presentation of the paper, and the support from the editorial staff of Open Journal of Statistics in processing the paper are all gratefully acknowledged.
Technical Appendix 1 (TA1)
In this technical appendix, we shall consider influence function representation for the vector of functionals to justify the expression. is as given by expression (9) in Section 3.2.
Consider the l-th element of , it is given by
which is a functional which depends on three functions, but we can still apply the techniques given by Reid  (p 80) to have an influence representation of the functional. Since it depends on three functions we shall have three corresponding influence functions. Let with if and and , elsewhere; also, similarly let with if and , elsewhere and let with if and , elsewhere, with . Consequently,
The three influence functions are given respectively by
Consequently, we have the influence representation for the l-th element of with
and since are iid we have the equality in distribution asymptotically,
Equivalently, using vector notations we have the following equality in distribution asymptotically by letting
, a result which is needed in Section 3.2.
Technical Appendix 2 (TA2)
In this technical appendix, we shall justify the validity of expression (9) of Section 3.2.
The covariance matrix is defined as , the vector
Therefore the elements of the matrix are given by
Now, note that the above equalities which give the elements of the matrix can be reexpressed as the equalities as given by expression (9) in Section 3.2.