Count data arise in many fields of application, including actuarial science, and fitting discrete count models to such data is of interest. Classical methods such as maximum likelihood (ML) procedures often require the probability function of the model to have a closed form, and furthermore the related inference techniques do not lead to distribution-free statistics when Pearson statistics are used. In fact, if a model does not fit the data, better models can be created using a compounding procedure, a stopped-sum procedure or a mixing procedure, and the new models might provide a better fit as they can take into account features of the modeling process that were omitted earlier.
For discussions of these procedures see the books by Johnson et al. and Klugman et al. These improved models often do not have closed-form probability mass functions, but their probability generating functions often remain simple and have closed-form expressions.
For example, if count data display long-tailed behavior, so that the Poisson model does not provide a good fit, the discrete positive stable (DPS) distribution can be used as an alternative to the Poisson distribution. The DPS distribution does not have a closed-form or simple probability mass function, but its probability generating function is simple and, in a standard parameterization, is given by g(s) = exp(−λ(1 − s)^α), λ > 0, 0 < α ≤ 1;
see Christoph and Schreiber for this distribution. In their paper, expression (6) gives a series representation of the probability mass function of the DPS distribution, and expression (8) gives a recursive formula for computing each probability from the previous terms.
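To make the pgf-based viewpoint concrete, the following minimal sketch evaluates the DPS probability generating function numerically; the parameterization g(s) = exp(−λ(1 − s)^α) and the function names are assumptions of this illustration, not notation from the paper. For α = 1 the DPS pgf reduces to the Poisson pgf, which the sketch checks.

```python
import numpy as np

def dps_pgf(s, lam, alpha):
    """Probability generating function of the discrete positive stable (DPS)
    distribution in a standard parameterization (an assumption here):
    g(s) = exp(-lam * (1 - s)**alpha), with lam > 0 and 0 < alpha <= 1."""
    return np.exp(-lam * (1.0 - s) ** alpha)

def poisson_pgf(s, lam):
    """Poisson pgf exp(lam * (s - 1)); the alpha = 1 special case of the DPS pgf."""
    return np.exp(lam * (s - 1.0))
```

Even though the DPS probability mass function is intractable, its pgf is a one-line computation, which is what the moment conditions below exploit.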
The probability mass function is thus complicated, and for model validation a statistic for model testing is needed. These issues make maximum likelihood (ML) procedures difficult to implement.
GMM procedures based on the probability generating function appear to be a natural way to introduce alternatives to ML procedures, bypassing the explicit use of the probability mass function and focusing uniquely on the probability generating function. In this vein, the procedures proposed in this paper make use of GMM and generalized estimating equation theory, and they are less simulation-intensive than the inference techniques given in the paper by Luong et al.
We shall use general GMM methodology but adapt it to situations where the moment conditions are based on the probability generating function, so that estimation and model testing can be carried out in a unified way for discrete count models. The choice of moments for the developed GMM procedures makes use of estimating function theory, which allows us to use a number of points of the probability generating function that tends to infinity with the sample size. Furthermore, we also relate GMM estimation to the approach using generalized estimating equations (GEE) based on a set of elementary or basic unbiased estimating functions; unlike GEE procedures, GMM procedures also provide distribution-free chi-square statistics for model testing, while the theory of estimating functions remains useful as it provides insight into the choice of sample moments for GMM estimation. In other words, the proposed methods blend classical GMM procedures with inference techniques based on estimating equations, which in general allows flexibility, efficiency and model testing, yet remains relatively simple to implement and might be of interest to practitioners. Consequently, the new methods differ from the GMM procedures proposed in the literature on the following points:
1) GMM procedures as proposed by Doray et al. only make use of a finite number of points of the probability generating function; our methods aim at achieving higher efficiency yet remain simple to implement. This is done by linking to the theory of estimating functions, which can accommodate a number of points of the probability generating function that, instead of being fixed, goes to infinity as n → ∞.
2) The new GMM procedures remain simpler to implement than GMM procedures using a continuum of moment conditions as proposed by Carrasco and Florens, or methods adapting the GMM procedures with a continuum of moment conditions for characteristic functions, proposed by Carrasco and Kotchoni, to probability generating functions. Practitioners might find the sophisticated methods based on a continuum of moment conditions difficult to implement.
The paper is organized as follows. In Section 2, we review available results from general GMM theory; although these results are not new once the moment conditions are defined, they make the paper more self-contained, as they will subsequently be adapted to moment conditions extracted from the probability generating function when count models are considered. In Section 3, GMM estimation and related GEE estimation for count models are considered. The chi-square statistics are also given in Section 3.2.2. In Section 3.2.3, we consider GMM procedures based on optimum orthogonal estimating functions. In Section 4, we illustrate the implementation of the GMM methodology; preliminary results show that the methods are simple to implement and have the potential to be very efficient. The new methods display flexibility, as the sample moments can be changed for better efficiency if needed, and this can be done within the framework of the inference methods developed.
2. Generalized Method of Moments (GMM) Methodology
The inference techniques based on probability generating functions developed in this paper make use of results from Generalized Method of Moments (GMM) theory, which are well established once the moment conditions are specified; see Martin et al. (p 352-384) and also Hamilton. In this section, we briefly review GMM methodology for estimation and moment restriction testing, to make the paper easier to follow and to connect it to the problem of selecting moment conditions for discrete distributions based on probability generating functions.
The estimating equations of GMM methods will also be linked to the theory of estimating equations and generalized estimating equations (GEE) as developed by Godambe and Thompson , Morton , Liang and Zeger .
2.1. Generalized Estimating Equations (GEE) and GMM Estimation
For the data, we shall assume that we have n independent observations; these observations need not be identically distributed, but each follows a distribution which depends on the same vector of parameters θ, where θ belongs to a compact parameter space. The true vector of parameters is denoted by θ0.
For the time being, assume that we have identified n unbiased basic estimating functions, or elementary estimating functions, with the property that each has expectation zero under the model,
for i = 1, ..., n. (1)
The optimum estimating functions based on linear combinations of for estimating is given by
and is the variance of .
The vector of estimators based on the optimum estimating equations is given as the solution of the corresponding system of equations. This result is given by Godambe and Thompson (page 4) and Morton (pages 229-230).
In applications, we often restrict our attention to estimating functions with a common functional form, up to a constant factor. With this notation, which is commonly used in the literature, notice that the resulting random variables need not be identically distributed.
Also, since estimating equations are defined up to a constant which does not depend on the parameters, the related estimating functions can be re-expressed equivalently, and the vector of estimators based on the optimum equations is given as the solution of the corresponding system of equations using expression (3). Using vector notation, the vector of optimum estimating functions based on expression (3) can be expressed accordingly, and from this observation it is clear that a constant factor can be omitted when defining estimating functions or equations.
Now suppose that we have vector with the property
the optimum estimating functions for estimating the parameters based on linear combinations of the elements of this set are also called generalized optimum estimating functions (see Morton (p 229-230), and expression (6) of Liang and Zeger (page 15)) and are given by
and the estimators are given by the vector obtained by solving
where the first matrix is the covariance matrix of the estimating functions under the model; its inverse is also referred to as a working matrix in the literature of estimating equation theory, and the derivative matrix is a p by k matrix.
Clearly expression (4) is more general than expression (3) and is reduced to expression (3) when is a scalar instead of a vector.
In their studies of estimating functions, Godambe and Thompson emphasized the efficiency of estimating equations rather than the efficiency of the vector of optimum estimators obtained by solving them.
For applications, we often need the asymptotic covariance matrix of the estimators. For this purpose, we use the setup for the study of generalized estimating equations (GEE) considered by Liang and Zeger (p 15-16). Using a Taylor expansion and the results of their Theorem 2 (p 16), we can obtain the asymptotic covariance matrix of the estimators,
with convergence in probability denoted by and convergence in distribution denoted by .
Therefore, the asymptotic covariance of the estimators takes a simple form which can be estimated. A Fisher scoring algorithm, as given by expression (6) of Liang and Zeger (p 16), can be used to obtain the estimators numerically. The algorithm gives the (j + 1)-th iteration based on the previous j-th iteration as
Other numerical techniques can also be used. For example, we can consider solving the system of equations given by expression (4) and expression (5), or the estimators can be obtained by minimization; standard minimization techniques can then be used to obtain them numerically.
Now we turn our attention to GMM estimation methodology, and we observe that the set of estimating equations using expression (2) can be recovered within a GMM estimation setup. GMM estimation is based on the use of k moment conditions specified by a vector function
with its expectation with the property
The sample moments being the counterparts of
are defined as and define the vector of sample moments as
Now we need a positive definite symmetric matrix, or a matrix which is positive definite and symmetric with probability one, to define a quadratic form; this matrix will be defined subsequently, and it allows the objective function
to be formed for GMM estimation and the GMM estimators are given by the vector which minimizes .
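The quadratic-form objective just described can be sketched as follows; the moment conditions, starting value and weighting matrix below are illustrative choices of ours (a toy Poisson-type pair of conditions with identity weighting), not the paper's specification.

```python
import numpy as np
from scipy.optimize import minimize

def gmm_objective(theta, gbar, W):
    """Quadratic-form GMM objective Q(theta) = gbar(theta)' W gbar(theta),
    where gbar returns the vector of sample moments and W is a positive
    definite weighting matrix."""
    g = gbar(theta)
    return g @ W @ g

# Toy illustration: two moment conditions for a single parameter mu,
# E[X - mu] = 0 and E[X**2 - mu**2 - mu] = 0 (both hold under Poisson(mu)).
rng = np.random.default_rng(0)
x = rng.poisson(3.0, size=2000)

def gbar(theta):
    mu = theta[0]
    return np.array([np.mean(x - mu), np.mean(x**2 - mu**2 - mu)])

W = np.eye(2)  # identity weighting for a first step
res = minimize(gmm_objective, x0=[1.0], args=(gbar, W), method="Nelder-Mead")
```

A second step would typically replace the identity matrix by an estimate of the inverse covariance of the sample moments, as discussed next.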
We shall define the matrix first and then its estimate, from which we can obtain its inverse. In fact, the matrix can be viewed as the limit, as n → ∞, of the covariance matrix of the vector of sample moments, and its estimate can be defined using a preliminary consistent estimate of the parameters. The estimate is positive definite with probability one and clearly symmetric, so its inverse exists with probability one. Although the two expressions for the estimate are asymptotically equivalent, for numerical implementation in finite samples one of them has a better chance of being invertible.
Under suitable differentiability assumptions imposed on the vector function of moment conditions, the GMM estimator is consistent and has an asymptotic multivariate normal distribution, i.e.,
The asymptotic covariance of is simply and depends on so we also use the notation, and with and is a p by k matrix, its transpose is . Since , an estimate of is
Using , the asymptotic covariance matrix of can be estimated.
We also notice that we can recover optimum estimating equations estimators using the following GMM estimation set-up by letting , i.e., the number of sample moments is equal to the number of parameters to be estimated and
Minimizing the corresponding GMM objective function yields the vector of GMM estimators which are given by the following system of equations since is positive definite with probability one,
which is the same system of equations for obtaining the optimum estimating equations estimators as discussed. Using vector notations, the vector of optimum
estimating functions is simply and the related estimators are obtained by solving .
The estimating equations of the GMM procedures are based on the partial derivatives of the objective function and can be seen as equivalent to
Observe that this vector of estimating functions is also formed from linear combinations of elements of the basic set, similarly to the vector of optimum estimating functions, but it might not be optimum, because the matrices involved no longer depend on i. When the observations are not only independent but also identically distributed, the two methods are equivalent. We also notice that the weighting matrix used for GMM estimation plays a role similar to the working matrix for GEE estimation, but it is often simpler to obtain; the working matrix often requires more derivations.
Based on expression (7) and the observation just made concerning expression (8), we shall define the sample moments for GMM slightly differently from those used for generalized estimating equations (GEE), by letting the first p components of the vector be chosen depending on the model being studied.
We might also want to consider including additional components, depending on the model being studied, for the sake of efficiency. This leads us to define the vector of sample moments so that its first components form the vector of optimum estimating functions based on the elements of the basic set, with the remaining components to be defined based on the model under investigation, and to define the GMM objective function as
see Section 3 for more details on the choice of sample moments for GMM methods with models based on probability generating functions.
One advantage of the GMM approach over the generalized estimating equations (GEE) approach is that with GMM we have an objective function to be minimized, which leads to the construction of chi-square tests for moment restrictions; there is no equivalent test statistic with the generalized estimating equations approach. Furthermore, we shall see in Section 3 that when GMM is applied to discrete distributions with moment conditions extracted from the probability generating function, testing the moment restrictions can be viewed as testing goodness-of-fit of the count model being used. Consequently, estimation and model testing can be treated in a unified way with this approach.
As mentioned earlier, the GMM objective function evaluated at the estimator can be used to construct a test statistic which follows an asymptotic chi-square distribution for testing the null hypothesis specifying the validity of the vector of moment conditions; for this we need k > p, i.e., the number of sample moments must exceed the number of parameters to be estimated.
2.2. Testing the Validity of Moment Restrictions
We notice that, since the vector of GMM estimators is consistent, the following statistics can be constructed and have asymptotic chi-square distributions. These statistics are also known as Hansen's statistics after Hansen's seminal work (see Hansen), and they can be used for testing the validity of the moment restrictions.
For testing the simple hypothesis, in which the parameter value is fully specified, Hansen's statistic is given as
and its asymptotic distribution is chi-square with k degrees of freedom under the null hypothesis.
For testing the composite hypothesis, we first obtain the estimator by minimizing the objective function; Hansen's statistic is then given as
and its asymptotic distribution is chi-square with k − p degrees of freedom under the null hypothesis, assuming k > p.
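The over-identification test just described can be sketched in a few lines; the function name and the illustrative numbers (a minimized objective of 0.002 with n = 500, k = 4 and p = 1) are ours, chosen only to show the mechanics.

```python
import numpy as np
from scipy.stats import chi2

def hansen_j_test(n, q_min, k, p):
    """Hansen's over-identification test: J = n * Q(theta_hat) is compared
    with a chi-square distribution with k - p degrees of freedom, where k is
    the number of moment conditions and p the number of parameters (k > p)."""
    j_stat = n * q_min
    df = k - p
    p_value = chi2.sf(j_stat, df)  # survival function: P(chi2_df > J)
    return j_stat, df, p_value

# If the minimized objective is 0.002 with n = 500, k = 4 moments, p = 1:
j, df, pv = hansen_j_test(n=500, q_min=0.002, k=4, p=1)
```

A small p-value indicates that the moment restrictions, and hence the count model generating them, are rejected.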
These statistics will be used subsequently with moment conditions extracted from the model probability generating function in Section 3. We shall show in the next sections that these statistics are, in general, consistent test statistics for model testing when the discrete model is specified by its probability generating function. These statistics are also distribution-free. The distribution-free property is not enjoyed by the goodness-of-fit test statistics based on the empirical probability generating function given by Rueda and O'Reilly and by Marcheselli et al., as the null distributions of those statistics depend on the unknown parameters. In addition, the procedures proposed by Doray et al. only make use of k fixed points to generate moment conditions, regardless of the sample size n. The procedures proposed in this paper are different, as the number of points selected from the probability generating function goes to infinity as n → ∞.
3. GEE and GMM Methods with Moment Conditions from Probability Generating Function
In this section, we give attention to count models. We assume a random sample of n independent and identically distributed observations, each following the same distribution as X, where X follows a nonnegative integer discrete distribution with a probability mass function with no closed form but with a model probability generating function which has a closed form, is relatively simple to handle, and is well defined on its domain.
It is well known that, in general, a probability mass function is uniquely characterized by its corresponding probability generating function. Subsequently, two versions of the GMM objective function will be introduced based on estimating function theory. The first version uses points of the probability generating function to form moment conditions, as commonly done in the literature, and is given in Section 3.2.1 and Section 3.2.2; the second version is given in Section 3.2.3.
Optimum estimating functions can be used to obtain estimators, but we emphasize here the GMM approach, as distribution-free tests for moment restrictions with asymptotic chi-square distributions can also be obtained, and these can be interpreted as goodness-of-fit tests for the parametric family used. However, optimum estimating function theory is very useful for identifying sample moments that make GMM procedures efficient.
3.1. Generalized Estimating Equations (GEE)
First, we shall define basic unbiased estimating functions, i.e., functions with expectation zero under the model; then we shall form the optimum estimating functions based on linear combinations of these elementary estimating functions. Since the basic elementary estimating functions are unbiased, the optimum estimating functions are unbiased as well.
For each observation , we shall associate the value
As n → ∞, this set of points becomes dense in the interval considered, and we define the elementary estimating functions as
and clearly these estimating functions are unbiased.
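The unbiasedness of a pgf-based elementary estimating function can be verified numerically; the sketch below uses the Poisson pgf as a concrete stand-in for the model pgf, and the function names are illustrative choices of ours. Under the model, E[s^X] equals the pgf at s, so the estimating function has expectation zero.

```python
import numpy as np

def elementary_h(x, s, pgf, theta):
    """Elementary unbiased estimating function built from the pgf:
    h(x; s, theta) = s**x - g(s, theta).  Under the model,
    E_theta[s**X] = g(s, theta), so E_theta[h] = 0."""
    return s ** x - pgf(s, theta)

# Check unbiasedness exactly for a Poisson model on a truncated support.
theta, s = 2.0, 0.5
pgf = lambda s, th: np.exp(th * (s - 1.0))  # Poisson pgf
ks = np.arange(0, 60)
pmf = np.empty_like(ks, dtype=float)
pmf[0] = np.exp(-theta)
for k in ks[1:]:
    pmf[k] = pmf[k - 1] * theta / k          # Poisson(theta) probabilities

mean_h = float(np.sum(pmf * elementary_h(ks, s, pgf, theta)))
```

The truncation at 60 leaves a negligible Poisson tail, so the computed expectation is zero up to rounding.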
Since the estimating functions attached to different observations are independent, we have the property
for i ≠ j. (12)
The elements of the set are said to be mutually orthogonal if they have the property given by expression (12); see Godambe and Thompson (page 139). Therefore, using the optimality criterion of Godambe and Thompson (page 139), the optimum estimating functions for estimating the parameters based on linear combinations of the basic estimating functions, which are orthogonal, are given by
and clearly the optimum estimating functions are also unbiased.
We define the vector
Since and letting be the variance of , so
Therefore, equivalently the vector of optimum estimating function is given by
For GEE estimation as given by expression (4) and expression (5), we need to specify the vector . Let us partition into two components with
, . (17)
We select two points, for example by fixing two values in the interval considered, and we can therefore form two sets of elementary basic unbiased estimating functions using these two points, which are given by
and . (18)
These two sets of elementary unbiased estimating functions are selected because, as we shall see, when they are used to form moment conditions for the GMM objective function, they allow the construction of consistent chi-square tests.
Furthermore, from the probability generating function we can derive the expectation of X, and another set of elementary unbiased estimating functions can be created from it. Since the sample mean, if incorporated into the estimating equations, in general might help improve the efficiency of the estimators, this set of estimating functions is also considered and used in forming the vector of generalized estimating functions. Making use of these three sets of elementary unbiased estimating functions leads us to define
provided the expectation of X exists for the model; note that it can be obtained from the derivative of the probability generating function evaluated at 1. If the expectation does not exist, the last component of the vector is replaced by an estimating function based on a point close to 1; see Section 4 for an illustration and for finding the working matrix. For the estimators to have a multivariate asymptotic normal distribution, we also need the existence of the common variance of X under the model.
Having specified the vector
GEE estimation can then be performed using the results and procedures of Section 2.1; the vector of GEE estimators is obtained by solving the system of equations given by expression (5). Observe that, with the notation introduced, the estimating functions also depend on i through the selected points, so they are not identically distributed even though the observations themselves are identically distributed random variables. Therefore, GEE estimators are no longer asymptotically equivalent to GMM estimators using the same vectors; with the notation used, GEE and GMM estimators are asymptotically equivalent only if the estimating function vectors have a common multivariate distribution.
3.2. GMM Methodology
Before defining the sample moment vector for the GMM methods, let us for the time being turn our attention to how to obtain a preliminary consistent estimate in general. Such a preliminary estimate is needed for the numerical algorithms used to implement GMM procedures and to define the matrix used in the GMM objective function. The nonlinear least-squares (NLS) estimator can be used as a preliminary consistent estimate, obtained by minimizing
Note that the estimating functions of the nonlinear least-squares methods are
and they bear some resemblance to the optimum ones, as they are also based on linear combinations of the basic estimating functions, but they are not optimum.
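The preliminary NLS step can be sketched as follows for a Poisson model; the grid of pgf evaluation points, the sample size and the search bounds are assumptions of this illustration, not choices prescribed by the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
x = rng.poisson(2.0, size=3000)
s_grid = np.linspace(0.05, 0.95, 19)  # pgf evaluation points (an assumed grid)

# Empirical pgf: mean of s**X at each grid point.
emp_pgf = np.array([np.mean(s ** x) for s in s_grid])

def nls_objective(theta):
    """Sum of squared distances between the empirical and model (Poisson) pgf."""
    model = np.exp(theta * (s_grid - 1.0))
    return np.sum((emp_pgf - model) ** 2)

res = minimize_scalar(nls_objective, bounds=(0.1, 10.0), method="bounded")
theta_nls = res.x
```

This estimate is consistent but not fully efficient; it serves only as the starting value for the GMM procedures.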
3.2.1. GMM Objective Function
Now we turn our attention to defining the vector
We have seen that GMM estimators are no longer equivalent to GEE estimators if we define the sample moments as for the GEE methods, so some modifications appear to be necessary. To ensure that the GMM estimators have efficiencies comparable to those obtained using optimum estimating functions, we shall let
with the corresponding sample moment given by the vector of optimum estimating functions based on
and keeping the remaining components as for GEE estimation, so that the corresponding sample moment vector for GMM estimation is
with being just defined and
if the expectation exists; otherwise a point chosen close to 1 but less than 1 is used. The GMM objective function can then be constructed and is given by
3.2.2. Model Testing Using GMM Objective Function
Now we turn our attention to the problem of testing a model specified by its probability generating function. Let the random sample be drawn from a nonnegative integer discrete distribution with a probability generating function, and suppose we want to test the simple null hypothesis in which the parameter value is specified, i.e.,
and clearly, if the null hypothesis is true, the moment conditions hold.
The following chi-square statistics can then be used. For practical applications, the chi-square tests are in general consistent against the common departures of interest: if the null hypothesis does not hold, the test will allow us to reject it as n → ∞. Indeed, we have this property via the chi-square statistic, because under such a departure the chi-square statistic converges to infinity.
For the statistic not to diverge, the two components of the sample moment vector based on the two selected points must simultaneously converge to 0 in probability. (20)
We shall show that, in general, for the probability generating functions encountered in applications, this cannot happen: convergence of the first component implies that the model probability generating function agrees with the true one at the first point, and convergence of the second component implies agreement at the second point.
Observe that, in general, probability generating functions used in applications are convex on the unit interval; see Resnick (p 22-23). Furthermore, for the functions encountered in applications, the convexity is in general strict on part of the interval. This means that in general there is at most one point where the two probability generating functions cross, since both are strictly convex functions and both equal 1 at s = 1. Therefore, we cannot have the simultaneous convergence given by expression (20), and the chi-square test is in general consistent, as it can detect common departures from the null hypothesis as n → ∞.
For testing the composite hypothesis, we first need to estimate the parameters by minimizing the objective function, and subsequently use the estimate to compute the corresponding chi-square statistic.
These chi-square statistics are distribution-free, as the chi-square distributions involved contain no unknown parameters. These goodness-of-fit tests are simpler to implement than those based on matching the sample probability generating function with its model counterpart using a continuum of moment conditions, as given by Theorem 10 of Carrasco and Florens (p 812-813). Note that maximum likelihood estimators, if used concomitantly with the common classical Pearson statistics, often lead to complicated distributions, and the statistics are no longer distribution-free; see Chernoff and Lehmann, and Luong and Thompson. Moreover, these classical Pearson test statistics are not consistent in general.
3.2.3. Further Extensions: The Use of Orthogonal Estimating Functions
Notice that, besides the set of basic estimating functions defined earlier, we also have another set of basic estimating functions.
Consequently, if in addition to the first set of estimating functions we also want to incorporate the second set of basic estimating functions, then, instead of taking the first p components of the sample moment vector to be the vector of optimum estimating functions based on the first set alone, we shall use a more general vector of optimum orthogonal estimating functions which can incorporate the larger combined set of basic estimating functions, as described below.
Observe that the second set of estimating functions is mutually orthogonal on its own; however, when the two sets are combined, the basic estimating functions of the combined set are not mutually orthogonal, because the cross expectation is not equal to 0. Using the Gram-Schmidt orthogonalization procedure, we can replace the second set by
which can also be represented as
and is simply the variance of since the basic estimating functions are unbiased,
Now it is easy to see that the resulting set
is a set of mutually orthogonal basic or elementary estimating functions; see Definition 2.2 and Theorem 2.1 of Godambe and Thompson (p 139-140).
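The Gram-Schmidt step above can be sketched numerically with exact expectations over a truncated support; the Poisson model and all names below are illustrative stand-ins (the paper's basic estimating functions are indexed by observation, which this toy check suppresses).

```python
import numpy as np

# Exact expectations over a truncated Poisson(theta) support (illustrative).
support = np.arange(0, 40)
theta = 1.5
pmf = np.empty_like(support, dtype=float)
pmf[0] = np.exp(-theta)
for k in support[1:]:
    pmf[k] = pmf[k - 1] * theta / k

s = 0.6
h1 = s ** support - np.exp(theta * (s - 1.0))  # pgf-based estimating function
h2 = support - theta                           # mean-based estimating function

def E(f):
    """Exact expectation under the (truncated) model."""
    return float(np.sum(pmf * f))

# Gram-Schmidt step: remove from h2 its projection on h1, so the resulting
# pair of estimating functions is mutually orthogonal under the model.
h2_star = h2 - (E(h1 * h2) / E(h1 * h1)) * h1
```

The orthogonalized function h2_star remains unbiased, since it is a linear combination of unbiased estimating functions.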
Li and Turtle (p 177) also use a similar orthogonalization procedure and Theorem 2.1 to create optimum estimating functions for ARCH models.
The first p components of the vector of sample moment functions are simply the optimum estimating functions based on linear combinations of the basic estimating functions of the combined set; again using Theorem 2.1 of Godambe and Thompson, the vector of optimum estimating functions is given by
and is as defined by expression (15).
Now we shall display the expression explicitly. First, note that
and , so that
with the variance
is as given by expression (21) and the covariance
The expression for can be displayed fully and it is given by
With this vector of optimum estimating functions, the sample moment function for forming the corresponding GMM objective function can be defined as given below,
keeping the remaining component as specified by expression (19). The choice based on optimum orthogonal estimating functions constructed from the two sets of basic estimating functions is to be preferred for improving estimation efficiency for some models, if the vector defined by expression (19) in Section 3.1 does not give satisfactory efficiency for GMM estimation. Model testing procedures using this GMM objective function are identical to the procedures for the GMM objective function used earlier.
We might also want to enlarge the sample moment vector by adding more components, but more components also tend to create numerical difficulties, because the weighting matrix will be nearly singular and the numerical inversion of such a matrix is often problematic.
Finally, we note that although the GMM methods developed are primarily for discrete distributions, they can also accommodate nonnegative continuous distributions defined using Laplace transforms, as discussed in Luong, since Laplace transforms are related to probability generating functions.
4. An Example and Numerical Illustrations
We shall use an example to illustrate the procedures. Consider a random sample of observations drawn from the Poisson distribution with probability generating function g(s, θ) = exp(θ(s − 1)). For this model the parameter θ is a scalar. We would like to use GMM methods here because, although the maximum likelihood estimator of θ is available and given by the sample mean, using it does not lead to tractable distribution-free goodness-of-fit test statistics with Pearson-type statistics, as mentioned earlier.
For this model, the coefficient
We consider the case with the sample moment vector given by
will have four components with the components given respectively by
We can use the sample mean, which is simple to obtain here, as a preliminary consistent estimate. Now we can let
The elements of as given by expression (24) can be computed using only the probability generating function of the model since we have
and the variance of X equals θ for the Poisson model. The matrix
as given by expression (24) tends to be invertible with less numerical difficulties.
The GMM objective function is given by
minimizing it allows us to obtain the corresponding GMM estimators . In order to obtain an estimated asymptotic variance for , we can define
evaluated at .
The asymptotic variance of the estimator can be estimated accordingly, and the chi-square statistic for testing the composite hypothesis is given by n times the minimized objective function.
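An end-to-end sketch of this Poisson example follows: simulate a sample, form sample moments from two pgf points plus the mean condition, and minimize a GMM objective. The two points, the sample size and the identity weighting are simplifying assumptions of this illustration (the paper uses an estimated weighting matrix).

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
theta0 = 2.0
x = rng.poisson(theta0, size=1000)
s1, s2 = 0.3, 0.7          # two pgf points (illustrative choices)

def sample_moments(theta):
    """Sample moments from two pgf points plus the mean condition."""
    pgf = lambda s: np.exp(theta * (s - 1.0))
    return np.array([
        np.mean(s1 ** x) - pgf(s1),
        np.mean(s2 ** x) - pgf(s2),
        np.mean(x) - theta,
    ])

def q(theta):
    g = sample_moments(theta)
    return g @ g           # identity weighting, a simplification of the paper's setup

res = minimize_scalar(q, bounds=(0.1, 10.0), method="bounded")
theta_gmm = res.x
```

With the identity weighting this is a first-step estimator; a second step would re-minimize with the estimated inverse covariance of the moments as the weighting matrix.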
To test the feasibility of GMM methods with this example, limited simulation studies were conducted. The GMM methods can be implemented without numerical difficulties for the parameter values considered.
For larger parameter values, if expression (23) is used for the weighting matrix, the matrix tends to be nearly singular, and its elements need to be computed with higher accuracy in order to invert it. We found that software like Maple or Mathematica can carry out these computations with higher accuracy than R.
Often, by using a spectral decomposition, we can obtain the inverse numerically, whereas directly asking for the inverse in R might just return the message that the matrix is nearly singular, without returning the inverse. This can be seen from the spectral representation of the matrix,
with an orthonormal matrix and a diagonal matrix whose diagonal elements are the eigenvalues; these eigenvalues need to be computed with high accuracy and must be numerically positive. By keeping more digits when computing the eigenvalues, the inverse can in general be obtained and computed as
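The inversion through the spectral decomposition described above can be sketched as follows; the function name and tolerance are ours, and the small matrix is purely illustrative.

```python
import numpy as np

def spectral_inverse(W, tol=1e-12):
    """Invert a symmetric positive definite matrix through its spectral
    decomposition W = P D P', inverting the eigenvalues directly; this can
    behave better numerically than a generic inverse when W is nearly singular."""
    eigvals, P = np.linalg.eigh(W)       # eigh is specialized to symmetric matrices
    if np.any(eigvals <= tol):
        raise np.linalg.LinAlgError("matrix is numerically singular")
    return (P / eigvals) @ P.T           # equals P @ diag(1/eigvals) @ P.T

W = np.array([[4.0, 1.0],
              [1.0, 3.0]])
W_inv = spectral_inverse(W)
```

Computing the eigenvalues to higher precision, as the text suggests, then amounts to making the diagonal factor accurate before it is inverted.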
If expression (24) is used instead of expression (23) for the weighting matrix, with software that carries more numerical accuracy, then fewer numerical problems are encountered in inverting the matrix. For models where the matrix is difficult to obtain, an empirical likelihood (EL) approach based on the same sample moments can be used; it has the same efficiency as the GMM methods, but the numerical computations for implementing EL methods are more involved. See Luong on the use of a penalty function for obtaining EL estimators.
We simulate samples from the Poisson distribution and obtain, respectively, the GMM estimate, the NLS estimate and the ML estimate. The NLS estimate is the nonlinear least-squares estimate mentioned at the beginning of Section 3.2.
To compare the relative efficiencies of these methods, we estimate the ratios MSE(GMM)/MSE(ML) and MSE(NLS)/MSE(ML), where MSE(GMM), MSE(NLS) and MSE(ML) are respectively the estimates of the mean square error of the GMM, NLS and ML estimators based on the simulated samples. The efficiency of the GMM estimator is practically identical to that of the ML estimator, but the efficiency of the NLS estimator is much lower and worsens, in comparison with the ML estimator, as the parameter increases. The results are displayed in Table A1.
To examine whether the chi-square test has power to detect departures from the model, we use the negative binomial distribution, with the same mean as the Poisson model but a larger variance, as the alternative, and simulate samples accordingly; the model fitted is the Poisson model. We estimate the power of the tests at these alternatives, and the results are displayed in Table A2. The level used for the chi-square tests is 0.05, with the critical point being the 0.95th percentile of a chi-square distribution with 3 degrees of freedom. The results obtained are encouraging and show that the chi-square tests have considerable power to detect departures. The estimated power decreases as the alternative approaches the Poisson model, as expected, since in the limit the negative binomial distribution tends to the Poisson distribution. Larger-scale simulation studies with more parametric families are needed to confirm the efficiencies of the proposed methods.
At this point, we can conclude that the methods appear to be relatively simple to implement, have the potential to be efficient for some count models, and have the advantage of using only the probability generating function instead of the probability mass function, allowing inferences to be made for a much larger class of parametric families without relying on extensive simulation. The proposed GMM methodology combines traditional GMM methodology with generalized estimating function methodology, both of which are well-known alternatives to ML methodology. The lack of statistics for model testing in the generalized estimating function methodology is overcome by the proposed procedures.
The helpful and constructive comments of a referee, which led to an improvement of the presentation of the paper, and the support from the editorial staff of Open Journal of Statistics in processing the paper are gratefully acknowledged.
Table A1. Estimate relative efficiency comparisons between GMM, NLS and ML estimators.
M = 100 simulated samples are used and each with sample size n = 100.
Table A2. Estimate power of the chi-square tests using the Poisson model with parameter θ.
M = 100 simulated samples of size n = 100 for each sample are drawn from a negative binomial distribution with mean = 5 and variance = .
 Luong, A., Bilodeau, C. and Blier-Wong, C. (2018) Simulated Minimum Hellinger Distance Inference Methods for Count Data. Open Journal of Statistics, 8, 187-219.
 Doray, L.G., Jiang, S.M. and Luong, A. (2009) Some Simple Method of Estimation for the Parameters of the Discrete Stable Distribution with the Probability Generating Function. Communications in Statistics—Simulation and Computation, 38, 2004-2017.
 Rueda, R. and O’Reilly, F. (1999) Tests of Fit for Discrete Distributions Based on the Probability Generating Function. Communication in Statistics—Simulation and Computation, 28, 259-274.
 Chernoff, H. and Lehmann, E.L. (1954) The Use of Maximum Likelihood Estimates in Chi-Square Tests for Goodness of Fit. Annals of Mathematical Statistics, 25, 579-586.
 Li, D.X. and Turtle, H.J. (2000) Semi-Parametric ARCH Models: An Estimating Function Approach. Journal of Business and Economic Statistics, 18, 174-186.
 Luong, A. (2017) Maximum Entropy Empirical Likelihood Methods Based on Laplace Transforms for Nonnegative Continuous Distributions with Actuarial Applications. Open Journal of Statistics, 7, 459-482.