The goal of a logistic regression analysis is to find the best-fitting model to describe the relationship between a dichotomous outcome and covariates; the logistic regression model is a member of the class of generalized linear models. For the assumptions and further details of the behaviour of the logistic model see , and for further applications see . Goodness-of-fit is very important in deciding whether the more succinct model is adequate. After fitting the logistic regression model, the next step is to examine how well the proposed model fits the observed data and how effective the model is; this is called its goodness-of-fit. Goodness-of-fit tests for logistic regression can be split into three types: 1) those based on an examination of residuals; 2) those based on a test which groups the observations; 3) those which do not group the observations. The methods in 1) are more general and subjective assessments of a model and are not considered in this work. This is not to undervalue them: they are often the most valuable approach to model assessment. However, the observed values in Bernoulli regression are just 0s and 1s, which makes graphical approaches less easy to handle. The focus of this work is test statistics. In the next section, tests using grouping are considered, with those that do not need to group the data discussed in Section 3. The behaviour of the asymptotic distribution of goodness-of-fit tests is investigated in Section 4, with comparisons between some goodness-of-fit tests, evaluated by simulated data with two different sample sizes. The simulation in this work was designed according to the simulation made by , which compared some goodness-of-fit tests in logistic regression models with sparse data. The results of that simulation showed that some goodness-of-fit tests have reasonable power compared with other tests.
However, Kuss did not give information about the asymptotic distribution of these statistics. This paper aims to show the behaviour of the asymptotic distribution of goodness-of-fit tests for the logistic regression model. Finally, conclusions and further discussion are given in the last section.
2. Goodness-of-Fit Tests with Grouping
 proposed and developed approaches involving grouping based on the values of the estimated probabilities obtained from the fitted logistic model. Two grouping methods were proposed. The first approach is based on grouping the data according to percentiles of the estimated probabilities, and the second on grouping the data according to fixed cutoff values of the estimated probabilities. Tests with grouping based on estimated probabilities were proposed and developed by .  developed a score test statistic which essentially compares two fitted models.
Hosmer and Lemeshow \hat{C} Test

The calculation of this test depends upon grouping the estimated probabilities into g groups. The first group contains the observations with the smallest estimated probabilities, the second group contains the observations with the next smallest estimated probabilities, and the last group contains the observations with the largest estimated probabilities; here n is the sample size and g the total number of groups, so each group contains approximately n/g observations. Before defining a formula to calculate \hat{C} we introduce some notation. The test statistic is obtained by calculating the Pearson chi-square statistic from the table with two rows and g columns of observed and expected frequencies. In the row with y = 1, summing all the estimated probabilities in a group gives the estimated expected value. In the row with y = 0, the estimated expected value is obtained by summing one minus the estimated probabilities over all subjects in the group. We denote the observed numbers of subjects with the event present and absent, respectively, in group k by

o_k  and  n'_k − o_k,

where n'_k is the number of observations in group k. The expected numbers of subjects with the event present and absent, respectively, are denoted by

n'_k \bar{\pi}_k  and  n'_k (1 − \bar{\pi}_k).

Then \hat{C} is simply obtained by calculating the Pearson statistic for the observed and expected frequencies from the table as

\hat{C} = \sum_{k=1}^{g} [ (o_k − n'_k \bar{\pi}_k)^2 / (n'_k \bar{\pi}_k) + ((n'_k − o_k) − n'_k(1 − \bar{\pi}_k))^2 / (n'_k(1 − \bar{\pi}_k)) ],

from which, since the two numerators are equal, it follows that

\hat{C} = \sum_{k=1}^{g} (o_k − n'_k \bar{\pi}_k)^2 [ 1/(n'_k \bar{\pi}_k) + 1/(n'_k(1 − \bar{\pi}_k)) ],

and finally we get

\hat{C} = \sum_{k=1}^{g} (o_k − n'_k \bar{\pi}_k)^2 / ( n'_k \bar{\pi}_k (1 − \bar{\pi}_k) ),

where n'_k is the total number of values in the kth group and o_k is the number of responses over the c_k covariate patterns in the kth group, defined as

o_k = \sum_{j=1}^{c_k} y_j,

and \bar{\pi}_k is the average of the estimated probabilities, defined as

\bar{\pi}_k = \sum_{j=1}^{c_k} m_j \hat{\pi}_j / n'_k.

Here, the number of observations within covariate pattern j is denoted by m_j. An extensive set of simulations showed that when the individual binomial denominators m_j are small and the fitted logistic model is the correct model, the distribution of \hat{C} is well approximated by the chi-square distribution with g − 2 degrees of freedom .
Hosmer and Lemeshow \hat{H} Test

The second grouping strategy proposed by Hosmer and Lemeshow, denoted by \hat{H}, depends upon grouping the estimated probabilities into groups based on fixed cutpoints, so that each group contains all subjects with fitted probabilities located in a specific interval. For example, if the cutpoint of the first group is 0.1, then this group contains all subjects with estimated probabilities in that interval; the second group contains all subjects with estimated probabilities located between the first and second cutpoints, and the last group has the interval ending at 1.0.

The calculation of \hat{H} uses exactly the same formula used to calculate \hat{C}: the only difference between the two approaches is in the construction of the groups. The distribution of \hat{H} is approximated by the chi-square distribution with g − 2 degrees of freedom.
Although the Hosmer and Lemeshow tests perform well, they require grouping, and the choice of g is problematic:
・ g is arbitrary, but almost everywhere in the literature and in software a value of 10, or very similar, is chosen.
・ Smaller values of g might be chosen for smaller n.
・ Sparse data cause a problem for \hat{H} and lead to uneven group widths for \hat{C}.
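As a concrete illustration, the percentile-based grouping and the statistic computation can be sketched in plain Python (our own minimal sketch, not the authors' code; the function name and data are invented):

```python
# Sketch of the percentile-of-risk version of the Hosmer-Lemeshow statistic.
# Assumes every group has an average estimated probability strictly
# between 0 and 1.

def hosmer_lemeshow_C(y, pi_hat, g=10):
    """Compute C-hat by sorting subjects on pi_hat and splitting into g groups."""
    n = len(y)
    order = sorted(range(n), key=lambda i: pi_hat[i])  # sort by fitted probability
    C = 0.0
    for k in range(g):
        idx = order[k * n // g:(k + 1) * n // g]   # roughly n/g subjects per group
        nk = len(idx)
        ok = sum(y[i] for i in idx)                # observed events in group k
        pk = sum(pi_hat[i] for i in idx) / nk      # average estimated probability
        C += (ok - nk * pk) ** 2 / (nk * pk * (1 - pk))
    return C
```

With g = 10 the resulting value would be referred to a chi-square distribution with 8 degrees of freedom.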
3. Goodness-of-Fit Tests without Grouping
Deviance and Pearson Chi-Square Tests
Two of the most commonly used goodness-of-fit measures are Pearson's chi-square X^2 and the deviance D, but the behaviour of these tests is unstable with Bernoulli data; see . The general idea of the deviance is to compare two models: the first is a full model with p parameters and the second a model with q parameters, where q < p. The deviance can be written as

D = −2 \log( L_q / L_p ) = −2 ( \ell_q − \ell_p ),

where L_p, L_q are the likelihoods for the full and small models and \ell_p, \ell_q denote the corresponding log-likelihoods. Asymptotically this is chi-square on p − q df. The residual deviance is the case when the large model is saturated and has n parameters. In the case of the logistic regression model,  introduced a specific form when m_i = 1; the residual deviance can then be found as

D = −2 \sum_{i=1}^{n} [ \hat{\pi}_i \log( \hat{\pi}_i / (1 − \hat{\pi}_i) ) + \log( 1 − \hat{\pi}_i ) ].

In this case the deviance is invalid as a goodness-of-fit test, because it is a function of \hat{\pi}_i only, and so does not compare the observed values with the fitted values.
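This collapse of the deviance can be checked numerically in the simplest setting, an intercept-only fit, where the MLE sets every fitted probability equal to the sample mean of y (a minimal sketch under that assumption; the data are invented):

```python
import math

# For m_i = 1 the residual deviance -2*sum[y*log(pi) + (1-y)*log(1-pi)]
# collapses to a function of the fitted probabilities alone:
# -2*sum[pi*log(pi/(1-pi)) + log(1-pi)].
# Check for an intercept-only fit, where pi_hat_i = mean(y) for every i.
y = [1, 1, 1, 0, 0]
p = sum(y) / len(y)                          # MLE: pi_hat = 0.6 for all subjects
dev = -2 * sum(yi * math.log(p) + (1 - yi) * math.log(1 - p) for yi in y)
ident = -2 * len(y) * (p * math.log(p / (1 - p)) + math.log(1 - p))
# dev and ident agree: the deviance carries no information about y
# beyond what is already contained in the fitted probabilities.
```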
Also,  discussed the Pearson chi-square goodness-of-fit statistic when m_i = 1; it can be written

X^2 = \sum_{i=1}^{n} ( y_i − \hat{\pi}_i )^2 / ( \hat{\pi}_i (1 − \hat{\pi}_i) ),

which is equal to the sample size: this is not a useful goodness-of-fit test.
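The same intercept-only setting gives a quick numerical check of this degeneracy (a sketch only; the data and names are invented for illustration):

```python
# Pearson statistic for Bernoulli data: X^2 = sum (y - pi)^2 / (pi*(1 - pi)).
# With an intercept-only logistic fit the MLE is pi_hat = mean(y), and X^2
# works out to exactly n, illustrating why X^2 is uninformative here.

def pearson_X2(y, pi_hat):
    return sum((yi - p) ** 2 / (p * (1 - p)) for yi, p in zip(y, pi_hat))

y = [1, 1, 0, 1, 0, 0, 0, 1, 1, 1]
p = sum(y) / len(y)                     # MLE under the intercept-only model
X2 = pearson_X2(y, [p] * len(y))        # equals len(y) = 10 up to rounding
```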
Residual Sum of Squares Test
 proposed a method which uses the unweighted residual sum of squares as a goodness-of-fit test to assess model adequacy. The idea of this approach is to keep all the individual values of m_i but to give less weight in cases where the m_i are small. The unweighted residual sum of squares statistic considers only the numerator of the Pearson chi-square statistic, with the summation again over the individual observations; the statistic can be written

RSS = \sum_{i=1}^{n} ( y_i − \hat{\pi}_i )^2.

Of course, the relative weighting for varying m_i is not relevant in our case, where m_i = 1.  discussed how to compute the moments and asymptotic distribution of the RSS statistic. They give useful expressions for the mean and variance which are easier to compute than the expressions given by . The proposed asymptotic mean and variance of RSS are, respectively, approximately \sum_i \hat{\pi}_i (1 − \hat{\pi}_i) and var(RSS) = d' V d, where V is the relevant dispersion matrix and d is a vector with elements d_i = 1 − 2\hat{\pi}_i. The standardized statistic is used to assess significance by referring the following to the standard normal:

z = ( RSS − E(RSS) ) / \sqrt{ var(RSS) }.
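The statistic itself is trivial to compute; the work is in the variance estimate, which needs the design matrix and is not reproduced here. A minimal sketch of the simple pieces (the function names are ours):

```python
# Unweighted residual sum of squares and the simple leading term of its mean.
# The full variance expression requires the design matrix and the fitted
# covariance, which are omitted in this sketch.

def rss_statistic(y, pi_hat):
    return sum((yi - p) ** 2 for yi, p in zip(y, pi_hat))

def rss_naive_mean(pi_hat):
    # sum of pi*(1 - pi): the leading term of E(RSS) under the fitted model
    return sum(p * (1 - p) for p in pi_hat)
```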
Several R^2-type statistics have been used for goodness-of-fit in logistic regression, such as that proposed by :

R^2 = 1 − \hat{\ell}_M / \hat{\ell}_0,

where \hat{\ell}_M represents the log-likelihood evaluated at the ML parameter estimates and \hat{\ell}_0 represents the log-likelihood of the model containing only an intercept. Another version of this statistic is due to .
Information Matrix Tests: IM and IMDIAG
The Information Matrix (IM) test is a test for general mis-specification, proposed by . The two well-known expressions for the information matrix coincide only if the correct model has been specified, and the IM test takes advantage of this fact. The IM test avoids the grouping necessary for tests like the Hosmer-Lemeshow test. Many researchers  pointed out the behaviour of the asymptotic distribution of the IM statistic and its dispersion matrix.  discussed the information matrix test and showed that it is useful with binary data models.  claimed that the IM test has reasonable power compared with other tests, but gave no information about the behaviour of its asymptotic distribution. The idea of the information matrix test is to compare the Hessian-based form −\sum_i \partial^2 \ell_i / \partial\beta \partial\beta' and the outer-product form \sum_i (\partial \ell_i/\partial\beta)(\partial \ell_i/\partial\beta)' of the information matrix, as these differ when the model is mis-specified but not when the model is correct.
Consider binary regression, where the outcome for individual i, i = 1, ・・・, n, is a random variable Y_i \sim Bernoulli(\pi_i). Also \logit(\pi_i) = x_i'\beta, where x_i is a p-dimensional vector of covariates and \beta is a p-dimensional vector of parameters. It will be convenient to write \ell_i = y_i \log \pi_i + (1 − y_i) \log(1 − \pi_i) for the contribution to the log-likelihood from unit i.

The p-dimensional likelihood equations can be written:

\sum_{i=1}^{n} ( y_i − \pi_i ) x_i = 0.   (1)

We can also derive the matrix of second derivatives as:

−\sum_{i=1}^{n} \partial^2 \ell_i / \partial\beta \partial\beta' = \sum_{i=1}^{n} \pi_i (1 − \pi_i) x_i x_i'.   (2)

The idea behind the information matrix test is that if the model is correctly specified then the quantity:

IM = \sum_{i=1}^{n} [ ( y_i − \pi_i )^2 − \pi_i (1 − \pi_i) ] x_i x_i'

has zero mean, since E( y_i − \pi_i )^2 = \pi_i (1 − \pi_i). By comparing (1) and (2) we can compute this quantity, for a general value of \beta, as the sum of:

A_i = [ ( y_i − \pi_i )^2 − \pi_i (1 − \pi_i) ] x_i x_i'.   (3)
We can test the null hypothesis that IM has zero mean by computing the variance of IM and then constructing a standard statistic. The first step is to compute the variance of vech(A_i), where we write A_i for essentially the right-hand side of (3), and where we change the symmetric matrix into a vector in order to be able to use standard methods. As A_i is symmetric we do not wish to duplicate entries, so vech(A_i) is the p(p + 1)/2-dimensional vector:

vech(A_i) = ( a_{11}, a_{21}, a_{22}, a_{31}, a_{32}, a_{33}, …, a_{pp} )',

where a_{jk} is the (j, k) element of A_i. If we write:

A = \sum_{i=1}^{n} vech(A_i),

then, because the different terms are independent, we obtain:

\Sigma_A = var(A) = \sum_{i=1}^{n} var( vech(A_i) ),

which is a P × P dimensional matrix, where P = p(p + 1)/2.
We should also note that if B is defined as essentially the derivative of the log-likelihood, i.e.

B = \sum_{i=1}^{n} ( y_i − \pi_i ) x_i,

then the variance of B is the p × p matrix \Sigma_B:

\Sigma_B = \sum_{i=1}^{n} \pi_i (1 − \pi_i) x_i x_i'.

Before computing the covariance of A and B, note that for independently and identically distributed random variables, and under the null hypothesis, the second term of the covariance expression is zero; the covariance of A and B in this case is the P × p matrix \Sigma_{AB}.

Central limit arguments suggest that asymptotically (A', B')' is a (P + p)-dimensional normal variable. However, the IM test requires A to be evaluated at the MLE, \hat{\beta} say, and at this value we know that B = 0. Consequently the variance of A(\hat{\beta}) is the variance of A conditional on B = 0, which is \Sigma_A − \Sigma_{AB} \Sigma_B^{-1} \Sigma_{BA}.
Assuming a logistic regression we have \hat{\pi}_i = expit( x_i' \hat{\beta} ), and so we can evaluate the dispersion matrices \Sigma_A, \Sigma_{AB} and \Sigma_B at the MLEs. If we write V = \Sigma_A − \Sigma_{AB} \Sigma_B^{-1} \Sigma_{BA}, then one version of the IM test is found by referring A' V^{−} A, using a generalized inverse, to a chi-square variable with degrees of freedom equal to the rank of V.

The ideas of the IMDIAG test and the IM test are the same; the only difference is that for the former the elements of the vectorized A_i are just the diagonal elements of A_i, so diag(A_i) is the p-dimensional vector:

diag(A_i) = ( a_{11}, a_{22}, …, a_{pp} )'.
To explain the difference in the size of the vector in the two cases of the IM test and the IMDIAG test, let us consider a simple example. Suppose we have a symmetric 3 × 3 matrix with elements a_{jk}:

A = [ a_{11} a_{12} a_{13} ; a_{21} a_{22} a_{23} ; a_{31} a_{32} a_{33} ],

where a_{jk} = a_{kj}. Then, in the case of the IM test, the dimension of the vector is 3(3 + 1)/2 = 6 and the elements are:

( a_{11}, a_{21}, a_{22}, a_{31}, a_{32}, a_{33} )',

whereas in the case of the IMDIAG test, diag(A) is the 3-dimensional vector:

( a_{11}, a_{22}, a_{33} )'.
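In code, the two vectorizations look like this (a small sketch; vech keeps the lower triangle row by row, and the function names are ours):

```python
# vech(A): stack the lower triangle of a symmetric matrix row by row,
# giving p*(p+1)/2 elements; diag(A): keep only the p diagonal elements.

def vech(A):
    return [A[j][k] for j in range(len(A)) for k in range(j + 1)]

def diag(A):
    return [A[j][j] for j in range(len(A))]

A = [[1, 2, 3],
     [2, 4, 5],
     [3, 5, 6]]          # symmetric 3x3, so vech has 6 entries and diag has 3
```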
4. Simulation Study
Our work focuses on the behaviour of goodness-of-fit tests under alternative hypotheses, in the case of a missing covariate model and in the case of the wrong model, because in these cases we could not reproduce Kuss's work. We will focus on four goodness-of-fit tests. Therefore, we examine in more depth the behaviour of the tests and determine more information about the asymptotic MLE distribution in the case of the wrong model or in the case of the missing covariate, where X and U are independent.
The simulation study was designed following Kuss's work:
・ The sample sizes are n = 100 and n = 500.
・ The tests are applied only under extreme sparseness, when m_i = 1.
・ The number of simulations is 1000.
・ The distribution of the predictor variables X and U, with X and U independent, is chosen to conform with Kuss's work.
・ Four goodness-of-fit tests are used in the simulation study under three different alternative hypotheses:
(a) True covariate.
(b) Missing covariate.
(c) Wrong functional form of the covariate.
・ The fitted model in all cases is a standard logistic model with an intercept and one covariate.
・ All the tests are tests of the null hypothesis under the fitted model.
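One simulated data set under the true-covariate case might be generated as follows (a sketch only: the parameter values and the standard-normal covariate are our assumptions, not necessarily Kuss's exact design):

```python
import math
import random

# Generate n Bernoulli responses from a logistic model with one continuous
# covariate.  A continuous covariate makes every covariate pattern unique,
# i.e. extreme sparseness with m_i = 1.

def simulate_logistic(n, beta0=0.0, beta1=1.0, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                          # assumed N(0,1) covariate
        pi = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        y = 1 if rng.random() < pi else 0
        data.append((x, y))
    return data

sample = simulate_logistic(100)
```

The fitted model for every generated data set would then be a standard logistic regression with an intercept and one covariate, as listed above.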
Results and Discussion of Tests under Correct Model
In Table 1 we report some results: the mean, variance and empirical power of four goodness-of-fit tests from the simulation study under the correct model. The statistics used in the simulation as goodness-of-fit tests are: Hosmer-Lemeshow (\hat{C}), Information Matrix (IM), Information Matrix Diagonal (IMDIAG) and residual sum of squares (RSS). The asymptotic distribution of the statistics is chi-square, where the mean and variance equal df and 2df respectively. In the case of the \hat{C} statistic we chose the number of groups g = 10, so the degrees of freedom are g − 2 = 8. The results shown in Table 1 indicate that the mean and variance of all statistics are close to df and 2df. Moreover, the simulation study gave reasonable results when fitting the model with sample size n = 500. However, there is a slightly large variance of \hat{C} in the case of sample size n = 100. Overall, the empirical power and type I error look good.
In the second case, the results report the mean, variance and power to detect a mis-specified model for the same goodness-of-fit tests under the missing covariate model, with a standard logistic regression model fitted.

Table 2 shows the results from the simulation study under the missing covariate alternative. The mean and variance of all statistics are close to df and 2df, but the variance is slightly smaller in the case of \hat{C}. However, power is low when using the IM statistic with sample size n = 500, the IMDIAG and RSS statistics with sample size n = 100, and the \hat{C} statistic for both sample sizes.
In the final case we show the results of the power to detect a mis-specified model for the four goodness-of-fit tests under the wrong functional form of the covariate model, fitting the model as in the previous cases.
Table 1. Results of N = 1000 simulation with sample size n = 100 and n = 500 under correct model.
Table 2. Results of N = 1000 simulation with sample size n = 100 and n = 500 under missing covariate model.
Table 3 reports results for the goodness-of-fit tests from the simulation study under the wrong model. The means and variances of all statistics are very large, for both sample sizes, compared with the degrees of freedom of the statistics. However, high power was found for all goodness-of-fit tests at both sample sizes, meaning that these tests rejected the null hypothesis in every case. On the other hand, Kuss's results showed low power in the case of sample size n = 100 compared with our results.
In Figure 1, we plot \pi against x and show the true model (continuous line). If we fit the simpler logistic models, the putative approximations are shown as the dot-dash and dash-dot lines respectively.
Table 3. Results of N = 1000 simulation with sample size n = 100 and n = 500 under wrong model.
Figure 1. Plots of the different logistic model given .
5. Conclusion and Further Work
The work considered in this paper centred on the asymptotic distribution of goodness-of-fit tests in the logistic regression model. We also considered comparisons between some global goodness-of-fit tests, which were compared with Kuss's results. The simulation was applied to two types of goodness-of-fit tests: those based on a test which groups the observations and those which do not group the observations. Our results confirm Kuss's work regarding the power of goodness-of-fit tests, for the RSS, Hosmer-Lemeshow, IM and IMDIAG tests under the correct and missing covariate models. However, our results on the asymptotic distribution of the goodness-of-fit tests show various kinds of behaviour in the mean and variance of the statistics, whose asymptotic distribution is chi-square. The results under the correct model show reasonable power for all methods, with a slightly larger variance found in the case of the Hosmer-Lemeshow test, and a smaller variance under the missing covariate model. As we know, the goodness-of-fit statistics are distributed asymptotically as a central chi-square distribution under H0 when the model is correctly specified, and as a non-central chi-square under H1 when the model is mis-specified. However, under the wrong model the results show strange behaviour: all the means and variances fail to satisfy the assumption of an asymptotic chi-square distribution with mean df and variance 2df, yet high power is obtained. This means that in some circumstances the properties of the distribution of the test statistics (e.g. mean and variance) are far from the properties of the chi-square distribution. In fact, the interesting point here is that some goodness-of-fit tests seem affected by the assumption on the covariance matrix. So, many issues about the mean and variance of the asymptotic distribution of goodness-of-fit statistics should also be examined.
 Hosmer, D.W., Hosmer, T. and Lemeshow, S. (1980) Goodness-of-Fit Tests for the Multiple Logistic Regression Model. Communications in Statistics, 10, 1043-1069.
 Lemeshow, S. and Hosmer, D.W. (1982) A Review of Goodness of Fit Statistics for Use in the Development of Logistic Regression Models. American Journal of Epidemiology, 115, 92-106.
 Hosmer, D.W., Hosmer, T., Le Cessie, S. and Lemeshow, S. (1997) A Comparison of Goodness-of-Fit Tests for the Logistic Regression Model. Statistics in Medicine, 16, 965-980.
 Brown, C.C. (1982) On a Goodness of Fit Test for the Logistic Model Based on Score Statistics. Communications in Statistics - Theory and Methods, 10, 1097-1105.