To explain why the vector differs in size between the IM test and the IMDIAG test, let us consider a simple example. Suppose we have a symmetric matrix with elements and dimension as:
where, . Then, in the case of the IM test, the dimension of the vector is and its elements are:
whereas in the case of the IMDIAG test, is the dimensional vector:
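The size difference can be illustrated with a small sketch. It assumes, as the test names suggest, that the IM test stacks all distinct elements of the symmetric matrix (the lower triangle including the diagonal), while the IMDIAG test keeps only the diagonal. The 3 x 3 matrix `A` and the helper names `vech` and `diag` are hypothetical and chosen only for illustration.

```python
def vech(matrix):
    """Stack the lower-triangular elements (diagonal included), column by column."""
    p = len(matrix)
    return [matrix[i][j] for j in range(p) for i in range(j, p)]


def diag(matrix):
    """Return the diagonal elements only."""
    return [matrix[i][i] for i in range(len(matrix))]


# Hypothetical symmetric matrix with p = 3 (illustrative values only).
A = [
    [1.0, 0.2, 0.3],
    [0.2, 2.0, 0.5],
    [0.3, 0.5, 3.0],
]

im_vector = vech(A)      # p(p + 1)/2 = 6 elements
imdiag_vector = diag(A)  # p = 3 elements
```

Under this assumption, a p x p symmetric matrix gives a vector of length p(p + 1)/2 for the IM test and of length p for the IMDIAG test.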
4. Simulation Study
Our work focuses on the behaviour of goodness-of-fit tests under alternative hypotheses, namely the missing covariate model and the wrong model, because these are the cases in which we could not reproduce Kuss’s results. We focus on four goodness-of-fit tests: Hosmer-Lemeshow, information matrix (IM), information matrix diagonal (IMDIAG) and residual sum of squares (RSS). We therefore examine the behaviour of the tests in more depth and seek more information about the asymptotic distribution of the MLE in the case of the wrong model
or in the case of the missing covariate,
where , and X and U are independent.
The simulation study is designed to follow Kuss’s work:
・ The sample sizes are n = 100 and n = 500.
・ Only the case of extreme sparseness, when , is considered.
・ The number of simulations is 1000.
・ The predictor variables X and U have distribution , with X and U independent, chosen to conform with Kuss’s work.
・ Four goodness-of-fit tests are examined under three different scenarios:
(a) True covariate.
(b) Missing covariate.
(c) Wrong functional form of the covariate.
・ The fitted model in all cases is a standard logistic model with an intercept and one covariate.
・ All tests of the null hypothesis are carried out under .
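The design above can be sketched as one simulated data set followed by one model fit. This is a minimal sketch under stated assumptions, not the code used in the study: the predictor distribution (standard normal here), the coefficients `beta0`, `beta1` and `gamma`, and the function names are all hypothetical, because the source does not give them in this section. Setting `gamma` to a non-zero value mimics the missing covariate scenario, since U enters the true model but not the fitted one.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, beta0, beta1, gamma, rng):
    """Draw (x, y) with logit P(Y = 1) = beta0 + beta1 * x + gamma * u.

    X and U are independent; N(0, 1) is an assumed distribution.
    U is generated but not returned, so it is 'missing' from the fit.
    """
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        p = sigmoid(beta0 + beta1 * x + gamma * u)
        xs.append(x)
        ys.append(1 if rng.random() < p else 0)
    return xs, ys


def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Newton-Raphson MLE for a logistic model with intercept and one covariate."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break                  # information matrix is (near) singular
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# One replication under the correct model (gamma = 0), n = 500.
rng = random.Random(1)
xs, ys = simulate(500, 0.0, 1.0, 0.0, rng)
b0_hat, b1_hat = fit_logistic(xs, ys)
```

In a full study this replication would be repeated 1000 times for each sample size and scenario, with the test statistics computed from each fit.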
Results and Discussion of Tests under Correct Model
Table 1 reports the mean, variance and empirical power of the four goodness-of-fit tests from the simulation study under the correct model, namely
The statistics used in the simulation as goodness-of-fit tests are: Hosmer-Lemeshow, information matrix (IM), information matrix diagonal (IMDIAG) and residual sum of squares (RSS). The asymptotic distribution of the statistics is the chi-square distribution, whose mean and variance equal df and 2df respectively. For the Hosmer-Lemeshow statistic we chose g = 10 groups, so the degrees of freedom are df = g − 2 = 8. As Table 1 shows, the mean and variance of all statistics are close to df and 2df. The simulation study also gives reasonable results when the model is fitted with sample size n = 500. However, the variance of the Hosmer-Lemeshow statistic is slightly large for sample size n = 100. Overall, the empirical power and type I error look good.
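As an illustration of the grouped statistic, the sketch below computes the Hosmer-Lemeshow statistic in its usual textbook form, with g groups of nearly equal size formed by sorting the fitted probabilities. The function name `hosmer_lemeshow` and the toy inputs are hypothetical, and the grouping rule is an assumption; the source does not state how ties or unequal group sizes were handled.

```python
def hosmer_lemeshow(ys, probs, g=10):
    """Return (statistic, df) for the Hosmer-Lemeshow test with g groups.

    ys    : observed 0/1 outcomes
    probs : fitted probabilities from the logistic model
    Groups are formed from the observations sorted by fitted probability.
    """
    n = len(ys)
    order = sorted(range(n), key=lambda i: probs[i])
    stat = 0.0
    for k in range(g):
        idx = order[k * n // g:(k + 1) * n // g]
        n_k = len(idx)
        if n_k == 0:
            continue
        observed = sum(ys[i] for i in idx)      # observed events in group k
        expected = sum(probs[i] for i in idx)   # expected events in group k
        p_bar = expected / n_k
        denom = n_k * p_bar * (1.0 - p_bar)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, g - 2


# Toy check: fitted probabilities of 0.5 and outcomes that match them exactly
# within every group give a statistic of zero.
toy_probs = [0.5] * 20
toy_ys = [0, 1] * 10
hl_stat, hl_df = hosmer_lemeshow(toy_ys, toy_probs, g=10)
```

With g = 10 the returned degrees of freedom are 8, which is the reference value against which the simulated mean (df) and variance (2df) are compared.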
In the second case, the results report the mean, variance and power to detect a mis-specified model for the same goodness-of-fit tests under the missing covariate model, when the model is:
and a standard logistic regression model is fitted with .
Table 2 shows the results from the simulation study under the alternative hypothesis of a missing covariate. The mean and variance of all statistics are close to df and 2df, but the variance is slightly smaller in the case of the Hosmer-Lemeshow statistic. However, power is low for the IM statistic with sample size n = 500, for the IMDIAG statistic and RSS with sample size n = 100, and for the Hosmer-Lemeshow statistic at both sample sizes.
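The summaries reported in the tables (mean, variance and empirical power) can be sketched as follows. To keep the example self-contained, the test statistics are replaced by draws from a central chi-square distribution with 8 degrees of freedom, generated as sums of squared standard normals. The significance level of 0.05 and its critical value 15.507 are assumptions for illustration, as are all function names.

```python
import random


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


def empirical_power(statistics, critical_value):
    """Proportion of replications in which the statistic exceeds the critical value."""
    return sum(1 for s in statistics if s > critical_value) / len(statistics)


def draw_chi_square(df, rng):
    """One chi-square(df) draw as a sum of df squared N(0, 1) variates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


rng = random.Random(2024)
df = 8                      # e.g. Hosmer-Lemeshow with g = 10 groups
critical_value = 15.507     # upper 5% point of chi-square(8); level is assumed
stats = [draw_chi_square(df, rng) for _ in range(2000)]

sim_mean, sim_var = mean_and_variance(stats)          # compare with df and 2df
rejection_rate = empirical_power(stats, critical_value)
```

Under the null hypothesis the rejection rate estimates the type I error; under an alternative, the same proportion is the empirical power.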
In the final case, we show the power to detect a mis-specified model for the four goodness-of-fit tests under the wrong functional form of the covariate,
and the model is fitted as in the previous cases.
Table 1. Results of N = 1000 simulations with sample sizes n = 100 and n = 500 under the correct model.
Table 2. Results of N = 1000 simulations with sample sizes n = 100 and n = 500 under the missing covariate model.
Table 3 reports the results for the goodness-of-fit tests from the simulation study under the wrong model. At both sample sizes, the mean and variance of all statistics are much larger than the values implied by the degrees of freedom. However, all goodness-of-fit tests show high power at both sample sizes, meaning that the tests consistently rejected the null hypothesis. On the other hand, Kuss’s results show lower power than ours for sample size n = 100.
In Figure 1, we plot vs and we show the true model (continuous line). If we fit , the putative approximations are shown for , and (dot and dash, dash and dot) line respectively.
Table 3. Results of N = 1000 simulations with sample sizes n = 100 and n = 500 under the wrong model.
Figure 1. Plots of the different logistic model given .
5. Conclusion and Further Work
The work in this paper centred on the asymptotic distribution of goodness-of-fit tests in the logistic regression model. We also compared some global goodness-of-fit tests and set our results against those of Kuss. The simulation covered two types of goodness-of-fit test: those that group the observations and those that do not. Our results confirm Kuss’s findings on the power of the RSS, Hosmer-Lemeshow, IM and IMDIAG tests under the correct model and the missing covariate model. However, our results on the asymptotic distribution of the goodness-of-fit tests show varied behaviour in the mean and variance of the statistics, whose asymptotic distribution is chi-square. Under the correct model, all methods show reasonable power, with a slightly larger variance for the Hosmer-Lemeshow test and a smaller variance under the missing covariate model. The goodness-of-fit statistics are asymptotically distributed as a central chi-square distribution under H0, when the model is correctly specified, and as a non-central chi-square distribution under H1, when the model is mis-specified. Under the wrong model, however, the results show strange behaviour: the means and variances do not match the asymptotic values of df and 2df, and the power is high. This means that in some circumstances the properties of the distribution of the test statistics (e.g. mean and variance) are far from those of the chi-square distribution. Interestingly, some of the goodness-of-fit tests seem to be affected by the assumption on the covariance matrix. Many issues concerning the mean and variance of the asymptotic distribution of goodness-of-fit statistics therefore remain to be examined.
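For reference, the standard moments of the central and non-central chi-square distributions mentioned above can be written side by side. The symbols T (a generic goodness-of-fit statistic) and \lambda (the non-centrality parameter) are notation assumed here for illustration and do not appear in the original text.

```latex
% T: generic goodness-of-fit statistic; \lambda >= 0: non-centrality parameter (assumed notation)
\begin{align*}
\text{under } H_0:\quad & T \sim \chi^2_{df},
  & E[T] &= df, & \operatorname{Var}(T) &= 2\,df, \\
\text{under } H_1:\quad & T \sim \chi^2_{df}(\lambda),
  & E[T] &= df + \lambda, & \operatorname{Var}(T) &= 2\,(df + 2\lambda).
\end{align*}
```

With \lambda > 0 both moments exceed their central values, so a mean well above df would be expected whenever a test has high power. Whether the non-central approximation describes the statistics under the wrong model in this study is one of the open questions noted above.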