Minimum Penalized Hellinger Distance for Model Selection in Small Samples

ABSTRACT

In statistical modeling, the Akaike information criterion (AIC) is a widely known and extensively used tool for model choice. The φ-divergence test statistic is a more recently developed tool for statistical model selection. The popularity of divergence criteria is, however, tempered by their known lack of robustness in small samples. In this paper, penalized minimum Hellinger distance type statistics are considered and some of their properties are established. The limit laws of the estimates and test statistics are given under both the null and the alternative hypotheses, and approximations of the power functions are deduced. A model selection criterion based on these divergence measures is developed for parametric inference. Our interest is in the problem of testing to choose between two models using informational type statistics when independent samples are drawn from a discrete population. We discuss the asymptotic properties and the performance of the new test procedures and investigate their small-sample behavior.


Cite this paper

P. Ngom and B. Ntep, "Minimum Penalized Hellinger Distance for Model Selection in Small Samples," *Open Journal of Statistics*, Vol. 2, No. 4, 2012, pp. 369-382. doi: 10.4236/ojs.2012.24045.

