[1] K. L. Liang and S. L. Zeger, “Longitudinal Data Analysis Using Generalised Estimating Equations,” Biometrika, Vol. 73, No. 1, 1986, pp. 13-22. http://dx.doi.org/10.1093/biomet/73.1.13
[2] A. Qu, B. G. Lindsay and B. Li, “Improving Generalized Estimating Equations Using Quadratic Inference Functions,” Biometrika, Vol. 87, No. 4, 2000, pp. 823-836.
http://dx.doi.org/10.1093/biomet/87.4.823
[3] A. Qu and P. X. K. Song, “Assessing Robustness of Generalized Estimating Equations and Quadratic Inference Functions,” Biometrika, Vol. 91, No. 2, 2004, pp. 447-459.
http://dx.doi.org/10.1093/biomet/91.2.447
[4] I. E. Frank and J. H. Friedman, “A Statistical View of Some Chemometrics Regression Tools (with Discussion),” Technometrics, Vol. 35, No. 2, 1993, pp. 109-148.
http://dx.doi.org/10.1080/00401706.1993.10485033
[5] R. Tibshirani, “Regression Shrinkage and Selection via the LASSO,” Journal of Royal Statistical Society, Series B, Vol. 58, No. 1, 1996, pp. 267-288.
[6] J. Q. Fan and R. Li, “Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties,” Journal of American Statistical Association, Vol. 96, No. 456, 2001, pp. 1348-1360. http://dx.doi.org/10.1198/016214501753382273
[7] W. Pan, “Akaike’s Information Criterion in Generalized Estimating Equations,” Biometrics, Vol. 57, No. 1, 2001, pp. 120-125.
http://dx.doi.org/10.1111/j.0006-341X.2001.00120.x
[8] H. Akaike, “Information Theory and an Extension of the Maximum Likelihood Principle,” Proceedings of the 2nd International Symposium on Information Theory, Budapest, 1973, pp. 267-281.
[9] L. Wang and A. Qu, “Consistent Model Selection and Data-Driven Smooth Tests for Longitudinal Data in the Estimating Equations Approach,” Journal of the Royal Statistical Society: Series B, Vol. 71, No. 1, 2009, pp. 177-190.
http://dx.doi.org/10.1111/j.1467-9868.2008.00679.x
[10] G. Schwarz, “Estimating the Dimension of a Model,” The Annals of Statistics, Vol. 6, No. 2, 1978, pp. 461-464.
http://dx.doi.org/10.1214/aos/1176344136
[11] W. J. Fu, “Penalized Estimating Equation,” Biometrics, Vol. 59, No. 1, 2003, pp. 126-132.
http://dx.doi.org/10.1111/1541-0420.00015
[12] P. R. Xu, W. Fu and L. X. Zhu, “Shrinkage Estimation Analysis of Correlated Binary Data with a Diverging Number of Parameters,” Science China Mathematics, Vol. 56, No. 2, 2013, pp. 359-377. http://dx.doi.org/10.1007/s11425-012-4564-y
[13] J. Fan and R. Li, “Statistical Challenges with High Dimensionality: Feature Selection in Knowledge Discovery,” Proceedings of the International Congress of Mathematicians, Vol. 3, European Mathematical Society, 2006, pp. 595-622.
[14] J. Fan and J. Lv, “A Selective Overview of Variable Selection in High-Dimensional Feature Space,” Statistica Sinica, Vol. 20, No. 1, 2009, pp. 101-148.
[15] M. Crowder, “On the Use of a Working Correlation Matrix in Using Generalised Linear Models for Repeated Measures,” Biometrika, Vol. 82, No. 2, 1995, pp. 407-410.
http://dx.doi.org/10.1093/biomet/82.2.407
[16] L. Hansen, “Large Sample Properties of Generalized Method of Moments Estimators,” Econometrica, Vol. 50, No. 4, 1982, pp. 1029-1054. http://dx.doi.org/10.2307/1912775
[17] M. Ueki, “A Note on Automatic Variable Selection Using Smooth-Threshold Estimating Equations,” Biometrika, Vol. 96, No. 4, 2009, pp. 1005-1011. http://dx.doi.org/10.1093/biomet/asp060
[18] J. Fan and H. Peng, “Nonconcave Penalized Likelihood with a Diverging Number of Parameters,” The Annals of Statistics, Vol. 32, No. 3, 2004, pp. 928-961.
http://dx.doi.org/10.1214/009053604000000256
[19] L. Wang, “GEE Analysis of Clustered Binary Data with Diverging Number of Covariates,” The Annals of Statistics, Vol. 39, No. 1, 2011, pp. 389-417. http://dx.doi.org/10.1214/10-AOS846
[20] J. Zhou and A. Qu, “Informative Estimation and Selection of Correlation Structure for Longitudinal Data,” Journal of the American Statistical Association, Vol. 107, No. 498, 2012, pp. 701-710. http://dx.doi.org/10.1080/01621459.2012.682534
[21] J. J. Dziak, “Penalized Quadratic Inference Functions for Variable Selection in Longitudinal Research,” Ph.D Thesis, The Pennsylvania State University, 2006.
http://sites.stat.psu.edu/~jdziak/DziakDissert.pdf