Efficiency of Selecting Important Variable for Longitudinal Data

ABSTRACT

Variable selection with a large number of predictors is a challenging and important
problem in educational and social domains. However, relatively little attention
has been paid to variable selection in longitudinal data with application to
education. Using longitudinal educational data (Test of English for International
Communication, TOEIC), this study compares multiple regression, backward
elimination, the group least absolute shrinkage and selection operator (LASSO),
and linear mixed models in terms of their performance in variable selection.
The results show that the four statistical methods retain different sets of
predictors in their models. The linear mixed model (LMM) yields the smallest
number of predictors (4 of 19 in total). In addition, LMM is the only one of the
four methods appropriate for repeated measurements and is the best method with
respect to the principle of parsimony. This study also provides an interpretation
of the model selected by LMM in the conclusion using marginal *R*².
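
To make the marginal *R*² mentioned above concrete, here is a minimal Python sketch. It assumes one common definition, 1 − SSE/SST with predictions taken from the fixed effects only; the abstract does not state which formula the study uses, so this may differ from it. The data, the variable names (`time`, `hours`), and the use of ordinary least squares as a stand-in for the LMM fixed-effect estimates are all hypothetical.

```python
import numpy as np

def marginal_r2(y, X, beta):
    """Return 1 - SSE/SST, with predictions from the fixed effects only (X @ beta)."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    sse = np.sum((y - fitted) ** 2)    # residual sum of squares around the fixed-effect fit
    sst = np.sum((y - y.mean()) ** 2)  # total sum of squares around the grand mean
    return 1.0 - sse / sst

# Hypothetical toy data: 50 examinees, 4 test occasions each, random intercepts.
rng = np.random.default_rng(0)
n_subjects, n_waves = 50, 4
n_obs = n_subjects * n_waves
subject = np.repeat(np.arange(n_subjects), n_waves)
time = np.tile(np.arange(n_waves), n_subjects).astype(float)
hours = rng.normal(size=n_obs)              # hypothetical time-varying predictor
u = rng.normal(scale=2.0, size=n_subjects)  # subject-specific intercept shifts
y = 10.0 + 1.5 * time + 0.8 * hours + u[subject] + rng.normal(size=n_obs)

# Fixed-effects design matrix: intercept, time, hours.
X = np.column_stack([np.ones(n_obs), time, hours])

# ASSUMPTION: ordinary least squares stands in for the LMM fixed-effect
# estimates here; a real analysis would take beta from a fitted mixed model.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

r2_m = marginal_r2(y, X, beta_hat)
print(f"marginal R^2 (fixed effects only): {r2_m:.3f}")
```

Because the random intercepts are left out of the prediction, this quantity reflects only the variance explained by the selected fixed-effect predictors, which is why it suits the interpretation of a selected model.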

Cite this paper

Ra, J. & Rhee, K. (2014). Efficiency of Selecting Important Variable for Longitudinal Data. *Psychology, 5,* 6-11. doi: 10.4236/psych.2014.51002.

