JILSA, Vol. 2 No. 4, November 2010
Fast Variable Selection by Block Addition and Block Deletion
ABSTRACT
We propose a threshold updating method for terminating variable selection, together with two variable selection methods. In the threshold updating method, the threshold value is updated whenever an approximation error smaller than the current threshold value is obtained. The first variable selection method combines forward selection by block addition with backward selection by block deletion. Starting from the empty set of input variables, we add several input variables at a time until the approximation error falls below the threshold value, and then search for deletable variables by block deletion. The second method combines the first method with variable selection by Linear Programming Support Vector Regressors (LPSVRs). By training an LPSVR with linear kernels, we evaluate the weights of the decision function and delete the input variables whose associated absolute weights are zero; we then carry out block addition and block deletion. Computer experiments on benchmark data sets show that the proposed methods perform variable selection faster than the method that uses only block deletion, and that the threshold updating method yields a lower approximation error than the fixed threshold method. We also compare our method with an embedded method, which determines the optimal variables during training, and show that our method gives comparable or better variable selection performance.
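
The first method described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the abstract: a least-squares model with a hold-out validation error stands in for the support vector regressor and its approximation error, the initial threshold is the error obtained with all input variables, variables are ranked by the error obtained when each is added or deleted individually, and the deletion block is halved when a trial deletion fails. The names `validation_error` and `block_select` and the block size are hypothetical, and the LPSVR pre-selection step of the second method is not covered.

```python
import numpy as np


def validation_error(X_tr, y_tr, X_va, y_va, cols):
    """Mean absolute hold-out error of a least-squares fit on the given columns.

    Assumption: this stands in for the regressor's approximation error.
    """
    cols = list(cols)
    # Append a bias column; with no columns selected this is a constant model.
    A_tr = np.column_stack([X_tr[:, cols], np.ones(len(X_tr))])
    A_va = np.column_stack([X_va[:, cols], np.ones(len(X_va))])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return float(np.mean(np.abs(A_va @ coef - y_va)))


def block_select(X_tr, y_tr, X_va, y_va, block=2):
    """Block addition followed by block deletion, with threshold updating."""
    n_vars = X_tr.shape[1]

    def err(cols):
        return validation_error(X_tr, y_tr, X_va, y_va, cols)

    # Assumed initial threshold: the error with all input variables.
    threshold = err(range(n_vars))
    selected, remaining = [], list(range(n_vars))

    # Forward selection: add `block` variables at a time until the
    # error is no longer above the threshold.
    current = err(selected)
    while remaining and current > threshold:
        ranked = sorted(remaining, key=lambda j: err(selected + [j]))
        added = ranked[:block]
        selected += added
        remaining = [j for j in remaining if j not in added]
        current = err(selected)
    # Threshold updating: keep the smallest error seen so far.
    threshold = min(threshold, current)

    # Backward selection: try to delete several variables at once.
    while selected:
        scores = {j: err([k for k in selected if k != j]) for j in selected}
        cands = sorted((j for j in selected if scores[j] <= threshold),
                       key=scores.get)
        deleted = False
        while cands:
            trial = [k for k in selected if k not in cands]
            e = err(trial)
            if e <= threshold:
                selected, deleted = trial, True
                threshold = min(threshold, e)  # threshold updating
                break
            cands = cands[: len(cands) // 2]  # halve the block and retry
        if not deleted:
            break
    return sorted(selected), threshold


# Synthetic data (an assumption for illustration): only variables 0 and 3 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(800, 6))
y = 3.0 * X[:, 0] - 3.0 * X[:, 3] + 0.1 * rng.normal(size=800)
selected, threshold = block_select(X[:400], y[:400], X[400:], y[400:], block=2)
print("selected variables:", selected, "final threshold:", round(threshold, 4))
```

Because the threshold only ever decreases, later deletions are accepted only if they do not worsen the best error seen so far, which is one way to read the claim that threshold updating gives a lower approximation error than a fixed threshold.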

Cite this paper
T. Nagatani, S. Ozawa and S. Abe, "Fast Variable Selection by Block Addition and Block Deletion," Journal of Intelligent Learning Systems and Applications, Vol. 2 No. 4, 2010, pp. 200-211. doi: 10.4236/jilsa.2010.24023.

 
 