Support Vector Machines for Regression: A Succinct Review of Large-Scale and Linear Programming Formulations

Pablo Rivas-Perea^{*},
Juan Cota-Ruiz^{*},
David Garcia Chaparro^{*},
Jorge Arturo Perez Venzor^{*},
Abel Quezada Carreón^{*},
Jose Gerardo Rosiles^{*}

References

[1] J. Mercer, “Functions of Positive and Negative Type, and Their Connection with the Theory of Integral Equations,” Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, Vol. 209, 1909, pp. 415-446.
doi:10.1098/rsta.1909.0016

[2] R. Courant and D. Hilbert, “Methods of Mathematical Physics,” Interscience, New York, 1966.

[3] J. Shawe-Taylor and N. Cristianini, “Kernel Methods for Pattern Analysis,” Cambridge University Press, New York, 2004. doi:10.1017/CBO9780511809682.002

[4] N. Cristianini and B. Scholkopf, “Support Vector Machines and Kernel Methods: The New Generation of Learning Machines,” AI Magazine, Vol. 23, No. 3, 2002, p. 31.

[5] B. E. Boser, I. M. Guyon and V. N. Vapnik, “A Training Algorithm for Optimal Margin Classifiers,” Proceedings of the 5th Annual Workshop on Computational Learning Theory, Pittsburgh, July 1992, pp. 144-152.

[6] K. Labusch, E. Barth and T. Martinetz, “Simple Method for High-Performance Digit Recognition Based on Sparse Coding,” IEEE Transactions on Neural Networks, Vol. 19, No. 11, 2008, pp. 1985-1989.
doi:10.1109/TNN.2008.2005830

[7] H. Al-Mubaid and S. Umair, “A New Text Categorization Technique Using Distributional Clustering and Learning Logic,” IEEE Transactions on Knowledge and Data Engineering, Vol. 18, No. 9, 2006, pp. 1156-1165.
doi:10.1109/TKDE.2006.135

[8] K. Wu and K.-H. Yap, “Fuzzy SVM for Content-Based Image Retrieval: A Pseudo-Label Support Vector Machine Framework,” IEEE Computational Intelligence Magazine, Vol. 1, No. 2, 2006, pp. 10-16.
doi:10.1109/MCI.2006.1626490

[9] N. Sapankevych and R. Sankar, “Time Series Prediction Using Support Vector Machines: A Survey,” IEEE Computational Intelligence Magazine, Vol. 4, No. 2, 2009, pp. 24-38. doi:10.1109/MCI.2009.932254

[10] D. Peterson and M. Thaut, “Model and Feature Selection in Microarray Classification,” Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, La Jolla, 7-8 October 2004, pp. 56-60. doi:10.1109/CIBCB.2004.1393932

[11] A. Sanchez and V. David, “Advanced Support Vector Machines and Kernel Methods,” Neurocomputing, Vol. 55, No. 1-2, 2003, pp. 5-20.
doi:10.1016/S0925-2312(03)00373-4

[12] L. Zhang and W. Zhou, “On the Sparseness of 1-Norm Support Vector Machines,” Neural Networks, Vol. 23, No. 3, 2010, pp. 373-385.
doi:10.1016/j.neunet.2009.11.012

[13] V. N. Vapnik, “The Nature of Statistical Learning Theory,” Springer, New York, 1995.

[14] A. J. Smola and B. Scholkopf, “A Tutorial on Support Vector Regression,” Statistics and Computing, Vol. 14, No. 3, 2004, pp. 199-222.
doi:10.1023/B:STCO.0000035301.49549.88

[15] B. Huang, Z. Cai, Q. Gu and C. Chen, “Using Support Vector Regression for Classification,” Advanced Data Mining and Applications, Vol. 5139, 2008, pp. 581-588.

[16] V. Vapnik, S. Golowich, and A. Smola, “Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing,” Advances in Neural Information Processing Systems, Vol. 9, 1997, pp. 281-287.

[17] E. Osuna, R. Freund and F. Girosi, “An Improved Training Algorithm for Support Vector Machines,” Neural Networks for Signal Processing VII. Proceedings of the 1997 IEEE Workshop, Amelia Island, 24-26 September 1997, pp. 276-285. doi:10.1109/NNSP.1997.622408

[18] T. Joachims, “Making Large Scale SVM Learning Practical,” Advances in Kernel Methods, 1999, pp. 169-184.

[19] J. Platt, “Using Analytic QP and Sparseness to Speed Training of Support Vector Machines,” Advances in Neural Information Processing Systems, MIT Press, Cambridge, 1999, pp. 557-563.

[20] R. Collobert and S. Bengio, “SVMTorch: Support Vector Machines for Large-Scale Regression Problems,” Journal of Machine Learning Research, Vol. 1, 2001, pp. 143-160.
doi:10.1162/15324430152733142

[21] R. Rifkin, “Everything Old Is New Again: A Fresh Look at Historical Approaches in Machine Learning,” Ph.D. Dissertation, Massachusetts Institute of Technology, 2002.

[22] O. Mangasarian and D. Musicant, “Large Scale Kernel Regression via Linear Programming,” Machine Learning, Vol. 46, No. 1-3, 2002, pp. 255-269.
doi:10.1023/A:1012422931930

[23] P. Drineas and M. W. Mahoney, “On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning,” Journal of Machine Learning Research, Vol. 6, 2005, pp. 2153-2175.

[24] D. Hush, P. Kelly, C. Scovel and I. Steinwart, “QP Algorithms with Guaranteed Accuracy and Run Time for Support Vector Machines,” The Journal of Machine Learning Research, Vol. 7, 2006, p. 769.

[25] S. Sra, “Efficient Large Scale Linear Programming Support Vector Machines,” Lecture Notes in Computer Science, Vol. 4212, 2006, pp. 767-774.
doi:10.1007/11871842_78

[26] Y. Censor and S. Zenios, “Parallel Optimization: Theory, Algorithms, and Applications,” Oxford University Press, Oxford, 1997.

[27] C. Hildreth, “A Quadratic Programming Procedure,” Naval Research Logistics Quarterly, Vol. 4, No. 1, 1957, pp. 79-85. doi:10.1002/nav.3800040113

[28] Z. Lu, J. Sun and K. R. Butts, “Linear Programming Support Vector Regression with Wavelet Kernel: A New Approach to Nonlinear Dynamical Systems Identification,” Mathematics and Computers in Simulation, Vol. 79, No. 7, 2009, pp. 2051-2063.
doi:10.1016/j.matcom.2008.10.011

[29] Y. Torii and S. Abe, “Decomposition Techniques for Training Linear Programming Support Vector Machines,” Neurocomputing, Vol. 72, No.4-6, 2009, pp. 973-984.
doi:10.1016/j.neucom.2008.04.008

[30] S. S. Haykin, “Neural Networks and Learning Machines,” Prentice Hall, Upper Saddle River, 2009.

[31] S. Wright, “Primal-Dual Interior-Point Methods,” Society for Industrial and Applied Mathematics, Philadelphia, 1997, p. 309.
doi:10.1137/1.9781611971453

[32] L. Wang, “Support Vector Machines: Theory and Applications,” Studies in Fuzziness and Soft Computing, Vol. 177, Springer-Verlag, Berlin, 2005.

[33] P. Bradley and O. Mangasarian, “Massive Data Discrimination via Linear Support Vector Machines,” Optimization Methods and Software, Vol. 13, No. 1, 2000, pp. 1-10. doi:10.1080/10556780008805771

[34] A. Smola, B. Scholkopf and G. Ratsch, “Linear Programs for Automatic Accuracy Control in Regression,” Ninth International Conference on Artificial Neural Networks, Edinburgh, 7-10 September 1999, pp. 575-580.
doi:10.1049/cp:19991171

[35] B. Scholkopf and A. J. Smola, “Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond,” The MIT Press, Cambridge, 2002.

[36] P. R. Perea, “Algorithms for Training Large-Scale Linear Programming Support Vector Regression and Classification,” Ph.D. Dissertation, The University of Texas at El Paso, El Paso, 2011.

[37] P. Rivas-Perea and J. Cota-Ruiz, “An Algorithm for Training a Large Scale Support Vector Machine for Regression Based on Linear Programming and Decomposition Methods,” Pattern Recognition Letters, Vol. 34, No. 4, 2013, pp. 439-451. doi:10.1016/j.patrec.2012.10.026

[38] Y.-Z. Xu and H. Qin, “A New Optimization Method of Large-Scale SVMs Based on Kernel Distance Clustering,” International Conference on Computational Intelligence and Software Engineering, Wuhan, 11-13 December 2009, pp. 1-4.

[39] C. Bishop, “Neural Networks for Pattern Recognition,” Oxford University Press, Oxford, 1995.