JDAIP Vol.3 No.2, May 2015
Comparison of Feature Reduction Techniques for the Binominal Classification of Network Traffic
Abstract: This paper tests various scenarios of feature selection and feature reduction, with the objective of building a real-time anomaly-based intrusion detection system. These scenarios are evaluated on the realistic Kyoto 2006+ dataset. For each scenario, the influence of reducing the number of features on classification performance and execution time is measured. The HVS feature selection technique detailed in this paper offers many advantages in terms of consistency, classification performance and execution time.
Cite this paper: Ammar, A. (2015) Comparison of Feature Reduction Techniques for the Binominal Classification of Network Traffic. Journal of Data Analysis and Information Processing, 3, 11-19. doi: 10.4236/jdaip.2015.32002.
