ABSTRACT In recent years, significant research has been devoted to the development of Intrusion Detection Systems (IDS) able to detect anomalous computer network traffic indicative of malicious activity. While signature-based IDS have proven effective in discovering known attacks, anomaly-based IDS hold the even greater promise of automatically detecting previously undocumented threats. Traditional IDS are generally trained in batch mode and therefore cannot adapt to evolving network data streams in real time. To resolve this limitation, data stream mining techniques can be used to create a new type of IDS able to dynamically model a stream of network traffic. In this paper, we present two methods for anomalous network packet detection based on the data stream mining paradigm. The first is a version of the DenStream stream clustering algorithm adapted to evaluate network traffic. In this algorithm, individual packets are treated as points and are flagged as normal or abnormal according to whether they belong to normal or outlier clusters. The second algorithm uses a histogram to model the evolving network traffic, against which incoming traffic is compared using Pearson correlation. Both algorithms were tested using the first week of data from the DARPA ’99 dataset with Generic HTTP, Shell-code and Polymorphic attacks inserted. We achieved reasonably high detection rates with moderately low false positive rates for different types of attacks, though detection rates varied between the two algorithms. Overall, the histogram-based detection algorithm achieved slightly superior results but required more parameters than the clustering-based algorithm. Because it requires fewer parameters, the clustering approach can be more easily generalized to different types of network traffic streams.
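As a rough illustration of the second method, the sketch below keeps a running histogram as the traffic model and scores each incoming packet by its Pearson correlation with that model. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not say which features are binned or how the model is updated, so the byte-frequency features, the exponential update, and the names `HistogramDetector`, `threshold` and `decay` are all hypothetical choices made for illustration.

```python
import math


def byte_histogram(payload):
    """Relative frequency of each of the 256 byte values (assumed feature set)."""
    counts = [0.0] * 256
    for b in payload:
        counts[b] += 1.0
    total = float(len(payload)) or 1.0  # avoid division by zero on empty payloads
    return [c / total for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


class HistogramDetector:
    """Hypothetical streaming detector: one model histogram, updated per packet."""

    def __init__(self, threshold=0.5, decay=0.05):
        self.threshold = threshold  # assumed: minimum correlation for "normal"
        self.decay = decay          # assumed: weight given to each new packet
        self.model = None

    def process(self, payload):
        """Return True if the packet is flagged as anomalous."""
        hist = byte_histogram(payload)
        if self.model is None:
            # The first packet seeds the model and is treated as normal.
            self.model = hist
            return False
        anomalous = pearson(self.model, hist) < self.threshold
        if not anomalous:
            # Only traffic judged normal updates the model, so it can track
            # gradual drift without absorbing flagged packets.
            self.model = [(1.0 - self.decay) * m + self.decay * h
                          for m, h in zip(self.model, hist)]
        return anomalous


detector = HistogramDetector()
detector.process(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n")
print(detector.process(b"\x90" * 200))  # a run of identical bytes, unlike the model
```

Updating the model only with packets judged normal is one possible design choice; it trades slower adaptation for some resistance to an attacker gradually shifting the model.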
Cite this paper
Z. Miller, W. Deitrick and W. Hu, "Anomalous Network Packet Detection Using Data Stream Mining," Journal of Information Security, Vol. 2 No. 4, 2011, pp. 158-168. doi: 10.4236/jis.2011.24016.