CN Vol. 3 No. 3, August 2011
Survey on Spam Filtering Techniques
Abstract: In recent years, spam has become a major problem for the Internet and electronic communication, and many techniques have been developed to fight it. This paper gives an overview of existing e-mail spam filtering methods. It classifies, evaluates, and compares traditional and learning-based methods. Several personal anti-spam products are tested and compared. The problem statement for a new approach to spam filtering is also considered.
Cite this paper: S. Nazirova, "Survey on Spam Filtering Techniques," Communications and Network, Vol. 3 No. 3, 2011, pp. 153-160. doi: 10.4236/cn.2011.33019.
