An Innovative Approach for Attribute Reduction in Rough Set Theory

ABSTRACT

The Rough Sets
Theory is used in data mining with emphasis on the treatment of uncertain or vague
information. In the case of classification, this theory implicitly calculates
reducts of the full set of attributes, eliminating those that are redundant or
meaningless. Such reducts may even serve as input to other classifiers other
than Rough Sets. The typical high dimensionality of current databases precludes
the use of greedy methods to find optimal or suboptimal reducts in the search
space and requires the use of stochastic methods. In this context, the
calculation of reducts is typically performed by a genetic algorithm, but other
metaheuristics have been proposed with better performance. This work proposes
the innovative use of two known metaheuristics for this calculation, the
Variable Neighborhood Search, the Variable Neighborhood Descent, besides a
third heuristic called Decrescent Cardinality Search. The last one is a new
heuristic specifically proposed for reduct calculation. Considering some
databases commonly found in the literature of the area, the reducts that have been obtained present lower cardinality,* i.e.*,* *a lower number of attributes.

Cite this paper

Pessoa, A. and Stephany, S. (2014) An Innovative Approach for Attribute Reduction in Rough Set Theory.*Intelligent Information Management*, **6**, 223-239. doi: 10.4236/iim.2014.65022.

Pessoa, A. and Stephany, S. (2014) An Innovative Approach for Attribute Reduction in Rough Set Theory.

References

[1] Pawlak, Z. (1982) Rough Sets. International Journal of Computing and Information Sciences, 11, 341-356.

http://www.springerlink.com/index/r5556398717921x5.pdf

[2] Komorowski, J., Pawlak, Z., Polkowski, L. and Skowron, A. (1999) Rough Sets: A Tutorial. Warsaw.

http://secs.ceas.uc.edu/~mazlack/dbm.w2011/Komorowski.RoughSets.tutor.pdf

[3] Ohrn, A. (1999) Discernibility and Rough Sets in Medicine: Tools and Applications. Norwegian University of Science and Technology, Trondheim.

[4] Jensen, R. and Shen, Q. (2003) Finding Rough Set Reducts with Ant Colony Optimization. Proceedings of the 2003 UK Workshop on Computational Intelligence, 15-22.

http://users.aber.ac.uk/rkj/pubs/papers/antRoughSets.pdf

[5] Hedar, A.-R., Wang, J. and Fukushima, M. (2008) Tabu Search for Attribute Reduction in Rough Set Theory. Soft Computing, 12, 909-918.

http://www.springerlink.com/index/c4181mh5l23816u3.pdf

[6] Pessoa, A.S.A., Stephany, S. (2012) Feature Selection with New Metaheuristics in the Rough Sets Theory. XXXIV National Congress of Computational and Applied Mathematics, 1, 1-7.

[7] Hansen, P. and Mladenovic, N. (2003) A Tutorial on Variable Neighborhood Search. Montreal.

http://yalma.fime.uanl.mx/~roger/work/teaching/mecbs5122/5-VNS/VNS-tutorial-G-2003-46.pdf

[8] Han, J., Sanchez, R. and Hu, X. (2005) Feature Selection Based on Relative Attribute Dependency: An Experimental Study. Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, 3641, 214-223.

http://www.springerlink.com/index/743agvmf3etlaq0p.pdf http://dx.doi.org/10.1007/11548669_23

[9] Zadeh, L.A. (1965) Fuzzy Sets. Information and Control, 8, 338-353.

http://dx.doi.org/10.1016/S0019-9958(65)90241-X

[10] Shafer, G. (1976) A Mathemathical Theory of Evidence. Princeton University Press, Princeton.

[11] Dempster, A.P. (1967) Upper and Lower Probabilities Induced by a Multivalued Mapping. The Annals of Mathematical Statistics, 38, 325-339.

http://dx.doi.org/10.1214/aoms/1177698950

[12] Zadeh, L.A. (1999) Fuzzy Sets as a Basis for a Theory of Possibility. Fuzzy Sets and Systems, 100, 9-34.

http://dx.doi.org/10.1016/S0165-0114(99)80004-9

[13] Nicoletti, M. do C. and Uchôa, J.Q. (1997) Rough Sets under the Perspective of Pertinence Function. 3rd Brazilian Symposium on Intelligent Automation, Vitoria, Brazil, 307-313, in Portuguese.

[14] Chouchoulas, A. and Shen, Q. (2001) Rough Set-Aided Keyword Reduction for Text Categorization. Applied Artificial Intelligence, 15, 843-873.

http://www.tandfonline.com/doi/abs/10.1080/088395101753210773

[15] Jensen, R. and Shen, Q. (2005) Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough-Based Approaches. IEEE Transactions on Knowledge and Data Engineering, 16, 1457-1471.

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1350758

[16] Holland, J.H. (1975) Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor.

http://www.citeulike.org/group/664/article/400721

[17] Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Boston.

http://www.mendeley.com/research/genetic-algorithms-in-search-optimization-and-machine-learning/

[18] Kirkpatrick, S., Gelatt, C.D. and Vecchi, M.P. (1983) Optimization by Simulated Annealing. Science, 220, 671-680.

http://www.sciencemag.org/content/220/4598/671.short http://dx.doi.org/10.1126/science.220.4598.671

[19] Glover, F. and McMillan, C. (1986) The General Employee Scheduling Problem. An Integration of MS and AI. Computers Operations Research, 13, 563-573.

http://dx.doi.org/10.1016/0305-0548(86)90050-X

[20] Colorni, A., Dorigo, M. and Maniezzo, V. (1991) Distributed Optimization by Ant Colonies. European Conference on Artificial Life, Paris, France 134-142.

[21] Hansen, P. and Mladenovic, N. (1997) Variable Neighborhood Search. Computers & Operations Research, 24, 1097-1100.

http://www.sciencedirect.com/science/article/pii/S0305054897000312

http://dx.doi.org/10.1016/S0305-0548(97)00031-2

[22] Blake, C.L. and Merz, C.J. (1998) UCI Repository of Machine Learning Databases. University of California, Oakland.

http://www.ics.uci.edu/~mlearn

[1] Pawlak, Z. (1982) Rough Sets. International Journal of Computing and Information Sciences, 11, 341-356.

http://www.springerlink.com/index/r5556398717921x5.pdf

[2] Komorowski, J., Pawlak, Z., Polkowski, L. and Skowron, A. (1999) Rough Sets: A Tutorial. Warsaw.

http://secs.ceas.uc.edu/~mazlack/dbm.w2011/Komorowski.RoughSets.tutor.pdf

[3] Ohrn, A. (1999) Discernibility and Rough Sets in Medicine: Tools and Applications. Norwegian University of Science and Technology, Trondheim.

[4] Jensen, R. and Shen, Q. (2003) Finding Rough Set Reducts with Ant Colony Optimization. Proceedings of the 2003 UK Workshop on Computational Intelligence, 15-22.

http://users.aber.ac.uk/rkj/pubs/papers/antRoughSets.pdf

[5] Hedar, A.-R., Wang, J. and Fukushima, M. (2008) Tabu Search for Attribute Reduction in Rough Set Theory. Soft Computing, 12, 909-918.

http://www.springerlink.com/index/c4181mh5l23816u3.pdf

[6] Pessoa, A.S.A., Stephany, S. (2012) Feature Selection with New Metaheuristics in the Rough Sets Theory. XXXIV National Congress of Computational and Applied Mathematics, 1, 1-7.

[7] Hansen, P. and Mladenovic, N. (2003) A Tutorial on Variable Neighborhood Search. Montreal.

http://yalma.fime.uanl.mx/~roger/work/teaching/mecbs5122/5-VNS/VNS-tutorial-G-2003-46.pdf

[8] Han, J., Sanchez, R. and Hu, X. (2005) Feature Selection Based on Relative Attribute Dependency: An Experimental Study. Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, 3641, 214-223.

http://www.springerlink.com/index/743agvmf3etlaq0p.pdf http://dx.doi.org/10.1007/11548669_23

[9] Zadeh, L.A. (1965) Fuzzy Sets. Information and Control, 8, 338-353.

http://dx.doi.org/10.1016/S0019-9958(65)90241-X

[10] Shafer, G. (1976) A Mathemathical Theory of Evidence. Princeton University Press, Princeton.

[11] Dempster, A.P. (1967) Upper and Lower Probabilities Induced by a Multivalued Mapping. The Annals of Mathematical Statistics, 38, 325-339.

http://dx.doi.org/10.1214/aoms/1177698950

[12] Zadeh, L.A. (1999) Fuzzy Sets as a Basis for a Theory of Possibility. Fuzzy Sets and Systems, 100, 9-34.

http://dx.doi.org/10.1016/S0165-0114(99)80004-9

[13] Nicoletti, M. do C. and Uchôa, J.Q. (1997) Rough Sets under the Perspective of Pertinence Function. 3rd Brazilian Symposium on Intelligent Automation, Vitoria, Brazil, 307-313, in Portuguese.

[14] Chouchoulas, A. and Shen, Q. (2001) Rough Set-Aided Keyword Reduction for Text Categorization. Applied Artificial Intelligence, 15, 843-873.

http://www.tandfonline.com/doi/abs/10.1080/088395101753210773

[15] Jensen, R. and Shen, Q. (2005) Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough-Based Approaches. IEEE Transactions on Knowledge and Data Engineering, 16, 1457-1471.

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1350758

[16] Holland, J.H. (1975) Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor.

http://www.citeulike.org/group/664/article/400721

[17] Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, Boston.

http://www.mendeley.com/research/genetic-algorithms-in-search-optimization-and-machine-learning/

[18] Kirkpatrick, S., Gelatt, C.D. and Vecchi, M.P. (1983) Optimization by Simulated Annealing. Science, 220, 671-680.

http://www.sciencemag.org/content/220/4598/671.short http://dx.doi.org/10.1126/science.220.4598.671

[19] Glover, F. and McMillan, C. (1986) The General Employee Scheduling Problem. An Integration of MS and AI. Computers Operations Research, 13, 563-573.

http://dx.doi.org/10.1016/0305-0548(86)90050-X

[20] Colorni, A., Dorigo, M. and Maniezzo, V. (1991) Distributed Optimization by Ant Colonies. European Conference on Artificial Life, Paris, France 134-142.

[21] Hansen, P. and Mladenovic, N. (1997) Variable Neighborhood Search. Computers & Operations Research, 24, 1097-1100.

http://www.sciencedirect.com/science/article/pii/S0305054897000312

http://dx.doi.org/10.1016/S0305-0548(97)00031-2

[22] Blake, C.L. and Merz, C.J. (1998) UCI Repository of Machine Learning Databases. University of California, Oakland.

http://www.ics.uci.edu/~mlearn