Learn More about Your Data: A Symbolic Regression Knowledge Representation Framework

ABSTRACT

In this paper, we propose a flexible knowledge representation framework that uses symbolic regression to learn from data and mathematical expressions to represent the captured knowledge. In this approach, learning algorithms generate new insights that can be added to domain knowledge bases, which in turn support symbolic regression. Symbolic regression generalizes well-known regression analysis to perform supervised classification. The approach aims to produce a learning model that best separates the class members of a labeled training set. The class boundaries are given by a separation surface, represented as a level set of a model function; the separation boundary is defined by the corresponding equation. In our symbolic approach, the learned knowledge model is represented by mathematical formulas and is composed of an optimal subset of expressions drawn from a given superset. We show that this property gives human experts the opportunity to gain additional insights into the application domain. Furthermore, the representation in terms of mathematical formulas (e.g., the analytical model and its first and second derivatives) adds value to the classifier and makes it possible to answer questions that sub-symbolic classifier approaches cannot. The symbolic representation of the models enables interpretation by human experts. Existing expert knowledge can be added to the developed knowledge representation framework or used as constraints. Additionally, the knowledge acquisition process can be repeated several times; in each step, new insights from the search process can be added to the knowledge base to improve the overall performance of the proposed learning algorithms.
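The abstract describes classification by the sign of a learned model function, whose zero level set forms the class boundary. The sketch below is a minimal illustration of that idea, not the authors' actual method: the expression f is a hypothetical stand-in for a formula a symbolic-regression run might return. Because the model is a closed-form expression, its derivatives are also available in closed form, which is the "additional value" the abstract refers to.

```python
# Hypothetical learned model function (an expression a symbolic-regression
# run might produce); its zero level set f(x, y) = 0 is the class boundary.
def f(x, y):
    return x**2 + y**2 - 1.0

# Classification: the sign of the model function assigns the class label.
def classify(x, y):
    return 1 if f(x, y) > 0 else -1

# Since the model is symbolic, its gradient is exact and closed-form as
# well; here it gives the normal direction of the separation boundary.
def grad_f(x, y):
    return (2.0 * x, 2.0 * y)

print(classify(0.2, 0.3))   # point inside the unit circle -> -1
print(classify(2.0, 0.0))   # point outside -> 1
print(grad_f(1.0, 0.0))     # boundary normal at (1, 0)
```

A sub-symbolic classifier (e.g., a neural network) would yield the same labels, but not a human-readable boundary equation or exact derivatives.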

KEYWORDS

Classification; Symbolic Regression; Knowledge Management; Data Mining; Pattern Recognition

Cite this paper

I. Schwab and N. Link, "Learn More about Your Data: A Symbolic Regression Knowledge Representation Framework," *International Journal of Intelligence Science*, Vol. 2, No. 4, 2012, pp. 135-142. doi: 10.4236/ijis.2012.224018.

References

[1] R. O. Duda, P. E. Hart and D. G. Stork, “Pattern Classification,” 2nd Edition, Wiley-Interscience, New York, 2000.

[2] R. E. Steuer, “Multiple Criteria Optimization: Theory, Computations, and Application,” John Wiley & Sons, New York, 1986.

[3] J. K. Kishore, L. M. Patnaik, V. Mani and V. K. Agrawal, “Application of Genetic Programming for Multicategory Pattern Classification,” IEEE Transactions on Evolutionary Computation, Vol. 4, No. 3, 2000, pp. 242-258.

[4] D. Robinson, “Implications of Neural Networks for How We Think about Brain Function,” Behavioral and Brain Sciences, Vol. 15, 1992, pp. 644-655.

[5] J. H. Holland, K. J. Holyoak, R. E. Nisbett and P. R. Thagard, “Induction: Processes of Inference, Learning, and Discovery,” A Bradford Book, Cambridge, MA, 1989.

[6] P. Smolensky, “On the Proper Treatment of Connectionism,” Behavioral and Brain Sciences, Vol. 11, 1988, pp. 1-74.

[7] T. Erfani and S. V. Utyuzhnikov, “Directed Search Domain: A Method for Even Generation of Pareto Frontier in Multiobjective Optimization,” Journal of Engineering Optimization, Vol. 43, No. 5, 2011, pp. 1-18.

[8] D. A. Freedman, “Statistical Models: Theory and Practice,” Cambridge University Press, Cambridge, 2005.

[9] M. O'Neill and C. Ryan, “Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language,” Kluwer Academic Publishers, Dordrecht, Netherlands, 2003.

[10] J. R. Koza, “Genetic Programming: On the Programming of Computers by Means of Natural Selection,” MIT Press, Cambridge, MA, 1992.

[11] I. Schwab and N. Link, “Reusable Knowledge from Symbolic Regression Classification,” International Conference on Genetic and Evolutionary Computing (ICGEC 2011), 2011.

[12] W. McCulloch and W. Pitts, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” Bulletin of Mathematical Biophysics, Vol. 5, 1943, pp. 115-133.

[13] M. Kotanchek, G. Smits and A. Kordon, “Industrial Strength Genetic Programming,” In: R. Riolo and B. Worzel, Eds., Genetic Programming Theory and Practice, Kluwer, 2003.

[14] G. Smits and M. Kotanchek, “Pareto-Front Exploitation in Symbolic Regression,” In: R. Riolo and B. Worzel, Eds., Genetic Programming Theory and Practice, Kluwer, 2004.

[15] M. Schmidt and H. Lipson, “Discovering a Domain Alphabet,” Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2009), 2009, pp. 1083-1090.

[16] T. Hastie, R. Tibshirani and J. Friedman, “The Elements of Statistical Learning: Data Mining, Inference, and Prediction,” Springer-Verlag, New York, 2001.

[17] K. Lang and M. Witbrock, “Learning to Tell Two Spirals Apart,” Proceedings of the 1988 Connectionist Models Summer School, Morgan Kaufmann, San Mateo, CA, 1989, pp. 52-59.

[18] R. Setiono, “A Neural Network Construction Algorithm which Maximizes the Likelihood Function,” Connection Science, Vol. 7, No. 2, 1995, pp. 147-166.

[19] S. J. Haberman, “Generalized Residuals for Log-Linear Models,” Proceedings of the 9th International Biometrics Conference, Boston, 1976, pp. 104-122.

[20] J. M. Landwehr, D. Pregibon and A. C. Shoemaker, “Graphical Models for Assessing Logistic Regression Models,” Journal of the American Statistical Association, Vol. 79, 1984, pp. 61-83.

[21] W. D. Lo, “Logistic Regression Trees,” Ph.D. Dissertation, Department of Statistics, University of Wisconsin, Madison, WI, 1993.

[22] A. Frank and A. Asuncion, “UCI Machine Learning Repository,” University of California, School of Information and Computer Science, Irvine, CA, 2010. http://archive.ics.uci.edu/ml
