Intelligent Evidence-Based Management for Data Collection and Decision-Making Using Algorithmic Randomness and Active Learning

ABSTRACT

We describe here a comprehensive framework for intelligent information management (IIM) of data collection and decision-making actions for reliable and robust event processing and recognition. This is driven by algorithmic information theory (AIT), in general, and algorithmic randomness and Kolmogorov complexity (KC), in particular. The processing and recognition tasks addressed include data discrimination and multilayer open set data categorization, change detection, data aggregation, clustering and data segmentation, data selection and link analysis, data cleaning and data revision, and prediction and identification of critical states. The unifying theme throughout the paper is that of “compression entails comprehension”, which is realized using the interrelated concepts of randomness vs. regularity and Kolmogorov complexity. The constructive and all-encompassing *active learning* (AL) methodology, which mediates and supports the above theme, is context-driven and takes advantage of statistical learning, in general, and semi-supervised learning and transduction, in particular. Active learning employs *explore* and *exploit* actions characteristic of closed-loop control for evidence accumulation in order to revise its prediction models and to reduce uncertainty. The set-based similarity scores, driven by algorithmic randomness and Kolmogorov complexity, employ strangeness/typicality and p-values. We propose the application of the IIM framework to critical states prediction for complex physical systems; in particular, the prediction of cyclone genesis and intensification.
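The strangeness/typicality scores and p-values mentioned above can be sketched in a few lines. The snippet below is an illustrative instance only, not the paper's implementation: the k-nearest-neighbor strangeness measure, the function names `strangeness` and `p_value`, and the choice k = 3 are our assumptions, following the transductive confidence machine literature the paper builds on.

```python
import numpy as np

def strangeness(x, same_class, other_class, k=3):
    """k-NN strangeness: sum of distances to the k nearest same-class
    examples divided by the sum of distances to the k nearest
    other-class examples. Larger values mean the example is stranger
    with respect to its putative class."""
    d_same = np.sort(np.linalg.norm(same_class - x, axis=1))[:k].sum()
    d_other = np.sort(np.linalg.norm(other_class - x, axis=1))[:k].sum()
    return d_same / d_other

def p_value(alpha_new, alphas):
    """Conformal p-value: fraction of strangeness scores, pooling the
    new example's score with the existing ones, that are at least as
    large as the new example's score."""
    pooled = np.append(alphas, alpha_new)
    return float(np.mean(pooled >= alpha_new))
```

Under exchangeability such p-values are approximately uniform, so a small p-value flags an example as atypical for its putative class; this is the quantity that active-learning queries and change detection can be built on.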

KEYWORDS

Active Learning, Algorithmic Information Theory, Algorithmic Randomness, Evidence-Based Management, Kolmogorov Complexity, P-Values, Transduction, Critical States Prediction

Cite this paper

H. Wechsler and S. Ho, "Intelligent Evidence-Based Management for Data Collection and Decision-Making Using Algorithmic Randomness and Active Learning," *Intelligent Information Management*, Vol. 3 No. 4, 2011, pp. 142-159. doi: 10.4236/iim.2011.34018.

References

[1] S. Emmott, “Towards 2020 Science,” Microsoft Research, Cambridge, 2006.

[2] U. Neisser, “Cognition and Reality,” The American Journal of Psychology, Vol. 90, No. 3, 1977, pp. 541-543.

[3] H. Wechsler, “Computational Vision,” Academic Press, Cambridge, 1990.

[4] C. E. Shannon and W. Weaver, “The Mathematical Theory of Communication,” University of Illinois Press, Urbana-Champaign, 1949.

[5] R. J. Solomonoff, “A Formal Theory of Inductive Inference,” Information and Control, Vol. 7, 1964, pp. 1-22, 224-254.

[6] M. Li and P. Vitanyi, “An Introduction to Kolmogorov Complexity and Its Applications,” 3rd Edition, Springer Verlag, Berlin, 2008. doi:10.1007/978-0-387-49820-1

[7] T. M. Cover and J. A. Thomas, “Elements of Information Theory,” 2nd Edition, Wiley, New York, 2006.

[8] V. Vapnik, “Statistical Learning Theory,” Springer, Dordrecht, 1998.

[9] V. Vovk, A. Gammerman and G. Shafer, “Algorithmic Learning in a Random World,” Springer, Dordrecht, 2005.

[10] O. Chapelle, B. Scholkopf and A. Zien (Eds.), “Semi-Supervised Learning,” Massachusetts Institute of Technology Press, Cambridge, 2006.

[11] B. Settles, “Active Learning Literature Survey,” University of Wisconsin, Madison, 2010.

[12] R. Galliers and D. Leidner (Eds.), “Strategic Information Management,” 4th Edition, Routledge, Cornwall, 2009, pp. 1-2.

[13] G. Schreiber et al., “Knowledge Engineering and Management,” Massachusetts Institute of Technology Press, Cambridge, 2000.

[14] E. M. Awad and H. M. Ghaziri, “Knowledge Management,” Prentice Hall, Upper Saddle River, 2004.

[15] A. Doucet and A. Johansen, “A Tutorial on Particle Filtering and Smoothing,” Technical Report, Department of Statistics, University of British Columbia, 2008. http://www.cs.ubc.ca/%7Earnaud/doucet_johansen_tutorialPF.pdf

[16] A. Ganek and T. Corbi, “The Dawning of the Autonomic Computing Era,” IBM Systems Journal, Vol. 42, No. 1, 2003, pp. 5-18. doi:10.1147/sj.421.0005

[17] A. Darwiche, “Modeling and Reasoning with Bayesian Networks,” Cambridge University Press, Cambridge, 2009.

[18] J. Pearl, “Causality,” 2nd Edition, Cambridge University Press, Cambridge, 2009.

[19] N. Schmid and H. Wechsler, “Information Theoretical and Statistical Learning Theory Characterizations of Biometric Recognition Systems,” SPIE Electronic Imaging: Media Forensics and Security, San Jose, 2010.

[20] H. Simon, “The Sciences of the Artificial,” Massachusetts Institute of Technology Press, Cambridge, 1982.

[21] S. Nayar and T. Poggio (Eds.), “Early Visual Learning,” Oxford University Press, Oxford, 1995.

[22] T. Winograd and F. Flores, “Understanding Computers and Cognition,” Addison Wesley, Boston, 1988.

[23] H. B. Barlow, “Unsupervised Learning,” Neural Computation, Vol. 1, No. 3, 1989, pp. 295-311. doi:10.1162/neco.1989.1.3.295

[24] Y. D. Rubinstein and T. Hastie, “Discriminative vs. Informative Learning,” Knowledge and Data Discovery, 1997, pp. 49-53.

[25] T. Jebara, “Discriminative, Generative and Imitative Learning,” MIT Press, Cambridge, 2002.

[26] C. H. Bennett, P. Gacs, M. Li, P. M. B. Vitanyi and W. H. Zurek, “Information Distance,” IEEE Transactions on Information Theory, Vol. 44, No. 4, 1998, pp. 1407-1423. doi:10.1109/18.681318

[27] K. Proedrou, I. Nouretdinov, V. Vovk and A. Gammerman, “Transductive Confidence Machines for Pattern Recognition,” Royal Holloway, University of London, 2001.

[28] M. Kukar, “Quality Assessment of Individual Classifications in Machine Learning and Data Mining,” Knowledge and Information Systems, Vol. 9, No. 3, 2006, pp. 364-384. doi:10.1007/s10115-005-0203-z

[29] M. Kukar and I. Kononenko, “Reliable Classifications with Machine Learning,” Proceedings of the European Conference on Machine Learning, 2002, pp. 219-231.

[30] R. Gilad-Bachrach, A. Navot and N. Tishby, “Margin Based Feature Selection - Theory and Algorithms,” International Conference on Machine Learning, Banff, Canada, 2004.

[31] T. M. Cover and P. Hart, “Nearest Neighbor Pattern Classification,” IEEE Transactions on Information Theory, Vol. IT-13, 1967, pp. 21-27.

[32] S. S. Ho and H. Wechsler, “Query by Transduction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 9, 2008, pp. 1557-1571. doi:10.1109/TPAMI.2007.70811

[33] T. Melluish, C. Saunders, I. Nouretdinov and V. Vovk, “The Typicalness Framework: A Comparison with the Bayesian Approach,” Royal Holloway, University of London, 2001.

[34] A. Gammerman, V. Vovk and V. Vapnik, “Learning by Transduction,” Elsevier, New York, 1998, pp. 148-155.

[35] T. Poggio, R. Rifkin, S. Mukherjee and P. Niyogi, “General Conditions for Predictivity of Learning Theory,” Nature, Vol. 428, 2004, pp. 419-422. doi:10.1038/nature02341

[36] T. Poggio and S. Smale, “The Mathematics of Learning: Dealing with Data,” Notices of the American Mathematical Society, 2003, pp. 537-544.

[37] F. Li and H. Wechsler, “Open Set Face Recognition Using Transduction,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 11, 2005, pp. 1686-1698. doi:10.1109/TPAMI.2005.224

[38] J. Hamm and D. L. Lee, “Grassmann Discriminant Analysis: A Unifying View of Subspace-Based Learning,” 25th International Conference on Machine Learning, Helsinki, 2008.

[39] S. S. Ho and H. Wechsler, “A Martingale Framework for Detecting Changes in the Data Generating Model in Data Streams,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 12, 2010, pp. 2113-2127. doi:10.1109/TPAMI.2010.48

[40] P. Viola and M. Jones, “Rapid Object Detection Using a Boosted Cascade of Simple Features,” Conference on Computer Vision and Pattern Recognition, Kauai, 2001.

[41] Y. Freund and R. E. Schapire, “Experiments with a New Boosting Algorithm,” 13th International Conference on Machine Learning, Bari, 1996, pp. 148-156.

[42] J. H. Friedman, T. Hastie and R. Tibshirani, “Additive Logistic Regression: A Statistical View of Boosting,” Annals of Statistics, Vol. 28, 2000, pp. 337-407. doi:10.1214/aos/1016218223

[43] V. Vapnik, “The Nature of Statistical Learning Theory,” 2nd Edition, Springer, Berlin, 2000.

[44] F. Li and H. Wechsler, “Face Authentication Using Recognition-by-Parts, Boosting and Transduction,” International Journal of Artificial Intelligence and Pattern Recognition, Vol. 23, No. 3, 2009, pp. 545-573. doi:10.1142/S0218001409007193

[45] I. S. Dhillon, Y. Guan and B. Kulis, “Kernel K-Means, Spectral Clustering and Normalized Cuts,” Proceedings of the Conference on Knowledge and Data Discovery, Seattle, Washington, 2004.

[46] M. Filippone, F. Camastra, F. Masulli and S. Rovetta, “A Survey of Kernel and Spectral Methods for Clustering,” Pattern Recognition, Vol. 41, No. 1, 2008, pp. 176-190. doi:10.1016/j.patcog.2007.05.018

[47] A. Y. Ng, M. I. Jordan and Y. Weiss, “On Spectral Clustering: Analysis and an Algorithm, NIPS 14,” Massachusetts Institute of Technology Press, Cambridge, 2002.

[48] X. Zhu, Z. Ghahramani and J. Lafferty, “Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions,” Proceedings of the 20th International Conference on Machine Learning, Washington DC, 2003.

[49] D. Pyle, “Data Preparation for Data Mining,” Morgan Kaufmann, Waltham, Massachusetts, 1999.

[50] P. J. Huber, “Robust Statistics,” Wiley, New York, 2004.

[51] J. Weston, R. Collobert, F. Sinz, L. Bottou and V. Vapnik, “Inference with the Universum,” Proceedings of the 23rd International Conference on Machine Learning, New York, 2006, pp. 1009-1016.

[52] V. Vapnik, “Estimation of Dependences Based on Empirical Data,” 2nd Edition, Springer Verlag, Berlin, 2006.

[53] C. L. Vale, J. F. Tierney and L. A. Stewart, “Effects of Adjusting for Censoring on Meta-Analysis of Time-to-Event Outcomes,” International Journal of Epidemiology, Vol. 31, 2002, pp. 107-111. doi:10.1093/ije/31.1.107

[54] T. Kohonen, “Self-Organizing Maps,” 2nd Edition, Springer Verlag, Berlin, 1996.

[55] S.-S. Ho and A. Talukder, “Automated Cyclone Discovery and Tracking Using Knowledge Sharing in Multiple Heterogeneous Satellite Data,” Proceedings of KDD, 2008, pp. 928-936.

[56] A. Panangadan, S.-S. Ho and A. Talukder, “Cyclone Tracking Using Multiple Satellite Image Sources,” Proceedings of GIS, 2009, pp. 428-431. doi:10.1145/1653771.1653836

[57] H. Ding, G. Trajcevski, P. Scheuermann, X. Wang and E. J. Keogh, “Querying and Mining of Time Series Data: Experimental Comparison of Representations and Distance Measures,” Proceedings of the VLDB Endowment, Vol. 1, No. 2, 2008, pp. 1542-1552.

[58] D. J. Berndt and J. Clifford, “Using Dynamic Time Warping to Find Patterns in Time Series,” KDD Workshop, 1994, pp. 359-370.

[59] M. Vlachos, D. Gunopulos and G. Kollios, “Discovering Similar Multidimensional Trajectories,” Proceedings of ICDE, 2002, pp. 673-684.

[60] S.-S. Ho, W. Tang and W. T. Liu, “Tropical Cyclone Event Sequence Similarity Search Via Dimensionality Reduction and Metric Learning,” Proceedings of KDD, 2010, pp. 135-144.

[61] D. J. Galas, M. Nykter, G. W. Carter, N. D. Price and I. Shmulevich, “Biological Information as Set-Based Complexity,” IEEE Transactions on Information Theory, Vol. 56, No. 2, 2010, pp. 667-677. doi:10.1109/TIT.2009.2037046

[62] M. Gell-Mann, “The Quark and the Jaguar: Adventures in the Simple and the Complex,” Freeman, New York, 1994, p. 392.

[63] B. J. Balas and P. Sinha, “Region-Based Representations for Face Recognition,” ACM Transactions on Applied Perception, Vol. 3, No. 4, 2006, pp. 354-375. doi:10.1145/1190036.1190038

[64] F. Emmert-Streib and M. Dehmer (Eds.), “Information Theory and Statistical Learning,” Springer, Berlin, 2009, pp. 1-3.

[65] J. Weston, F. Perez-Cruz, O. Bousquet, O. Chapelle, A. Elisseeff and B. Scholkopf, “Feature Selection and Transduction for Prediction of Molecular Bioactivity for Drug Design,” Bioinformatics, Vol. 19, No. 6, 2003, pp. 764-771.

[66] N. Basit and H. Wechsler, “Computational Mutagenesis and Protein Function Prediction Using Computational Geometry,” Journal of Biomedical Science and Engineering, 2011.

[67] G. Kotonya and I. Sommerville, “Requirements Engineering,” Wiley, New York, 1998.
