Intelligent Biometric Information Management

Author(s)
Harry Wechsler

ABSTRACT

We advance here a novel methodology for robust intelligent biometric information management with inferences and predictions made using randomness and complexity concepts. Intelligence refers to learning, adaptation, and functionality, and robustness refers to the ability to handle incomplete and/or corrupt adversarial information, on one side, and image and/or device variability, on the other side. The proposed methodology is model-free and non-parametric. It draws support from discriminative methods using likelihood ratios to link at the conceptual level biometrics and forensics. It further links, at the modeling and implementation level, the Bayesian framework, statistical learning theory (SLT) using transduction and semi-supervised learning, and Information Theory (IT) using mutual information. The key concepts supporting the proposed methodology are a) local estimation to facilitate learning and prediction using both labeled and unlabeled data; b) similarity metrics using regularity of patterns, randomness deficiency, and Kolmogorov complexity (similar to MDL) using strangeness/typicality and ranking p-values; and c) the Cover-Hart theorem on the asymptotic performance of k-nearest neighbors approaching the optimal Bayes error. Several topics on biometric inference and prediction related to 1) multi-level and multi-layer data fusion including quality and multi-modal biometrics; 2) score normalization and revision theory; 3) face selection and tracking; and 4) identity management, are described here using an integrated approach that includes transduction and boosting for ranking and sequential fusion/aggregation, respectively, on one side, and active learning and change/outlier/intrusion detection realized using information gain and martingale, respectively, on the other side. The methodology proposed can be mapped to additional types of information beyond biometrics.
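To make key concept b) more concrete, the following is a minimal sketch of how strangeness and ranking p-values might be computed with nearest neighbors. It assumes the k-NN strangeness commonly used in the transduction literature (summed distances to the k nearest same-label samples divided by summed distances to the k nearest other-label samples); the abstract does not spell out a formula, so this definition, the function names `strangeness` and `p_value`, and the toy data are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np


def strangeness(x, label, X, y, k=1):
    """k-NN strangeness of x under a putative label (assumed definition).

    Ratio of the summed distances to the k nearest samples sharing the
    label over the summed distances to the k nearest samples with any
    other label. Assumes every class has at least k samples in X.
    """
    d = np.linalg.norm(X - x, axis=1)
    same = np.sort(d[y == label])[:k]
    other = np.sort(d[y != label])[:k]
    return same.sum() / other.sum()


def p_value(x_new, label, X, y, k=1):
    """Transductive p-value: how typical x_new is if given this label."""
    n = len(X)
    alphas = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Score sample i against the other training samples plus the
        # new sample carrying its putative label.
        Xi = np.vstack([X[keep], x_new])
        yi = np.append(y[keep], label)
        alphas[i] = strangeness(X[i], y[i], Xi, yi, k)
    alpha_new = strangeness(x_new, label, X, y, k)
    # Fraction of samples (new one included) at least as strange.
    return (np.sum(alphas >= alpha_new) + 1) / (n + 1)


# Hypothetical toy data: two well-separated identities in a 2-D feature space.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
probe = np.array([0.5, 0.5])

p0 = p_value(probe, 0, X_train, y_train)
p1 = p_value(probe, 1, X_train, y_train)
print(p0, p1)
```

Under these assumptions, the putative label with the larger p-value is the more typical assignment for the probe, and a probe whose p-values are small for every label is a candidate for rejection as an outlier or impostor.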

KEYWORDS

Authentication, Biometrics, Boosting, Change Detection, Complexity, Cross-Matching, Data Fusion, Ensemble Methods, Forensics, Identity Management, Imposters, Inference, Intelligent Information Management, Margin Gain, MDL, Multi-Sensory Integration, Outlier Detection, P-Values, Quality, Randomness, Ranking, Score Normalization, Semi-Supervised Learning, Spectral Clustering, Strangeness, Surveillance, Tracking, Typicality, Transduction

Cite this paper

H. Wechsler, "Intelligent Biometric Information Management," *Intelligent Information Management*, Vol. 2, No. 9, 2010, pp. 499-511. doi: 10.4236/iim.2010.29060.
