Functional Analysis of Chemometric Data

ABSTRACT

This paper reviews calibration and classification methods for functional data in the context of chemometric applications. In chemometrics, it is usual to measure certain parameters in terms of a set of spectrometric curves that are observed at a finite set of points (functional data). Although the predictor variable is clearly functional, the problem is usually solved with multivariate calibration techniques that treat it as a finite set of variables associated with the observed points (wavelengths or times). These explanatory variables are highly correlated, however, so it is more informative to first reconstruct the true functional form of the predictor curves. Although several articles on the implementation of functional data analysis techniques in chemometrics have been published, their power to solve real problems is not yet well known. For this reason, this paper reviews the extension of multivariate calibration techniques (linear regression, principal component regression and partial least squares) and classification methods (linear discriminant analysis and logistic regression) to the functional domain, together with some relevant chemometric applications.
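To make the idea of reconstructing the functional form before calibrating more concrete, the following is a minimal sketch and not the implementation reviewed in the paper. It fits each discretely observed curve to a small basis by least squares and then runs a principal component regression on the basis coefficients. The data are simulated, and the function names (`fourier_basis`, `fit_basis_coefficients`, `functional_pcr`, `predict`) are hypothetical. A Fourier basis is assumed only because it is orthonormal on [0, 1], so that functional PCA reduces to ordinary PCA of the coefficient matrix; with a non-orthonormal basis such as B-splines, the PCA step would also need the Gram matrix of the basis.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate an orthonormal Fourier basis on points t in [0, 1)."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2.0) * np.sin(2.0 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2.0) * np.cos(2.0 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)

def fit_basis_coefficients(curves, basis):
    """Least-squares basis coefficients, one row per observed curve."""
    coef, *_ = np.linalg.lstsq(basis, curves.T, rcond=None)
    return coef.T

def functional_pcr(coef, y, n_components):
    """PCR on basis coefficients (assumes an orthonormal basis)."""
    coef_mean = coef.mean(axis=0)
    centered = coef - coef_mean
    # Rows of vt are principal component weights in basis coordinates.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T
    scores = centered @ loadings
    design = np.column_stack([np.ones(len(y)), scores])
    gamma, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef_mean, loadings, gamma

def predict(coef_new, model):
    """Predict the scalar response from basis coefficients of new curves."""
    coef_mean, loadings, gamma = model
    scores = (coef_new - coef_mean) @ loadings
    return gamma[0] + scores @ gamma[1:]

# Simulated example (assumption: 60 noisy curves observed at 100 points).
rng = np.random.default_rng(0)
n_curves, n_points, n_basis = 60, 100, 5
t = np.arange(n_points) / n_points
basis = fourier_basis(t, n_basis)

true_coef = rng.normal(size=(n_curves, n_basis)) * np.array([1.0, 2.0, 2.0, 1.0, 0.5])
curves = true_coef @ basis.T + 0.05 * rng.normal(size=(n_curves, n_points))

# Response is the integral of x(t) * beta(t), written in basis coordinates.
beta_coef = np.array([0.0, 1.0, -0.5, 0.3, 0.0])
y = true_coef @ beta_coef + 0.05 * rng.normal(size=n_curves)

coef = fit_basis_coefficients(curves, basis)
model = functional_pcr(coef, y, n_components=4)
y_hat = predict(coef, model)
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

Working on the five basis coefficients instead of the 100 highly correlated sampled values is what removes the collinearity mentioned in the abstract. The number of basis functions and of retained components are illustrative choices here and would normally be selected by cross-validation.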

Cite this paper

A. Aguilera, M. Escabias, M. Valderrama and M. Aguilera-Morillo, "Functional Analysis of Chemometric Data," *Open Journal of Statistics*, Vol. 3 No. 5, 2013, pp. 334-343. doi: 10.4236/ojs.2013.35039.
