Coffee is one of the most appreciated and consumed crops in the world, which creates a risk of finding arabica coffee (an expensive variety) adulterated with canephora coffee, also known as robusta (a cheaper variety). Currently, the most widespread method for coffee classification is caffeine detection by High-Performance Liquid Chromatography (HPLC), which requires the sample to undergo chemical pretreatment and a time-consuming preliminary analysis.
However, almost any kind of sample can be analyzed easily and quickly by Laser-Induced Breakdown Spectroscopy (LIBS), with no need for chemical pretreatment; some successful instances are presented in      . In LIBS, the sample is struck with laser pulses, generating a plasma composed of excited species, mostly atoms, whose emissions provide information about the elemental composition of the sample  . A LIBS installation is typically composed of: a laser, used to excite the sample and produce a plasma; a focusing lens; an optical fiber, used to collect the optical signal emitted by the plasma; a spectrometer, which receives and diffracts light from the plasma; a photodiode and a delay device that allow the opening of the spectrometer window to be controlled; and a computer with which the spectra are visualized and stored. LIBS has, however, a major disadvantage: spectra of the same sample may differ significantly from one another because of experimental factors that are difficult to control, such as laser energy. To mitigate this problem, it is advisable to normalize the spectra.
Studies on coffee using LIBS have already been reported  -  , and the study referenced in  is the only one that mentions the use of the technique to differentiate the two varieties of coffee. However, none of them reports the use of calibration curves to determine the degree of adulteration of arabica coffee with robusta coffee, or the use of an Artificial Neural Network (ANN) to differentiate the spectra of arabica and robusta coffee.
An Artificial Neural Network (ANN) is an artificial information processing system whose operation is based on that of the human brain. The ANN is composed of artificial neurons that emulate the operation of a biological neuron. ANNs are then able to learn and improve their performance, to generalize, abstract and classify information, and to recognize patterns   .
The experimentation reported in this paper is divided into two parts. The first is the construction of calibration curves for detecting the degree of simulated adulteration in coffee, using green coffee pills with different concentrations of the arabica and robusta varieties. The second is the training of an ANN with spectra of arabica and robusta roasted coffee beans, followed by a test to determine whether an unknown group can be classified correctly.
The purpose of this research was to demonstrate that LIBS can be applied to solve various problems related to coffee authentication.
2. Experimental Design and Samples
2.1. The LIBS Installation
The LIBS installation used was composed of the following elements:
・ A multipulse Nd:YAG laser, which emits shots composed of a train of six pulses at a wavelength of 1064 nm, with an average energy of 450 mJ.
・ A focus lens with a focal length of 5 cm.
・ An optical fiber located 5 mm from the sample at an inclination of 41.5°.
・ A spectrometer, operating in the range of 200 to 800 nm.
・ A photodiode and a delay device that control the opening of the spectrometer so that it occurs 2 μs after the impact of the laser with the sample.
・ A computer for visualizing and storing spectra.
2.2. The Samples
For the first phase of the experimentation reported in this paper, two samples of green coffee beans, one arabica and one robusta, were used. The green coffee of the arabica variety came from the state of Hidalgo, and the robusta variety from the state of Veracruz, both located in the Mexican Republic.
For the second stage of the reported experimentation, three types of roasted coffee beans, medium-high grade, were used. The first two samples were arabica coffee, one from the state of Chiapas, Mexico, and the other from Colombia. The sample of robusta coffee came from the state of Veracruz, Mexico. The samples were stored at room temperature and under hermetic conditions until use. The beans were approximately 1 cm long, 0.8 cm wide and 0.3 cm high. Samples were used without any pretreatment.
2.3. The Artificial Neural Network (Multilayer Perceptron)
The Multilayer Perceptron was configured in MATLAB R2014a. This model was chosen for its relative simplicity of configuration and use compared to other Artificial Neural Network models.
The Multilayer Perceptron, and the secondary functions that were programmed to optimize the whole process, were configured to feed on a total of one hundred spectra. The secondary functions were programmed to save time in the treatment of the worked spectra, before they were fed to the Multilayer Perceptron.
Once the spectra are introduced, they are normalized. Normalization transforms all the spectra to the same scale, in such a way that the signal of greatest intensity has a value of one and the signal of lowest intensity has a value of zero. Normalization reduces the problems derived from the high sensitivity of LIBS, which causes spectra of the same sample to differ greatly from one another because of experimental factors that are difficult to control.
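The scaling described above corresponds to a min-max normalization. The following is a minimal Python sketch of that idea, not the MATLAB routine used in this work; the function name and the sample intensities are hypothetical.

```python
import numpy as np

def min_max_normalize(spectrum):
    """Rescale one spectrum so its lowest intensity is 0 and its highest is 1."""
    spectrum = np.asarray(spectrum, dtype=float)
    lo, hi = spectrum.min(), spectrum.max()
    if hi == lo:
        # A flat spectrum carries no contrast; return zeros to avoid dividing by zero.
        return np.zeros_like(spectrum)
    return (spectrum - lo) / (hi - lo)

# Hypothetical raw intensities (arbitrary units), for illustration only.
raw = np.array([120.0, 300.0, 1500.0, 210.0])
normalized = min_max_normalize(raw)
```

Applied to every spectrum before averaging, this places all spectra on a common [0, 1] scale regardless of shot-to-shot variations in overall intensity.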
Subsequently, the normalized spectra of each group (arabica and robusta) were averaged to obtain an average spectrum characteristic of each variety. Then, by subtracting the average robusta spectrum from the average arabica spectrum, a difference spectrum is obtained. In the difference spectrum, signals with positive values correspond to the elements with greater presence in the arabica roasted coffee, and signals with negative values correspond to the elements with greater presence in the robusta roasted coffee. The difference spectrum therefore shows which signals vary most in intensity between the two varieties. A spectrum taken under the experimental conditions reported in this paper has a total of 3648 wavelengths, most of which correspond to the "background", i.e. they do not contain important information for the identification and differentiation of the spectra. That is why it is important to recognize the wavelengths that correspond to the lines and bands that contribute most to correctly separating the spectra of both varieties.
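As an illustration of the averaging and subtraction just described, the following Python sketch computes a difference spectrum from two groups of normalized spectra. The function name and the three-channel toy arrays are hypothetical; real spectra in this work have 3648 channels.

```python
import numpy as np

def difference_spectrum(arabica, robusta):
    """Mean arabica spectrum minus mean robusta spectrum.

    Each argument is a 2-D array with one normalized spectrum per row.
    Positive values mark channels stronger in arabica, negative values
    mark channels stronger in robusta.
    """
    arabica = np.asarray(arabica, dtype=float)
    robusta = np.asarray(robusta, dtype=float)
    return arabica.mean(axis=0) - robusta.mean(axis=0)

# Toy data: two spectra per variety, three channels each (illustrative only).
arabica = np.array([[0.0, 0.5, 1.0],
                    [0.0, 0.7, 1.0]])
robusta = np.array([[0.0, 1.0, 0.4],
                    [0.2, 1.0, 0.6]])
diff = difference_spectrum(arabica, robusta)
```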
However, even in the difference spectrum, most of the points lie near the center, i.e. near the zero value. It is necessary, then, to establish a selection criterion to decrease the number of points to be taken into account when working with the Multilayer Perceptron. The proposed criterion is as follows: a set number of standard deviations is established from the mean of the difference spectrum, and only those points whose signals lie beyond this number of standard deviations, in either direction, are taken into account. For the work reported in this paper, different numbers of standard deviations (0.5, 1, 1.5 and 2) were tested to determine which gave the best results. The Multilayer Perceptron was then instructed to use the number of standard deviations that allowed the best separation of the classes at the training stage. The Multilayer Perceptron was also configured so that the number of neurons in the hidden layer could be varied.
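The selection criterion can be sketched in Python as follows. This is an illustrative reading of the criterion, assuming the standard deviation is that of the difference spectrum values themselves (population form); the function name and toy values are hypothetical.

```python
import numpy as np

def select_channels(diff, k):
    """Indices of channels lying more than k standard deviations from the mean."""
    diff = np.asarray(diff, dtype=float)
    mean = diff.mean()
    std = diff.std()  # population standard deviation (an assumption)
    return np.flatnonzero(np.abs(diff - mean) > k * std)

# Toy difference spectrum: mostly near zero, with two pronounced channels.
diff = np.array([0.0, 0.01, -0.02, 0.5, 0.0, -0.4, 0.01, 0.0])

for k in (0.5, 1.0, 1.5, 2.0):
    kept = select_channels(diff, k)
    print(k, kept.tolist())
```

A smaller k keeps more channels; in the work reported here the number of standard deviations was chosen by comparing classification results at the training stage.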
The algorithm used to carry out the iterations is the Scaled Conjugate Gradient, since this method is useful for solving systems with a large number of elements.
Finally, the results obtained after the classification of the two classes include a Confusion Matrix, which shows the number of elements of each class that were correctly assigned to their group, and a Performance Chart, which shows the evolution of entropy in the system.
3.1. Parameter Setting
The distance between the sample and the optical fiber input, and the delay time between the impact of the laser on the sample and the opening of the spectrometer, were evaluated with the aim of setting the best operating conditions for taking spectra. Three distances were tested: 5 mm, 1 cm and 1.5 cm. Eight delay times were also tested: 2, 3, 4, 5, 6, 7, 8 and 9 μs. The best operating conditions were those that produced the most intense spectra, in which spectral lines are easier to identify. The parameter setting was performed with roasted coffee beans from Chiapas, Mexico.
3.2. On the Calibration Curves
3.2.1. On the Green Coffee Pills
The green coffee beans were ground in a commercial coffee grinder, in portions of approximately 5 grams in 5-minute cycles, until the particles reached a diameter of about 0.5 mm. The equipment was washed and dried before each use to avoid contamination. In order to simulate the adulteration of arabica green coffee with robusta green coffee, four blends of 1 gram each were prepared, with robusta contents of 20%, 40%, 60% and 80% (0.2, 0.4, 0.6 and 0.8 grams of robusta). The mixtures were homogenized in a commercial coffee grinder under stable conditions in 5-minute cycles. Each mixture was placed in a cylindrical mold and then compressed in a 20-ton hydraulic press. The compression cycles were twenty minutes each. Pills made entirely of arabica green coffee and entirely of robusta green coffee were also formed.
3.2.2. On the Spectra
All spectra were taken on the same day to guarantee the same conditions of temperature, atmospheric pressure and humidity. Three spectra were taken per pill, rotating them to ensure that the shots hit different points.
The spectra belonging to each tablet were normalized based on the highest intensity peak, so that all the signals had intensities in the interval [0,1]. Subsequently, normalized spectra were averaged to obtain a single representative spectrum of each sample. Lines were identified with the USA-ARL  and NIST  databases.
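A minimal Python sketch of the peak normalization and averaging described above is given below. It assumes non-negative intensities, so that dividing by the highest peak yields values in [0, 1]; the function name and the three-channel toy shots are hypothetical.

```python
import numpy as np

def representative_spectrum(shots):
    """Normalize each spectrum by its own highest peak, then average them.

    `shots` is a 2-D array with one spectrum per row (for example, the three
    spectra taken per pill). Intensities are assumed to be non-negative.
    """
    shots = np.asarray(shots, dtype=float)
    peak = shots.max(axis=1, keepdims=True)  # highest intensity of each spectrum
    return (shots / peak).mean(axis=0)

# Toy data: three shots on one pill, three channels each (illustrative only).
shots = np.array([[10.0, 50.0, 100.0],
                  [20.0, 80.0, 200.0],
                  [15.0, 90.0, 150.0]])
pill_spectrum = representative_spectrum(shots)
```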
3.2.3. On the Calibration Curves
The relative intensities of each identified line were taken, and intensity ratios were obtained by dividing each of these intensities by each of the others. This information was used to form calibration curves that would reveal increasing or decreasing trends as the degree of adulteration of arabica green coffee with robusta green coffee increases. A linear fit was applied to each curve in order to determine, by means of the coefficient of determination R², which curves are useful for determining the degree of adulteration of arabica green coffee with robusta green coffee.
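The linear fit and the coefficient of determination can be sketched in Python as follows. The intensity values below are invented placeholders, not measurements from this work, and the function name is hypothetical.

```python
import numpy as np

def linear_fit_r2(x, y):
    """Least-squares straight line through (x, y) and its R² value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)   # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    return slope, intercept, 1.0 - ss_res / ss_tot

# Robusta content of the six pills (0% and 100% are the pure pills).
robusta_pct = np.array([0, 20, 40, 60, 80, 100])
# Placeholder relative intensities of one line (NOT measured data).
intensity = np.array([0.90, 0.81, 0.74, 0.62, 0.55, 0.44])

slope, intercept, r2 = linear_fit_r2(robusta_pct, intensity)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, R2={r2:.4f}")
```

Under the criterion used in this paper, a curve would be retained when its R² exceeds 0.9000.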
3.3. On the Multilayer Perceptron
3.3.1. On the Spectra
The spectra were taken in the area of the internal hard endosperm of different coffee beans. This area was chosen for the analysis because it is the most uniform part of the coffee bean, as shown in Figure 1. The beans were constantly moved so that the laser shots hit different points, while staying within the same area to ensure that the spectra obtained were as similar as possible to each other.
Figure 1. Parts of the coffee bean.
Fifty spectra were taken from the sample of arabica coffee from Chiapas, fifty from the robusta coffee from Veracruz, and one hundred from the arabica coffee from Colombia. Spectra were not treated individually prior to their entry into the ANN.
3.3.2. Training the Multilayer Perceptron
Fifty spectra of the roasted arabica coffee from Chiapas and fifty of the roasted robusta coffee from Veracruz were used for the training of the Multilayer Perceptron. The arabica spectra were assigned to class one, and the robusta spectra to class zero. Of the one hundred spectra entered, 80% were used for training, 10% for validation, and 10% for testing. At this stage, the different combinations of standard deviations and number of neurons in the hidden layer were evaluated in order to find the best conditions for classifying the spectra.
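The 80/10/10 partition can be illustrated with the Python sketch below. The paper does not state how spectra were assigned to each subset, so a random shuffle with a fixed seed is assumed here; the function name and seed are hypothetical.

```python
import numpy as np

def split_indices(n, train=0.8, val=0.1, seed=0):
    """Randomly split range(n) into training, validation and test indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train = int(round(train * n))
    n_val = int(round(val * n))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])

# Class labels as described in the text: arabica = 1, robusta = 0.
labels = np.concatenate([np.ones(50, dtype=int), np.zeros(50, dtype=int)])
train_idx, val_idx, test_idx = split_indices(len(labels))
print(len(train_idx), len(val_idx), len(test_idx))
```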
3.3.3. Testing the Multilayer Perceptron
Once the Multilayer Perceptron was trained, 100 spectra of roasted arabica coffee from Colombia were used to test the ANN and verify whether it was capable of assigning the set of unknown spectra to the correct group (in this case, class one, corresponding to arabica coffee).
4.1. Parameter Setting
The highest relative intensities in the spectra were found with the optical fiber placed 5 mm from the sample and with a delay time of 2 μs between the impact of the laser on the sample and the opening of the spectrometer. Figure 2 shows the effect of the distance between the optical fiber input and the sample. Figure 3 shows the changes in the relative intensity of the Ca II line at 392.4 nm as a function of delay time, as an example of the general behavior of the spectra taken with different delay times.
4.2. Calibration Curves
Signals identified in both green coffee varieties were: Mg II (at 279.9 and 280.3 nm), Fe I (at 317.1, 372.8, 421.9, 429.4, 517.8 and 558.5 nm), Cr I (at 335.5 and 462.6 nm), Ca II (at 392.4 and 395.9 nm), Mn I (at 399.1 and 399.5 nm), Sr I (at 407.1 nm), Fe II (at 442.7, 444.6 and 567.3 nm), Ba I (at 553.2 nm), Na I (at 588.7 nm), H I (at 656.9 nm) and K I (at 765.7 and 768.9 nm). The Ca I lines at 643.3 and 645.6 nm were observed only in robusta coffee, and the Mn I line at 357.9 nm was observed only in arabica. These last three lines can be useful for differentiating the two varieties of green coffee. However, they are not observed in the spectra of the mixtures, so they do not help to detect the adulteration of arabica coffee with robusta coffee or its degree.
Figure 2. Influence of the distance between the optical fiber and the sample on the relative intensity of spectra.
Figure 3. Influence of the delay time on the relative intensity of spectra (using as example the line of Ca II at 392.4 nm).
The signal intensities of the lines found in the spectra of each of the mixtures were measured. For elements with more than one line, the most prominent line was used. Each intensity was divided by each of the others in order to obtain intensity ratios. With both the intensities and the intensity ratios, calibration curves were constructed and a linear fit was applied to each one. Seven of these calibration curves presented determination coefficients (R²) above 0.9000 after the linear fit, which is why they are considered useful for the purpose of this research. These calibration curves, which can be observed in Figures 4-10, correspond to the intensity changes in the Ca II line at 392.4 nm, the Sr I line at 407.1 nm, the N I line at 500.5 nm and the Na I line at 588.7 nm, as well as in the Ca II (392.4 nm)/N I (500.5 nm), Sr I (407.1 nm)/N I (500.5 nm) and N I (500.5 nm)/Na I (588.7 nm) intensity ratios, with the increase in the percentage of robusta green coffee in arabica green coffee. In all cases, the error bars represent 5%.
It is observed that the signals of all lines, as well as all intensity ratios except N I/Na I, decrease as the robusta green coffee content in the mixture increases. It is also observed that all the useful calibration curves come from the information obtained from four signals: Ca II (392.4 nm), Sr I (407.1 nm), N I (500.5 nm) and Na I (588.7 nm). This means that arabica green coffee adulterated with robusta green coffee can be identified, and the degree of adulteration determined, by studying these four signals.
4.3. Optimal Conditions for the Multilayer Perceptron
One hundred spectra were used to train the Multilayer Perceptron, fifty of which belonged to the roasted arabica coffee beans from Chiapas, and the remaining fifty to the roasted robusta coffee beans from Veracruz. The arabica spectra were assigned to class one, and the robusta spectra to class zero. Of the spectra, 80% were used for training, 10% for validation, and 10% for testing.
Figure 4. Calibration curve of Ca II (392.4 nm) for different contents of robusta coffee.
Figure 5. Calibration curve of Sr I (407.1 nm) for different contents of robusta coffee.
Figure 6. Calibration curve of N I (500.5 nm) for different contents of robusta coffee.
Figure 7. Calibration curve of Na I (588.7 nm) for different contents of robusta coffee.
Figure 8. Calibration curve of the Ca II (392.4 nm)/N I (500.5 nm) intensity ratio for different contents of robusta coffee.
Figure 9. Calibration curve of the Sr I (407.1 nm)/N I (500.5 nm) intensity ratio for different contents of robusta coffee.
Figure 10. Calibration curve of the N I (500.5 nm)/Na I (588.7 nm) intensity ratio for different contents of robusta coffee.
In order to determine the best parameters for working with the Multilayer Perceptron, different tests were performed by varying the number of neurons in the hidden layer and the number of standard deviations of error from the mean of the difference spectrum.
It was found that the best performance of the Multilayer Perceptron is achieved by using a single neuron in the hidden layer and by considering the intensities of the wavelengths that lie beyond 0.5 standard deviations from the mean of the difference spectrum shown in Figure 11. This means that, in order to improve the classification, it is convenient to consider only 424 of the 3648 wavelengths that make up each input spectrum. This decision was made after analyzing the Confusion Matrix and the Performance Chart displayed after each test, taking into account the number of spectra of each group that were correctly classified.
The lines that distinguish the arabica roasted coffee from the robusta roasted coffee are, then: Mg II at 279.9 and 280.3 nm, Fe I at 317.1 nm, Ca II at 392.4 and 395.9 nm, Fe I at 421.9 nm, N I at 500.5 nm, Fe I at 517.8 nm, Fe I at 558.5 nm, Fe II at 567.3 nm, Na I at 588.7 nm and H I at 656.9 nm.
As shown in Figure 12, under the mentioned conditions, 100% accuracy in the separation of the fed data is achieved after 82 iterations. All the spectra were assigned to their target groups, ergo, the spectra were learned by the Multilayer Perceptron during the training stage.
Figure 13 shows the Performance Graph of the Multilayer Perceptron. This graph shows that the best performance of the Multilayer Perceptron for the mentioned conditions occurs after the iteration number 82. It is observed that at this point the training, validation and test curves converge, which means that all parts of the system reached the best performance in the same iteration in which the whole system reached it.
Figure 11. Most important differences between arabica and robusta roasted coffee spectra.
Figure 12. Confusion Matrix obtained in the training of the ANN.
Figure 13. Performance Chart obtained in the training phase.
Furthermore, under the mentioned conditions, the lowest value assigned to spectra from class 1 is 1, and the highest value assigned to spectra from class 0 is 0. This means that the Multilayer Perceptron has placed each of the fifty spectra belonging to each group in its class, without mistakes. Thus, it is confirmed that the designed Multilayer Perceptron is able to correctly separate and classify two groups of different spectra.
4.4. Resolution of the Unknown Spectra Group
After verifying that the designed Multilayer Perceptron can be satisfactorily trained, it was necessary to test it with an unknown group of samples different from those used for its training. One hundred spectra of arabica roasted coffee beans from Colombia were used for this purpose. Since these spectra were of coffee of the arabica variety, it was expected that they would be classified in class 1.
The result was that all the unknown spectra tested were assigned the value 0.7163. Because this value is higher than 0.5000, and therefore closer to 1 than to 0, the spectra are considered to be correctly identified as roasted coffee of the arabica variety. However, the error obtained is significant, as can be observed in Figure 14.
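The decision rule applied to the network output can be sketched in Python as follows. The value 0.7163 and the count of one hundred spectra come from the text; the function name is hypothetical, and the handling of an output exactly equal to 0.5 (assigned here to class 0) is an assumption, since the paper does not address that case.

```python
import numpy as np

def classify(outputs, threshold=0.5):
    """Assign class 1 (arabica) to outputs above the threshold, else class 0 (robusta)."""
    outputs = np.asarray(outputs, dtype=float)
    return (outputs > threshold).astype(int)

# All one hundred unknown spectra received the same network output.
outputs = np.full(100, 0.7163)
predicted = classify(outputs)
print(int(predicted.sum()), "of", predicted.size, "spectra assigned to class 1")
```

Note that this rule only reports the winning class; the distance between 0.7163 and the target value of 1 is what the text refers to as a significant error.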
It is important to mention that this error may be due in large part to the different provenances of arabica coffee used for training (from Chiapas, Mexico) and arabica coffee used as unknown (from Colombia), since the composition of the soil where coffee is grown influences its future chemical composition. Taking this into consideration, it is remarkable that the Multilayer Perceptron proposed in this work has managed to identify all the spectra as arabica coffee.
Figure 14. Confusion Matrix obtained in the testing of Multilayer Perceptron with unknown spectra.
Mixtures of arabica green coffee from the state of Hidalgo with different percentages of robusta green coffee from the state of Veracruz were prepared to simulate adulteration. With the prepared mixtures, pills were manufactured and then analyzed with Laser-Induced Breakdown Spectroscopy (LIBS). With the relative intensities and intensity ratios obtained from the spectra of each pill, it was possible to construct seven calibration curves useful for the determination of the adulteration degree of arabica green coffee with robusta green coffee. These calibration curves are obtained by measuring the relative intensity of the Ca II (392.4 nm), Sr I (407.1 nm), N I (500.5 nm) and Na I (588.7 nm) lines in the normalized spectra, and with the intensity ratios of Ca II (392.4 nm)/N I (500.5 nm), Sr I (407.1 nm)/N I (500.5 nm) and N I (500.5 nm)/Na I (588.7 nm).
Moreover, spectra were taken from three different samples of roasted coffee beans using the Laser-Induced Breakdown Spectroscopy (LIBS) technique: fifty spectra of arabica coffee from the state of Chiapas, Mexico; fifty of robusta coffee from the state of Veracruz, Mexico; and one hundred of arabica coffee from Colombia.
A Multilayer Perceptron was configured, which was fed with fifty spectra of arabica roasted coffee from Chiapas and fifty spectra of robusta roasted coffee from Veracruz for training. The arabica coffee spectra were assigned to class 1, and the robusta ones to class 0. Once the Multilayer Perceptron was trained, it was tested with an unknown group of one hundred spectra of arabica roasted coffee from Colombia. The Multilayer Perceptron assigned to all spectra the value of 0.7163 and, since this value is closer to 1 than 0, spectra are considered to be correctly identified as arabica. However, the confusion matrix showed, as its name implies, a confusion of 50%; this can be attributed to the different origins of the samples. Considering that two samples of the same coffee cultivated in different soils may have important compositional differences, it is a notable achievement that the designed Multilayer Perceptron correctly identified the unknown sample as arabica coffee.
It has been demonstrated, then, that LIBS can be applied to perform green and roasted coffee authentication analyses in the future, although it is still necessary to increase the number of reported results.
It is essential to test the Multilayer Perceptron with a greater number of samples of roasted coffee of both varieties and different origins, and to repeat the procedure with samples of green coffee. Likewise, it is convenient to construct more calibration curves using samples of arabica and robusta coffee with different origins, and even using roasted coffee, as well as to increase the number of mixtures analyzed in order to evaluate the behavior of the curves towards their extremes.
 Santos, D., Nunes, L.C., de Carvalho, G.G.A., Gomes, M.S., de Souza, P.F., Leme, F.O., dos Santos, L.G.C. and Krug, F.J. (2012) Laser Induced Breakdown Spectroscopy for Analysis of Plant Materials: A Review. Spectrochimica Acta Part B: Atomic Spectroscopy, 71-72, 3-13.
 Moreira Osorio, L., Ponce Cabrera, L.V., Arronte García, M.A., Flores Reyes, T. and Ravelo, I. (2011) Portable LIBS System for Determining the Composition of Multi-layer Structures on Objects of Cultural Value. XVII Reunión Iberoamericana de óptica & X Encuentro de óptica, Láseres y Aplicaciones, Lima, 274, 1.
 Alvira, F.C., Bilmes, G.M., Flores, T. and Ponce, L. (2015) Laser-Induced Breakdown Spectroscopy (LIBS) Quality Control and Origin Identification of Handmade Manufactured Cigars. Applied Optics, 69, 1205-1209.
 Ponce, L., Flores, T., Alvira, F., Bilmes, G.M. and Sosa, M. (2016) Laser-Induced Breakdown Spectroscopy Determination of Toxic Metals in Fresh Fish. Applied Optics, 55, 254-258.
 Flores, T., Ponce, L., Arronte, M. and de Posada, E. (2009) Free-Running and Q: Switched LIBS Measurements during the Laser Ablation of Prickle Pears Spines. Optics and Lasers in Engineering, 47, 578-583.
 Varão, T., Filippe, J.M., Milori, D.M.B.P., Ferreira, E.J., Gomes-Neto, J.A. and Ferreira, E.C. (2014) Correlações Entre O Conteúdo De N e a Qualidade Do Café Através Do Monitoramento De Linhas De N Por Espectrometria De Emissão óptica Com Plasma Induzido Laser (LIBS). [Correlations between Coffee Quality and Content through N-Line Monitoring by Laser Induced Breakdown Spectroscopy (LIBS).] Simpósio Nacional de Instrumentaçao Agropecuária, São Carlos, 233-236.
 Gondal, M.A., Baig, U., Dastageer, M.A. and Sarwar, M. (2016) Determination of Elemental Composition of Coffee Using UV-Pulsed Laser Induced Breakdown Spectroscopy. Proceedings of the Fifth Saudi International Meeting on Frontiers of Physics, Gizan, Saudi Arabia, 1742, 030007-1-030007-5.
 Nufiqurakhmah, N., Nasution, A. and Suyanto, H. (2016) Laser-Induced Breakdown Spectroscopy (LIBS) for Spectral Characterization of Regular Coffee Beans and Luwak Coffee Beans. 2nd International Seminar on Photonics, Optics, and Its Applications, Bali, Indonesia, 10150, 101500M-1-101500M-7.
 Wirani, A. P., Nasution, A. and Suyanto, H. (2016) Spectral Identifiers From Roasting Process of Arabica and Robusta Green Beans Using Laser-Induced Breakdown Spectroscopy (LIBS). 2nd International Seminar on Photonics, Optics, and Its Applications, Bali, Indonesia, 10150, 101501A-1-101501A-6.
 Anggraeni, K., Nasution, A. and Suyanto, H. (2016) Recognition of Spectral Identifier from Green Coffee Beans of Arabica and Robusta Varieties Using Laser-Induced Breakdown Spectroscopy. 2nd International Seminar on Photonics, Optics, and Its Applications, Bali, Indonesia, 10150, 1015019-1-1015019-6.
 Basogain Olabe, X. (1998) Redes Neuronales Artificiales y Sus Aplicaciones. [Artificial Neural Networks and Their Applications.] Publicaciones de la Escuela Superior de Ingeniería de Bilbao, España.
 Matich, D. J. (2001) Redes Neuronales: Conceptos Básicos y Aplicaciones. [Neural Networks: Basic Concepts and Applications.] Universidad Tecnológica Nacional—Facultad Regional Rosario, Departamento de Ingeniería Química, Argentina.