Yellow water, also known as yellow pulp water, is a by-product of the brewing of Luzhou-flavor liquor. It is a brownish-yellow viscous liquid that is rich in beneficial microorganisms and in organic compounds such as alcohols, aldehydes, acids and esters. At present, yellow water is mainly studied and used in the liquor industry in four ways: 1) judging the quality of fermentation from the composition of the yellow water; 2) preparing esterified liquid for fermentation; 3) culturing new pit mud after mixing with mud and mother grains; 4) direct distillation to obtain base liquor.
The composition of yellow water not only reflects the state of fermentation; yellow water also contains abundant organic matter with high utilization value. Detecting and analyzing its components therefore supports more comprehensive monitoring of the fermentation process and fuller utilization of yellow water.
At present, yellow water is mainly analyzed by chemical titration, liquid chromatography, gas chromatography and mass spectrometry. The disadvantages of these methods are that additional reagents must be prepared, the detection time is long, and sample pretreatment is cumbersome.
A sensor detects information about a specific quantity, such as temperature, light or pressure, and converts it into an electrical signal according to a defined rule. When the detected object changes, the resistance of the sensor changes and the output voltage signal changes with it. Because different sensors respond differently, they can be combined into a sensor array, such as an electronic tongue. The response signals of the sensors are conditioned and output, and the relationship between the sample and the electrical response is analyzed through signal processing and pattern recognition, so that the sample can be detected quickly. The advantages are that the sample needs no pretreatment, sample consumption is small, operation is simple, and measurement is convenient and fast. A sensor array can comprehensively evaluate the taste information of a sample in order to identify it, and can also quantify some of its components.
The main research progress in the field of sensor arrays is as follows. Tian et al. used a multi-frequency pulse sensor array to analyze wines of different ages and predict wine age, using principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) for pattern recognition to distinguish samples of different ages. Rudnitskaya et al. combined an electronic tongue with high performance liquid chromatography (HPLC) to predict the age of wine and the acids and esters it contains; a PCA regression model showed that the electronic tongue can detect their concentrations reasonably well. Winquist et al. established an online detection system for milk based on an electronic tongue, using PCA to distinguish milk of different quality and origin. Du Hongfu et al. combined the electronic tongue with HPLC to analyze the fermentation process of vinegar, and used a BP neural network to build a prediction model for acetic acid and lactic acid, the main components of vinegar. At present, the electronic tongue is widely used in the food industry, for example for Chinese vinegar, red wines and organic acids, but there are few studies on the detection and analysis of yellow water parameters.
2. Materials and Methods
2.1. Instruments and Equipment
During liquor fermentation, dissolved oxygen gradually decreases. Starch in the fermentation pit is converted into glucose by molds, and glucose is in turn converted into alcohol and acids by yeast, lactic acid bacteria and other microorganisms; these products differ in their degree of ionization. Based on the change in dissolved oxygen during fermentation and the change in the conductivity of yellow water caused by these products, a sensor array composed of an oxygen electrode and a conductivity electrode was designed in this study. Figure 1 shows the PCB diagram of the device.
2.2. Materials and Reagents
The experimental samples are yellow water collected from a distillery in Yibin, Sichuan. The main experimental reagents are sodium hydroxide, copper sulfate, and phenolphthalein.
2.3. Experimental Methods and Conditions
2.3.1. Determination of Physical and Chemical Indicators
The total acid content was determined by potentiometric titration; the ethanol content was measured with an alcohol meter; the reducing sugar content was determined by titration with Fehling's reagent; for starch, the sample was first hydrolyzed to reducing sugar, which was then titrated with Fehling's reagent, and the starch content was calculated from the result (DB34/T 1728-2012, GB/T 10345-2007).
Figure 1. PCB schematic of the sensor array.
2.3.2. Sensor Array Measurement
The sensor array was preconditioned in a 3.5 mol/L KCl solution prior to detection. The pretreated sample was then tested, and the electrodes were washed with 0.01 mol/L KCl solution before each measurement. The applied potential started at +1 V and was stepped down in 0.2 V decrements to −1 V; the precision was set to 10⁻³; measurements were made in parallel at three frequencies (1 Hz, 10 Hz and 100 Hz) and repeated three times, and the data were recorded and saved.
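As a minimal sketch of the measurement settings described above, the potential levels and the frequency schedule can be enumerated as follows. The function name `pulse_potentials` and the constant `FREQUENCIES_HZ` are hypothetical, and the sketch assumes that both end points (+1 V and −1 V) are included; it does not reproduce the instrument's actual control software.

```python
def pulse_potentials(start_v=1.0, stop_v=-1.0, step_v=0.2):
    """Return the potential levels from start_v down to stop_v (inclusive)."""
    n_steps = round((start_v - stop_v) / step_v)
    # Build each level from its integer index to avoid accumulating float error.
    return [round(start_v - i * step_v, 3) for i in range(n_steps + 1)]


# Frequencies stated in the text; the name is an illustrative assumption.
FREQUENCIES_HZ = (1, 10, 100)

levels = pulse_potentials()
# One (frequency, potential) pair per applied step.
schedule = [(f, v) for f in FREQUENCIES_HZ for v in levels]

print(levels)
print(len(schedule))
```

Under these assumptions the sweep contains 11 potential levels, so one pass over the three frequencies applies 33 steps.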
2.4. Analysis Method
2.4.1. Principal Component Analysis
Principal component analysis (PCA) is a linear transformation of many variables into a small number of composite variables that represent the raw data. Its advantages are that it removes the effect of correlation between variables, extracts the useful information, and reduces the workload. Its disadvantage is some loss of accuracy: PCA is only worthwhile when the number of dimensions is reduced substantially while most of the original information is retained.
Principles and steps of PCA: assume there are m samples, each with n features, forming the matrix DATA (m × n).
1) Standardization: compute the mean and standard deviation of each feature. The standardized matrix DATA_adjust (m × n) is obtained by subtracting the feature mean from each value in DATA and dividing by the feature standard deviation, so that the origin lies at the centroid of the sample points.
2) Find the covariance matrix C of DATA_adjust.
3) Find the eigenvalues λ and eigenvectors of the covariance matrix C.
According to maximum variance theory, the eigenvector corresponding to the largest eigenvalue is the projection direction that carries the most information. The k largest eigenvalues are selected, and their eigenvectors are computed and normalized to form the eigenvector matrix Eigenvector (n × k), whose columns are the k eigenvectors.
4) Project DATA_adjust onto Eigenvector to obtain the dimension-reduced data (m × k).
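The four steps above can be sketched with NumPy as follows. This is an illustrative sketch, not the SPSS procedure used in this study; the function name `pca` and its return values are assumptions made for the example.

```python
import numpy as np


def pca(data, k):
    """Reduce data (m samples x n features) to k principal components."""
    data = np.asarray(data, dtype=float)
    # 1) Standardize: zero mean and unit standard deviation per feature.
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant features stay at zero instead of dividing by zero
    data_adjust = (data - mean) / std
    # 2) Covariance matrix C (n x n) of the standardized data.
    cov = np.cov(data_adjust, rowvar=False)
    # 3) Eigenvalues and eigenvectors; eigh is intended for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    eigenvector = eigvecs[:, order]         # n x k, unit-length columns
    # 4) Project the standardized data onto the k eigenvectors.
    scores = data_adjust @ eigenvector      # m x k
    explained = eigvals[order] / eigvals.sum()
    return scores, eigenvector, explained
```

The third return value gives the variance contribution rate of each retained component; its cumulative sum corresponds to the cumulative variance contribution rate reported by statistical packages.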
2.4.2. Discriminant Function Analysis
Discriminant function analysis (DFA) is a statistical method for determining which class a sample belongs to. The data obtained from the sensors are recombined to maximize the difference between groups while keeping the difference within each group small, so that the distance between group centers is maximized; a discriminant function is then established to distinguish the samples. DFA classifies well and is easy to implement. It is one of the commonly used pattern recognition methods for electronic noses and sensor arrays.
Commonly used discriminant methods include the distance discriminant method, the Fisher discriminant method, the Bayes discriminant method and stepwise discriminant analysis. The distance discriminant method makes no specific assumption about the distribution of each class and classifies according to the class centroids (the mean of each group). A given observation is assigned to the i-th class if it is closest to the centroid of the i-th class.
Discriminant analysis based on the distance from a sample to each population requires few discriminant functions and is simple to compute, and it places no special requirements on the overall data distribution, so this paper uses the distance discriminant method for sample analysis.
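The distance discriminant rule described above can be sketched as a nearest-centroid classifier. The sketch assumes Euclidean distance for simplicity (a Mahalanobis distance is also common), and the function names and toy data are hypothetical, not taken from this study.

```python
import numpy as np


def fit_centroids(features, labels):
    """Return a dict mapping each class label to its centroid (mean vector)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}


def classify(observation, centroids):
    """Assign the observation to the class whose centroid is nearest."""
    observation = np.asarray(observation, dtype=float)
    return min(centroids, key=lambda c: np.linalg.norm(observation - centroids[c]))


# Toy two-class data, for illustration only.
train_x = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.8, 2.9]]
train_y = [1, 1, 2, 2]
centroids = fit_centroids(train_x, train_y)

print(classify([0.3, 0.2], centroids))
print(classify([2.5, 3.0], centroids))
```

In practice the inputs would be the principal component scores of each measurement rather than raw sensor values, which keeps the distance computation low-dimensional.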
3. Results and Analysis
3.1. Test Results
Samples 1 to 12 were used as the training set, and samples 13 to 16 were used as the prediction set. The measured physical and chemical indicators are shown in Table 1.
3.2. Analysis Results
3.2.1. Principal Component Analysis
Data were collected from 16 samples, and the raw data set was obtained after outliers were removed. The data for each measurement consisted of 6 sensors × 3 frequencies × 40 data points per frequency. Each of the first 12 samples was measured 4 times, giving 48 sets of data that were used as the training set; the remaining 4 samples were each measured 4 times, giving 16 sets of data that were used as the test set.
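The sizes stated above can be checked with a few lines of arithmetic. The constant names are illustrative, and the sketch assumes that each measurement is flattened into a single row of features before PCA.

```python
SENSORS, FREQUENCIES, POINTS_PER_FREQUENCY = 6, 3, 40
TRAIN_SAMPLES, TEST_SAMPLES, REPEATS = 12, 4, 4

# Values recorded per measurement when all sensor responses are flattened.
features_per_measurement = SENSORS * FREQUENCIES * POINTS_PER_FREQUENCY
train_rows = TRAIN_SAMPLES * REPEATS
test_rows = TEST_SAMPLES * REPEATS

print(features_per_measurement)  # 720
print(train_rows, test_rows)     # 48 16
```

Under this assumption the training matrix has far more columns (720) than rows (48), which is one reason dimensionality reduction is applied before regression.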
Table 1. Content of substances in yellow water.
In SPSS, principal component analysis was used to reduce the data to 11 dimensions. The KMO (Kaiser-Meyer-Olkin) and Bartlett tests gave a KMO value greater than 0.6 and a sig. value less than 0.05, indicating that the data are suitable for principal component analysis. The cumulative variance contribution rate of the first 4 principal components is 86%, and the plot of the first three principal components is shown in Figure 2. It can be seen from the figure that the principal components obtained by dimensionality reduction represent the original sample data well.
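For readers without SPSS, the two suitability checks mentioned above can be sketched with NumPy. This is an illustrative sketch under the usual textbook definitions, not a reproduction of the SPSS output; the function name `kmo_and_bartlett` is an assumption, and the data must have more rows than columns so that the correlation matrix is invertible.

```python
import numpy as np


def kmo_and_bartlett(data):
    """Return (overall KMO, Bartlett chi-square statistic, degrees of freedom)."""
    data = np.asarray(data, dtype=float)
    m, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off = ~np.eye(p, dtype=bool)            # mask selecting off-diagonal entries
    r2 = np.sum(corr[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    kmo = r2 / (r2 + q2)
    # Bartlett's test of sphericity: H0 is that corr is the identity matrix.
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(m - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return kmo, chi2, dof
```

The significance value is the upper-tail probability of a chi-square distribution with `dof` degrees of freedom evaluated at `chi2`, which can be obtained from any statistics library.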
3.2.2. Discriminant Function Analysis
In SPSS, linear discriminant analysis (LDA) was performed on the data of the first 12 samples, and the discriminant function analysis plot was obtained with the stepwise discriminant method, as shown in Figure 3.
The discrimination index (DI) in the figure reached 99.9%. The samples cluster clearly on the discriminant classification map and the different sample groups are far apart, indicating that the discriminant function can effectively classify yellow pulp water samples from different fermentation conditions according to their composition. Moreover, the discrimination results for the 12 samples were consistent with the actual classes, that is, no misclassification occurred.
Figure 2. Principal component analysis.
Figure 3. Discriminant function analysis diagram.
3.2.3. Multiple Linear Regression Analysis
Multiple linear regression analysis was performed in SPSS. The principal component factors obtained above were used as independent variables; total sugar, total acid, starch and alcohol were used in turn as the dependent variable; and stepwise regression was selected. The adjusted R2 of the regression reaches 70%, meaning that the linear equation explains about 70% of the variation in the measured data. The DW (Durbin-Watson) statistic is 1.69, which is close to 2 and indicates no serial correlation in the data. The sig. values are less than 0.05, indicating that the independent variables have a significant influence on the dependent variable. The VIF (variance inflation factor) values are less than 10, indicating no serious collinearity between the variables. In addition, the residual diagnostics are basically consistent with a normal distribution.
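The diagnostics reported above (adjusted R2, the Durbin-Watson statistic and VIF) can be sketched with NumPy as follows. The sketch fits an ordinary least-squares model on all predictors at once; it does not implement the stepwise selection used in SPSS, and the function name `ols_diagnostics` is an assumption.

```python
import numpy as np


def ols_diagnostics(x, y):
    """Fit y = b0 + x @ b by least squares and return common diagnostics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, p = x.shape
    design = np.column_stack([np.ones(m), x])   # add intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (m - 1) / (m - p - 1)
    # Durbin-Watson: values near 2 suggest no first-order autocorrelation.
    dw = np.sum(np.diff(resid) ** 2) / ss_res
    # VIF_j = 1 / (1 - R_j^2), regressing predictor j on the other predictors.
    vif = []
    for j in range(p):
        others = np.column_stack([np.ones(m), np.delete(x, j, axis=1)])
        b, *_ = np.linalg.lstsq(others, x[:, j], rcond=None)
        r = x[:, j] - others @ b
        r2_j = 1.0 - np.sum(r ** 2) / np.sum((x[:, j] - x[:, j].mean()) ** 2)
        vif.append(1.0 / (1.0 - r2_j))
    return coef, adj_r2, dw, vif
```

When the predictors are principal component scores, they are uncorrelated on the training data by construction, so VIF values close to 1 are expected.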
The fits of the models to the samples are shown in Figure 4 (acidity), Figure 5 (starch), Figure 6 (sugar) and Figure 7 (alcohol).
As can be seen from the figures, each model fits the data adequately and represents the sample data well.
3.3. Sample Prediction
The model's predictions for samples that were not used in training are evaluated by calculating the prediction standard deviation.
Figure 4. Acidity fitting pattern.
Figure 5. Starch fitting pattern.
Figure 6. Sugar fitting pattern.
where ŷ_i is the fitted or predicted value, n is the number of samples, and y_i is the measured value of the sample.
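One common form of the prediction standard deviation that is consistent with these three quantities is the root-mean-square deviation between predicted and measured values. The sketch below assumes a divisor of n (some definitions use n − 1 instead); the function name and the numbers are illustrative and are not data from this study.

```python
import math


def prediction_std(predicted, measured):
    """Root-mean-square deviation between predicted and measured values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    squared = sum((p - y) ** 2 for p, y in zip(predicted, measured))
    return math.sqrt(squared / n)  # assumed divisor n; some definitions use n - 1


# Illustrative values only.
print(prediction_std([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

A smaller value indicates that the predictions lie closer to the measured values, in the same units as the measured quantity.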
The prediction results for the remaining 4 groups of samples are shown in Table 2. The prediction standard deviations are: alcohol content 0.43, acidity 0.39, starch 0.60, and reducing sugar 0.45.
4. Conclusions and Prospects
The PCA and DFA results show that the sensor array can distinguish different yellow water samples. The adjusted coefficient of determination shows that the regression models explain the data reasonably well. The final prediction results show that, apart from the slightly larger prediction error for starch, the sensor array's predictions of alcohol, acidity and reducing sugar basically meet the needs of actual production testing. The sensor array can therefore be applied to yellow water detection, especially online detection during fermentation, so that important fermentation parameters can be obtained in real time.
Figure 7. Alcohol fitting pattern.
Table 2. Forecast results.