Potassium is an essential nutrient for the growth and development of fruit trees. Hyperspectral remote sensing has been widely used in recent years. Its advantages, including high spectral resolution, rich spectral information and simple analysis, make up for the shortcomings of traditional chemical analysis methods. At present, the use of hyperspectral techniques to estimate plant potassium has concentrated on wheat, sugarcane, tobacco, citrus and rice, among others. Vegetation indices reflect plant information well, enhance the spectral response to vegetation, and facilitate the study of the relationship between plant biochemical components and hyperspectral data. They are widely used for remote sensing estimation of biomass. The estimation of plant biochemical parameters by vegetation index has been applied to chlorophyll content and leaf area index. Xing et al. used spectral techniques to establish a partial least squares (PLS) regression model, based on minimizing the sum of squared errors, to estimate the total potassium content of Red Fuji apple leaves. Chai et al. found, using hyperspectral techniques, that a first-order differential linear model based on the spectrum at 630 nm (X630) was the optimal estimation model of total potassium content in Korla pear leaves. Yi et al. established a PLS regression model from the second-order differential values of the leaf reflectance curve, which can be used to predict the potassium content of Pengan 100 Jin orange leaves. He et al. found that a power model built on the ground chlorophyll index can accurately estimate chlorophyll a content, and that an exponential model built on the modified normalized difference index can accurately estimate chlorophyll b content. At present, there are few studies on the inversion of potassium content in apple leaves using vegetation indices derived from hyperspectral data.
In this study, 96 samples were collected from apple trees in Qixia County, and their potassium contents and hyperspectral reflectance were measured. The vegetation indices RVI, DVI and NDVI were constructed from the original spectral reflectance. A predictive regression model based on random forest (RF) was then established to provide technical support for the spectral diagnosis of potassium nutrition in apple leaves.
2. Materials and Methods
2.1. Test Sample Collection
The study area is located in Qixia County, Yantai City, Shandong Province (37˚0'05"N - 37˚032"N, 12˚00'33"E - 12˚10'15"E). The city lies in the hinterland of the Jiaodong Peninsula and has a total area of 2017 km². The terrain is mountainous and hilly, and the climate is a warm temperate, semi-humid monsoon climate. The average annual temperature is 11.3˚C, the average annual rainfall is 754 mm, and the annual frost-free period is 207 d. The total length of cultivation in this area is 2690 h, the total area of Qixia apple orchards is 4.4 × 10⁵ hm², and the annual output is 1.45 × 10⁷ t. Leaf samples were collected during the new shoot period of the apple trees (20 May 2014). Twenty-four orchards were randomly selected, and four apple trees were randomly selected from each orchard, giving a total of 96 trees. From each tree, 8 healthy leaves of similar size were collected from the middle of the growing branches. Finally, the samples were placed in preservation bags, numbered, and brought back to the laboratory in a preservation box.
2.2. Hyperspectral Data Acquisition and Determination of Potassium Content
Spectral measurements were performed using an ASD FieldSpec 4 spectrometer with a wavelength range of 350 nm - 2500 nm. In the range of 350 nm - 1000 nm the spectral resolution is 3 nm and the sampling interval is 1.4 nm; in the range of 1000 nm - 2500 nm the spectral resolution is 8 nm and the sampling interval is 2 nm. The whole measurement was carried out in a dark room in which the illumination conditions could be controlled. During measurement, the leaf samples were placed on a black rubber mat (reflectivity approximately 0). The angle of the spectrometer is 25˚. The probe was aligned vertically with the middle of the leaf blade, 0.10 m from the sample surface. The light source was the spectrometer's own 50 W halogen lamp, 0.50 m from the sample center. Ten spectra were recorded for each sample and averaged to give the spectral reflectance of the sample. Whiteboard correction was performed in a timely manner during measurement.
After the spectral data collection was completed, the leaves were placed in an oven at 80˚C, sterilized for 20 min, then cooled to 60˚C and dried to constant weight. The potassium content was determined by flame photometric method.
2.3. Data Processing and Modeling
ViewSpec software was used to convert the leaf DN values into spectral reflectance, and Excel and SPSS were used for further processing and analysis. The spectral data were preprocessed to reduce the influence of differences in light intensity, unevenness of the sample and instrument noise on the spectral characteristics of the target. In addition, in order to extract the spectral information fully, better reflect differences in plant growth status, and study the relationship between plant biochemical composition and hyperspectral data, the ratio vegetation index (RVI), difference vegetation index (DVI) and normalized vegetation index (NDVI) were constructed, which can be calculated as
$$\mathrm{RVI} = \frac{R_{\lambda_1}}{R_{\lambda_2}} \tag{1}$$
$$\mathrm{DVI} = R_{\lambda_1} - R_{\lambda_2} \tag{2}$$
$$\mathrm{NDVI} = \frac{R_{\lambda_1} - R_{\lambda_2}}{R_{\lambda_1} + R_{\lambda_2}} \tag{3}$$
where $R_{\lambda_1}$ and $R_{\lambda_2}$ represent spectral reflectance at wavelengths $\lambda_1$ and $\lambda_2$, respectively.
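As a minimal sketch, the three two-band indices can be written as plain functions, assuming the standard definitions RVI = Rλ1/Rλ2, DVI = Rλ1 − Rλ2 and NDVI = (Rλ1 − Rλ2)/(Rλ1 + Rλ2). The function names and the reflectance values below are illustrative assumptions, not measured data from this study.

```python
def rvi(r1: float, r2: float) -> float:
    """Ratio vegetation index: reflectance at wavelength 1 divided by reflectance at wavelength 2."""
    return r1 / r2


def dvi(r1: float, r2: float) -> float:
    """Difference vegetation index: reflectance at wavelength 1 minus reflectance at wavelength 2."""
    return r1 - r2


def ndvi(r1: float, r2: float) -> float:
    """Normalized vegetation index: difference of the two reflectances divided by their sum."""
    return (r1 - r2) / (r1 + r2)


# Hypothetical reflectance values at two wavelengths (illustrative only).
r_a, r_b = 0.48, 0.40
print(rvi(r_a, r_b), dvi(r_a, r_b), ndvi(r_a, r_b))
```

The ratio and normalized forms assume strictly positive reflectance, which holds for leaf spectra after whiteboard correction.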
Correlation analysis was carried out between potassium content and the three vegetation indices constructed from the spectral reflectance of the 96 apple leaf samples, and stepwise regression analysis was then used to select the sensitive wavelengths or wavelength combinations. Using the selected spectral parameters as independent variables, 72 samples were randomly selected to establish an RF regression model, and the remaining 24 were used to test the model. The fit between the estimated and measured values was evaluated using the coefficient of determination (R²), root mean square error (RMSE) and relative error (RE%) to test the stability and applicability of the model. The larger the R² and the smaller the RMSE and RE%, the better the prediction of the model. They can be calculated as
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{4}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \tag{5}$$
$$\mathrm{RE\%} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| \hat{y}_i - y_i \right|}{y_i} \times 100\% \tag{6}$$
where $y_i$ is the measured potassium content value, $\hat{y}_i$ is the estimated value, $\bar{y}$ is the average value of $y_i$, and the number of samples is $n$.
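The three evaluation statistics can be sketched as follows. The exact formulas are an assumption here: R² is taken as one minus the ratio of the residual to the total sum of squares, RMSE as the root of the mean squared difference, and the relative error as the mean of |estimated − measured| / measured, returned as a fraction. The sample values are made up for illustration.

```python
import math


def r_squared(measured, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - e) ** 2 for y, e in zip(measured, estimated))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, estimated):
    """Root mean square error between estimated and measured values."""
    n = len(measured)
    return math.sqrt(sum((e - y) ** 2 for y, e in zip(measured, estimated)) / n)


def relative_error(measured, estimated):
    """Mean relative error as a fraction (multiply by 100 for percent)."""
    n = len(measured)
    return sum(abs(e - y) / y for y, e in zip(measured, estimated)) / n


# Made-up measured and estimated potassium contents (illustrative only).
measured = [1.0, 2.0, 3.0]
estimated = [1.1, 1.9, 3.2]
print(r_squared(measured, estimated), rmse(measured, estimated),
      relative_error(measured, estimated))
```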
3.1. Spectral Characteristics of Apple Leaves with Different Potassium Contents
The level of potassium content has an important effect on the spectral reflectance of leaves, showing obvious regularity in the observed wavelength range. The 96 samples were sorted by potassium content in ascending order and divided into 4 groups in that order. The average spectral data and potassium content were calculated for each group (Table 1). The resulting hyperspectral curves for the four potassium levels, K1, K2, K3 and K4, are shown in Figure 1.
As can be seen from Figure 1, the overall spectral reflectance increased with the level of potassium nutrition. Reflectance increased significantly in the visible range (350 nm - 680 nm) and in the near infrared ranges of 800 nm - 1300 nm, 1400 nm - 1850 nm and 1900 nm - 2500 nm; the increase in the remaining bands was smaller.
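The grouping step can be sketched as below: samples are sorted by potassium content, split into consecutive groups, and the potassium content and spectrum are averaged within each group. Equal group sizes (24 samples per group for 96 samples) are an assumption, since the text only states that the samples were divided in order; the function name and the toy data are hypothetical.

```python
def group_mean_spectra(potassium, spectra, n_groups=4):
    """Sort samples by potassium content, split them into consecutive groups,
    and return (mean potassium, mean spectrum) for each group."""
    order = sorted(range(len(potassium)), key=lambda i: potassium[i])
    size = len(order) // n_groups
    n_bands = len(spectra[0])
    groups = []
    for g in range(n_groups):
        # The last group absorbs any remainder when the count is not divisible.
        idx = order[g * size:] if g == n_groups - 1 else order[g * size:(g + 1) * size]
        mean_k = sum(potassium[i] for i in idx) / len(idx)
        mean_spec = [sum(spectra[i][b] for i in idx) / len(idx) for b in range(n_bands)]
        groups.append((mean_k, mean_spec))
    return groups


# Toy data: 4 samples, 2 bands, split into 2 groups (illustrative only).
k = [0.9, 0.5, 1.3, 0.7]
s = [[0.3, 0.5], [0.1, 0.3], [0.4, 0.6], [0.2, 0.4]]
for mean_k, mean_spec in group_mean_spectra(k, s, n_groups=2):
    print(mean_k, mean_spec)
```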
3.2. Correlation Analysis between Vegetation Index and Potassium Content
Any two wavelengths in the range 350 nm - 2500 nm were combined and substituted into Equations (1)-(3) to obtain the different vegetation indices, and Matlab was then used to analyze their correlation with potassium content. The resulting contour maps are shown in Figures 2-4.
Table 1. Potassium content and spectral reflectance of apple leaves in the different groups.
Figure 1. Spectral reflectance of apple leaves with different potassium contents.
The horizontal and vertical coordinates of each contour map are wavelengths, and the color indicates the level of correlation. Since the two wavelengths of a vegetation index are chosen arbitrarily, each pair of wavelengths is combined twice (once in each order), so the contour map is distributed symmetrically about the diagonal. Figures 2-4 show clearly which wavelengths or bands of the different vegetation indices are better correlated with potassium content.
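The exhaustive band-pair search behind such a contour map can be sketched as follows. The study performed this step in Matlab; the Python version below is only an illustration under the assumption that the Pearson correlation coefficient is used, and the function names and toy spectra are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_map(spectra, potassium, index_fn):
    """Correlate index_fn(R_i, R_j) with potassium content for every ordered band pair."""
    n_bands = len(spectra[0])
    cmap = [[0.0] * n_bands for _ in range(n_bands)]
    for i in range(n_bands):
        for j in range(n_bands):
            if i != j:  # the diagonal is left at 0.0
                values = [index_fn(s[i], s[j]) for s in spectra]
                cmap[i][j] = pearson(values, potassium)
    return cmap


def best_pair(cmap):
    """Return (i, j, r) for the band pair with the largest absolute correlation."""
    n = len(cmap)
    return max(((i, j, cmap[i][j]) for i in range(n) for j in range(n)),
               key=lambda item: abs(item[2]))


# Toy data: 4 samples x 3 bands (illustrative only), using the difference index.
spectra = [[0.2, 0.1, 0.5], [0.4, 0.2, 0.5], [0.6, 0.3, 0.4], [0.8, 0.4, 0.6]]
potassium = [1.0, 2.0, 3.0, 4.0]
cmap = correlation_map(spectra, potassium, lambda r1, r2: r1 - r2)
print(best_pair(cmap))
```

For DVI and NDVI, swapping the two wavelengths only flips the sign of the index, so the absolute correlation is mirrored across the diagonal. With 1 nm bands over 350 nm - 2500 nm the full map has several million entries, so a vectorized implementation would be preferable in practice.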
3.2.1. Correlation Analysis between RVI and Potassium Content
The ratio vegetation index RVI is simple and can effectively reduce environmental noise. It has been widely used in the analysis of vegetation spectra.
Figure 2 shows that combinations of visible wavelengths with near infrared wavelengths in the 700 nm - 1300 nm band are highly correlated with potassium content, with absolute correlation coefficients of 0.37 or more, shown in dark red and dark blue. Light green and light blue occupy most of the map, indicating that over most of the 350 nm - 2500 nm range the RVI correlation coefficient is small, mostly between −0.1 and 0.1. The correlation coefficient was highest for RVI (749, 1215), reaching 0.4321. Stepwise regression analysis was used to select the ratio vegetation indices with larger correlation coefficients, as shown in Table 2.
3.2.2. Correlation Analysis between DVI and Potassium Content
The contour map of the difference vegetation index DVI differs greatly from that of the ratio vegetation index RVI (Figure 3). The areas with high correlation between DVI and potassium content are scattered rather than clustered in one wavelength range. The correlation coefficients between the 700 nm - 800 nm band and the bands of 350 nm - 750 nm, 1100 nm - 1250 nm, 1900 nm - 2000 nm and 2450 nm - 2500 nm are all large, reaching 0.45 or more. DVI (364, 740) has the highest correlation, reaching 0.5355. The difference vegetation indices selected for modeling are shown in Table 3.
Figure 2. A contour map of the correlation coefficient (R) between potassium content and RVI based on two wavelength combinations.
Figure 3. A contour map of the correlation coefficient (R) between potassium content and DVI based on two wavelength combinations.
Figure 4. A contour map of the correlation coefficient (R) between potassium content and NDVI based on two wavelength combinations.
Table 2. Correlation coefficients between RVI and potassium content of apple leaves.
3.2.3. Correlation Analysis between NDVI and Potassium Content
The normalized vegetation index NDVI is an important indicator of vegetation growth and activity, and is also an important indicator of crop yield.
The sensitive band combinations of NDVI are similar to those of RVI (Figure 4). The correlation between NDVI (749, 1215) and potassium content is the largest, at 0.4317; this is the same wavelength combination as for RVI. Similarly, the NDVI selected for modeling are shown in Table 4.
3.3. Establishment of Estimation Model of Potassium Content for Apple Leaves
Random forest (RF) is a non-linear modeling tool proposed by Breiman and Cutler in 2001, and a newer alternative to traditional machine learning methods such as neural networks. It performs well on large amounts of data. RF is not affected by the multicollinearity problems faced by general regression analysis and does not require variable selection, so it can be used to invert the potassium content of apple leaves from hyperspectral data. A large number of theoretical and practical studies have shown that RF regression has high prediction accuracy. Its behavior depends mainly on two parameters, Ntree and Mtry, where Ntree is the number of decision trees and Mtry is the number of predictors sampled at each split node. The random forest regression analysis was performed in the data processing software DPS. The RVI, DVI and NDVI in Tables 2-4 were taken as independent variables, with Ntree = 300 and Mtry = 3. The proportion of forest training samples was 50%, and an estimation model of potassium content in apple leaves was established for each vegetation index.
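To make the roles of Ntree and Mtry concrete, the following is a toy random forest regressor written with the Python standard library only. It is not the DPS implementation used in this study. Reading the 50% "proportion of forest training samples" as the fraction of training samples drawn for each tree is an assumption, as are the function names, the `min_leaf` stopping rule and the synthetic data.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, mtry, min_leaf, rng):
    """Grow one regression tree; each split examines `mtry` randomly drawn predictors."""
    best = None  # (score, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), mtry):
        for t in sorted({r[f] for r in rows})[:-1]:  # candidate thresholds
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return sum(targets) / len(targets)  # leaf: mean of the targets in this node
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in li], [targets[i] for i in li], mtry, min_leaf, rng),
            build_tree([rows[i] for i in ri], [targets[i] for i in ri], mtry, min_leaf, rng))


def predict_tree(node, row):
    """Follow the splits until a leaf (a plain float) is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, targets, ntree=300, mtry=3, sample_frac=0.5, min_leaf=3, seed=0):
    """Fit `ntree` trees, each on a random draw (with replacement) of the training rows."""
    rng = random.Random(seed)
    n_sub = max(2 * min_leaf, int(sample_frac * len(rows)))
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(len(rows)) for _ in range(n_sub)]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                 mtry, min_leaf, rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)


# Synthetic data: 96 samples, 3 predictors, 72 for training and 24 for testing.
rng = random.Random(1)
x = [[rng.random() for _ in range(3)] for _ in range(96)]
y = [1.0 + 2.0 * row[0] + rng.gauss(0, 0.05) for row in x]
forest = fit_forest(x[:72], y[:72], ntree=50, mtry=3)  # fewer trees to keep the demo fast
estimates = [predict_forest(forest, row) for row in x[72:]]
print(len(forest), len(estimates))
```

Because the rows are generated at random, taking the first 72 as the training set is equivalent to a random split. With three predictors and Mtry = 3, every predictor is examined at each split, so the randomness comes only from the resampling of rows.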
Table 3. Correlation coefficients between DVI and potassium content of apple leaves.
Table 4. Correlation coefficients between NDVI and potassium content of apple leaves.
The measured values of the 72 training samples were fitted against the random forest estimates, and scatter plots were drawn for the three vegetation indices, as shown in Figure 5.
Among the three vegetation indices, DVI had the highest coefficient of determination, 0.8995; the R² of RVI and NDVI were 0.8875 and 0.8817, respectively. DVI also had the smallest RMSE and RE%, 0.0791 and 0.0617, respectively; the RMSE of RVI and NDVI were 0.1126 and 0.1154, and their RE% were 0.0858 and 0.0875. The random forest model of potassium content in apple leaves established with the difference vegetation index is therefore the best of the three.
3.4. Validation of Estimation Model of Potassium Content for Apple Leaves
In order to test the adaptability and accuracy of the random forest regression model, scatter plots of the measured and estimated values of the 24 independent test samples were drawn, as shown in Figure 6.
The R² values of the three vegetation indices are all above 0.77 and their RMSE values are all below 0.13. However, the model established with DVI performs better than those of the other two vegetation indices: its coefficient of determination is the highest, at 0.8205, its root mean square error is 0.1071, and its relative error is 0.0756. The validation results are satisfactory. The random forest regression model adapts well to complex data, and the model is stable.
Figure 5. Comparison of the measured and estimated values of potassium content of apple leaves (n = 72): (a) RVI; (b) DVI; (c) NDVI.
The correlation coefficients of the three vegetation indices used for modeling were all higher than 0.259, reaching a highly significant level (P < 0.01). The correlation between DVI and potassium content was the highest, with a correlation coefficient of 0.5355 for DVI (364, 740). The difference vegetation index is therefore better suited to building a model for estimating leaf potassium content. Whether other vegetation indices are more conducive to hyperspectral estimation of leaf potassium content remains to be studied.
The contour map makes it possible to select the bands that have a high correlation coefficient with potassium content in fruit tree leaves. It has obvious advantages in handling large amounts of information and can be widely used in data processing. The random forest model has performed well in estimating leaf nutrients. The RF model based on vegetation indices established in this study can provide a relatively accurate prediction of potassium content in apple leaves and a relatively fast way of obtaining it. It has guiding significance and reference value for rapid, real-time monitoring of the nutrition and growth of apple trees.
Figure 6. Comparison of the measured and estimated values of potassium content of apple leaves (n = 24): (a) RVI; (b) DVI; (c) NDVI.
The potassium content of apple leaves varies with the growth period. The leaf samples in this study were collected during the period of vigorous new shoot growth; whether the model is applicable to estimating potassium content in other phenological periods needs further research.
Potassium content affects leaf reflectance, with an obvious influence in the visible range (350 nm - 680 nm) and in the near infrared ranges of 800 nm - 1300 nm, 1400 nm - 1850 nm and 1900 nm - 2500 nm. In general, hyperspectral reflectance increases with leaf potassium content.
The random forest regression model established with the difference vegetation index DVI is the best, with the highest fitting precision: R² is 0.8995, and RMSE and RE% are 0.0791 and 0.0617, respectively. In validation, the coefficient of determination is 0.8205, the root mean square error is 0.1071 and the relative error is 0.0756, indicating that the model is stable.
This paper was supported by the National Natural Science Foundation of China (41671346, 41271369), Funds of Shandong “Double Tops” Program (SYL2017XTTD02) and the agriculture big data project of Shandong Agricultural University (75016).