Biosurfactants are amphiphilic compounds produced mainly by aerobic microorganisms, such as bacteria, yeasts and filamentous fungi. They are widely used in detergents, laundry formulations, household cleaning products, cosmetics, herbicides and pesticides, as well as in the food, pharmaceutical, textile, paper and petroleum industries, among others. Bacillus species produce a broad spectrum of lipopeptide biosurfactants. Among them, surfactin, a lipoheptapeptide produced by Bacillus subtilis strains, is one of the most effective biosurfactants known.
Biosurfactants have become the focus of extensive research and industrial application because they present many advantages, such as high environmental compatibility, biodegradability, specific activity at extreme temperature, pH and salinity, and the possibility of being synthesized from renewable feedstocks.
The use of biosurfactants is not yet widespread because of the costs involved in production and purification. Biotechnological processes for microbial surfactant production should therefore be based on supplementing the culture broth with cheap substrates, such as waste or by-products from agro-industry, to make commercialization possible. Thus, in order to reduce production costs, biosurfactant production by Bacillus strains has been studied using different substrates, such as molasses, cashew apple juice, residual glycerol, pineapple processing residue and the agro-industrial by-product corn steep liquor. However, although several kinds of agro-industrial waste have been evaluated as substrates for biosurfactant production, waste from the candy industry has not yet been evaluated.
Waste from the candy industry consists mainly of sugars (glucose, sucrose and fructose), natural colorings, flavorings and anti-wetting agents. For proper disposal, this waste must pass through primary and secondary treatments. The primary treatment is physical-chemical and comprises a static sieve and a settling tank. The secondary treatment is biological and comprises anaerobic stabilization ponds, an activated sludge reactor and a settler. These treatments are costly and cumbersome because of the high investment in equipment. Using this waste as a raw material in biosurfactant production is attractive because it adds value to the residue and lowers production costs, since no heat treatment is required. Therefore, using candy industry waste for biosurfactant production is of interest from both an economic and an environmental point of view.
Response surface methodology (RSM) is a classical method for developing models through regression coefficients, whose significance is established by analysis of variance. This statistical approach is widely used. RSM models the relationship between factors within the experimental domain by least squares. In most cases, this yields models that are sensitive to variation in experimental error and that may not describe the experimental data well. Alternatively, artificial neural networks (ANN) can be used to improve steady-state predictions, and they have several advantages over statistical methods. In comparison with statistical models, ANN have been applied successfully to the modeling and optimization of processes.
In this context, this study aims to identify the maximum biosurfactant production by Bacillus subtilis fermentation using alternative substrates, i.e., glycerol from the biodiesel production process combined with waste from the candy industry. The interactions between the waste concentrations were assessed by experimental design strategies. RSM and ANN analyses of the optimum points were carried out, and models were developed to predict dry weight and crude biosurfactant concentrations. The crude biosurfactant produced was used in an oil spreading test to indicate potential applications in remediation.
2. Material and Methods
2.1. Inoculum Preparation and Standardization
Bacillus subtilis CBMAI 369 (ATCC) was obtained from the Brazilian Collection of Environmental and Industrial Microorganisms at the Research Center for Chemistry, Biology and Agriculture (CPQBA), State University of Campinas, São Paulo, Brazil. The culture was maintained in Nutrient Broth (Difco). Initially, a pre-inoculum was prepared in 15 mL of Nutrient Broth in a 50 mL Erlenmeyer flask and incubated in an orbital shaker for 6 h at 37 °C and 100 rpm. Then, the inoculum (100 mL of sterile Nutrient Broth in a 250 mL Erlenmeyer flask) received the pre-inoculum culture (10 mL) and was incubated for 16 h under the same conditions.
2.2. Biomass and Crude Biosurfactant Production
At the end of the assays, a 30 mL sample of the culture broth was centrifuged (10,000 rpm, 10 min, 4 °C). The biomass obtained was dried at 50 °C for 24 h and weighed.
The biosurfactant produced was precipitated from the cell-free supernatant by acidification to pH 2.0 with 6 N HCl and held at 7 °C overnight. It was then centrifuged (10,000 rpm, 10 min, 4 °C). The supernatant was discarded, and the precipitate was washed with acidified water and saved. All assays were performed in duplicate.
2.3. Application of Crude Biosurfactant in Oil Spreading
Oil spreading was evaluated, following a previously described method, by adding 20 mL of distilled water to a Petri dish followed by 50 µL of oil on its surface. Then, 40 µL of cell-free culture broth was dropped onto the crude oil surface, and the diameter of the clear zone produced on the oil surface was measured and compared with a negative control (culture medium).
2.4. Response Surface Methodology (RSM)
Biosurfactant production was investigated using the following waste substrates: waste of candy industry (X1) and glycerol from biodiesel production (X2). An experimental design was used to find the optimal conditions for biosurfactant production. All designs were developed and analyzed with STATISTICA 7 software, based on the Shapiro-Wilk and Kolmogorov-Smirnov tests, p-values and analysis of variance. The desired responses were dry weight (g/L) and crude biosurfactant (mg/L). To evaluate the combined effect of the two medium components, a 2² central composite rotatable design with 3 center points and 4 axial points, totaling 11 runs, was used, according to Table 1.
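The coded layout of such a design can be sketched in a few lines of Python. This is only an illustration of the standard two-factor rotatable construction, assuming the usual axial distance alpha = (2²)^(1/4) ≈ 1.41; the function name `ccrd_two_factors` and the run order are hypothetical and need not match Table 1.

```python
import itertools

def ccrd_two_factors(n_center=3):
    """Return the coded runs of a two-factor central composite rotatable design."""
    n_factors = 2
    # Rotatability condition: alpha = (number of factorial points) ** (1/4).
    alpha = (2 ** n_factors) ** 0.25
    # Factorial part: every combination of the -1 and +1 levels.
    factorial = [list(p) for p in itertools.product((-1.0, 1.0), repeat=n_factors)]
    # Axial part: one factor at -alpha or +alpha, the other at the center.
    axial = [[-alpha, 0.0], [alpha, 0.0], [0.0, -alpha], [0.0, alpha]]
    # Replicated center points (assumed order: factorial, axial, center).
    center = [[0.0, 0.0] for _ in range(n_center)]
    return factorial + axial + center

design = ccrd_two_factors()
for run, (x1, x2) in enumerate(design, start=1):
    print(f"run {run:2d}: X1 = {x1:+.2f}, X2 = {x2:+.2f}")
```

The four factorial, four axial and three center runs give the 11 runs stated in the text.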
The experiments were performed with 100 mL of fermentation medium in 250 mL Erlenmeyer flasks in an orbital shaker at 100 rpm and 37 °C for 96 h. The values of the dependent responses (dry weight and crude biosurfactant) were the mean of two replicates.
2.5. RSM Models
A second-order polynomial regression (Equation (1)) was used in this study to estimate all main and joint effects, while the central and axial points provided replication and curvature terms in the model:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{12} X_1 X_2 \tag{1}$$

where $Y$ is the predicted response, $X_1$ and $X_2$ are the input variables known to affect the response, and $\beta_0$, $\beta_1$, $\beta_2$, $\beta_{11}$, $\beta_{22}$ and $\beta_{12}$ are the constant, linear, quadratic and interaction coefficients of the effects. Analysis of variance (ANOVA) was used to validate the RSM model.
The ANOVA tables were built from the second-order polynomial coefficients, and a p-value < 0.1 was used as the criterion for statistical significance.
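As a rough illustration of how a second-order model with linear, quadratic and interaction terms can be fitted by least squares and summarized by R² and a regression F statistic, consider the sketch below. It is not the STATISTICA procedure used in the study: the response values, `TRUE_BETA` and `NOISE` are invented for the example, the full six-term model is fitted without the re-parameterization described later, and the helper names are hypothetical.

```python
import itertools

ALPHA = 2 ** 0.5  # axial distance of a rotatable two-factor design

def design_points():
    """Coded CCRD points: 4 factorial, 4 axial and 3 center runs."""
    factorial = list(itertools.product((-1.0, 1.0), repeat=2))
    axial = [(-ALPHA, 0.0), (ALPHA, 0.0), (0.0, -ALPHA), (0.0, ALPHA)]
    return factorial + axial + [(0.0, 0.0)] * 3

def quadratic_terms(x1, x2):
    """Regressors of the full second-order model for two coded factors."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_rsm(points, y):
    """Return least-squares coefficients, R^2 and the regression F statistic."""
    X = [quadratic_terms(x1, x2) for x1, x2 in points]
    p = len(X[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    pred = [sum(b * t for b, t in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    df_reg, df_res = p - 1, len(y) - p
    r2 = 1.0 - ss_res / ss_tot
    f_cal = ((ss_tot - ss_res) / df_reg) / (ss_res / df_res)
    return beta, r2, f_cal

# Hypothetical response: a known quadratic surface plus small fixed perturbations.
TRUE_BETA = [10.0, -3.0, 2.0, -1.5, -1.0, 1.0]
NOISE = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0]
points = design_points()
y = [sum(b * t for b, t in zip(TRUE_BETA, quadratic_terms(*pt))) + e
     for pt, e in zip(points, NOISE)]
beta, r2, f_cal = fit_rsm(points, y)
print("coefficients:", [round(b, 3) for b in beta])
print(f"R^2 = {r2:.4f}, F_cal = {f_cal:.1f}")
```

With 11 runs and six coefficients, the regression and residual degrees of freedom are 5 and 5, so `f_cal` would be compared with a tabulated F value for those degrees of freedom at the chosen significance level.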
2.6. Modeling with Artificial Neural Network (ANN)
Table 1. Values used in central composite rotatable design (CCRD).
ANN was used to obtain the relationship between the media components (X1 and X2) and the dependent variables (dry weight and crude biosurfactant) through a steady-state model. The experimental data were divided into three sets: training (60%), test (20%) and validation (20%), to avoid over-parameterization. Input and output values were normalized between −1 and 1 to avoid numerical overflow. Hyperbolic tangent, logistic and linear functions were used as activation functions in the hidden and output layers.
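The normalization and data partition described here can be sketched as follows. This is a minimal illustration, assuming a linear min-max mapping to [−1, 1] and a random shuffle before splitting; the helper names, the random seed and the example levels are hypothetical and are not taken from the study.

```python
import random

def scale_to_range(values):
    """Linearly map values to [-1, 1] (minimum -> -1, maximum -> +1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def split_indices(n, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle indices 0..n-1 and split them into training, test and validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_test = round(fractions[1] * n)
    return (indices[:n_train],
            indices[n_train:n_train + n_test],
            indices[n_train + n_test:])

waste = [0.0, 0.9, 1.8, 2.7, 3.6]  # hypothetical concentration levels, % v/v
scaled = scale_to_range(waste)
train_idx, test_idx, val_idx = split_indices(20)
print(scaled)
print(len(train_idx), len(test_idx), len(val_idx))
```

Predictions made on the scaled outputs have to be mapped back with the inverse transformation before they are compared with measured values.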
The goal is reached when a network performs as well on the validation set inputs as on the training set inputs. Training an ANN consists of adjusting the weights to minimize the error between the observed and predicted outputs. Training was carried out with the following algorithms: trainlm, which updates weight and bias values according to Levenberg-Marquardt optimization; traingdx, which updates weight and bias values according to gradient descent with momentum and an adaptive learning rate; trainbr, which updates weight and bias values according to Levenberg-Marquardt optimization while minimizing a combination of squared errors and weights, a process called Bayesian regularization; traincgb, which updates weight and bias values according to conjugate gradient backpropagation with Powell-Beale restarts; and trainoss, which updates weight and bias values according to the one-step secant method.
The number of neurons in the hidden layer was defined from the number of neurons in the input layer and kept fixed, to avoid increasing the number of effective parameters.
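To make the idea of adjusting weights and biases to reduce the output error concrete, the sketch below trains a small network with two inputs, four hidden neurons and one output. It is an assumption-laden illustration: plain stochastic gradient descent stands in for the training algorithms named above, the hidden layer uses a hyperbolic tangent with a linear output, all points are used for training without a test or validation split, and the inputs, targets, learning rate and function names are invented for the example.

```python
import math
import random

def init_network(n_in=2, n_hidden=4, seed=1):
    """Small random weights for a 2-4-1 network (hidden tanh, linear output)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return w1, b1, w2, 0.0

def forward(net, x):
    """Return the hidden activations and the network output for one input."""
    w1, b1, w2, b2 = net
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2

def network_mse(net, xs, ts):
    """Mean squared error between targets and network outputs."""
    return sum((t - forward(net, x)[1]) ** 2 for x, t in zip(xs, ts)) / len(xs)

def train(net, xs, ts, lr=0.05, epochs=2000):
    """Stochastic gradient descent on the squared error, one pattern at a time."""
    w1, b1, w2, b2 = net
    for _ in range(epochs):
        for x, t in zip(xs, ts):
            hidden, y = forward((w1, b1, w2, b2), x)
            err = y - t  # derivative of 0.5 * (y - t)^2 with respect to y
            for j, h in enumerate(hidden):
                # Backpropagate through the tanh unit before updating w2[j].
                grad_h = err * w2[j] * (1.0 - h * h)
                w2[j] -= lr * err * h
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * grad_h * xi
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return w1, b1, w2, b2

# Hypothetical normalized inputs (coded design points scaled to [-1, 1]).
s = 1.0 / 2 ** 0.5
xs = [(-s, -s), (-s, s), (s, -s), (s, s),
      (-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0), (0.0, 0.0)]
# Hypothetical normalized targets from an invented smooth response.
ts = [0.6 * x2 - 0.4 * x1 - 0.3 * x1 * x1 for x1, x2 in xs]

net = init_network()
mse_before = network_mse(net, xs, ts)
net = train(net, xs, ts)
mse_after = network_mse(net, xs, ts)
print(f"MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```

Second-order methods such as Levenberg-Marquardt typically need far fewer iterations on problems of this size, which is one reason they are a common default for small networks.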
Model performance was evaluated by the coefficient of determination (R²), and the statistical index curves were analyzed through the mean squared error (MSE), defined according to Equation (2):

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - y_i\right)^2 \tag{2}$$

where $N$ represents the total number of patterns in the corresponding set (training), $t_i$ represents the ith neural network target (observed data) and $y_i$ represents the ith neural network response (predicted data).
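Both performance indices can be computed directly from paired observed and predicted values, as in the sketch below. It assumes R² is taken as one minus the ratio of the residual to the total sum of squares, which may differ from the squared correlation reported by some regression plots; the function names and the four data pairs are hypothetical.

```python
def mean_squared_error(targets, outputs):
    """Average of the squared differences between targets and outputs."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)

def r_squared(targets, outputs):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - y) ** 2 for t, y in zip(targets, outputs))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

observed = [1.0, 2.0, 3.0, 4.0]    # hypothetical observed data
predicted = [1.1, 1.9, 3.2, 3.8]   # hypothetical model responses
mse_value = mean_squared_error(observed, predicted)
r2_value = r_squared(observed, predicted)
print(f"MSE = {mse_value:.3f}, R^2 = {r2_value:.3f}")
```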
3. Results and Discussion
3.1. Biosurfactant Production Investigation
In the present work, the best culture broth for biosurfactant production was determined from the relationship between dry weight and crude biosurfactant (responses). For that purpose, and given the nature of the fermentation process, a central composite rotatable design (CCRD) was used to investigate dry weight and crude biosurfactant and to determine the significance of the process parameters and their interactions. A 2² CCRD with three central points and four axial points, totaling 11 runs, was used. The assays of the experimental design matrix are shown in Table 2. The complex nature of the biological process, especially when waste substrates are used, can be seen in the standard deviation of crude biosurfactant in assays 3 and 5.
Table 2. CCRD combinations of factors and the response variables.
The results in the table indicate that biosurfactant was produced under conditions 3 and 5. This suggests that the composition of the culture broth affected microbial growth through the presence of some element in the combined assays. When the concentration of waste of candy industry increased, the responses were zero, indicating that excess glucose negatively affected biosurfactant production. Previous work that examined different glucose concentrations concluded that 40 g/L was the best concentration and that biosurfactant production decreased significantly at higher glucose concentrations.
Assay 5 was the only assay without waste of candy industry that produced biosurfactant. This is probably because the glycerol (from biodiesel produced from soybean oil) served as a source of both carbon and minerals (calcium, phosphorus, magnesium and sodium).
The waste of candy negatively affected biosurfactant production (Table 2) for the two responses studied. This negative influence may be explained by the excess glucose (from the waste of candy) in the culture broth, which inhibited microbial growth. Previous work has confirmed that increasing the glucose concentration negatively affects biosurfactant production. On the other hand, raw glycerol had positive effects on dry weight and crude biosurfactant, which suggests increasing its concentration. The interaction between the variables (1L by 2L) had a positive effect on both responses, showing that their combination is important: the best responses were reached with waste of candy at its lowest levels (−1.41 to −1) and raw glycerol at its highest levels (+1 to +1.41).
Based on these results, the matrix was evaluated and the regression coefficients were calculated with a p-value limit of 0.10. To describe the behavior of dry weight and crude biosurfactant, two models were fitted through re-parameterization, keeping them as simple as possible, with the fewest possible parameters, without losing accuracy (Equations (3) and (4)).
Analysis of variance (ANOVA) was performed to assess confidence in the models generated for dry weight and crude biosurfactant (Table 3).
The ANOVA shows that the models are valid and highly significant, as is evident from Fisher's F-test: they explain 86.72% of the behavior of the variables for dry weight and 90.81% for crude biosurfactant, and Fcal is three and almost five times larger than Ftab, respectively. The models were therefore considered acceptable.
The response surface represents the optimization domain of the statistical model. Figure 1 shows the response surfaces obtained in this study for dry weight and crude biosurfactant, together with the contour curves.
Figure 1. Response surface and contour curves graphs: (a) dry weight predictions; (b) crude biosurfactant predictions.
Table 3. Analysis of variance (ANOVA) for the dry weight and crude biosurfactant.
DW: F(4; 6; 0.10) = 3.18; coefficient of determination: R² = 86.72%. CB: F(4; 6; 0.10) = 3.18; coefficient of determination: R² = 90.81%.
Even though the models showed good agreement, the optimal point was investigated further based on the conditions identified in the first matrix (Table 2): waste of candy was varied from 0% to 3.6% v/v and raw glycerol from 15% to 25% v/v. Thus, another experimental domain was evaluated, according to Table 4.
The matrix for the new investigation scenario is shown in Table 5.
The changes made to the experimental domain yielded non-zero responses, unlike the zero responses seen previously. In the new CCRD results, assay 2 showed the highest crude biosurfactant value (around 670 mg/L) and assay 6 the highest dry weight (around 43.21 g/L).
Based on this matrix, the regression coefficients were calculated with a p-value limit of 0.10, which allowed polynomial models to be evaluated. As before, two models for dry weight and crude biosurfactant were fitted through re-parameterization, keeping them as simple as possible, with the fewest possible parameters, without losing accuracy (Equations (5) and (6)).
Table 4. Values used in central composite rotatable design (CCRD).
Table 5. CCRD combinations of factors and the response variable.
The polynomial models for these new scenarios were then assessed by ANOVA. Table 6 shows the calculated values.
The ANOVA of the models (dry weight and crude biosurfactant) gave F-test values of 0.17 and 4.12, which were not suitable for the models. These results indicated that the regression models were not significant, because the lack of fit showed high values. The fit of the models was evaluated by the coefficient of determination, with R² values of 0.88 and 0.73, confirming that the models did not agree well with the data. Although these results are not promising, the response surfaces can still indicate where the optimal point lies (Figure 2).
The CCRD results can be checked against other models; for this purpose, strategies using an artificial neural network (ANN) as a predictive model were developed.
Table 6. Analysis of variance (ANOVA) for the dry weight and crude biosurfactant.
DW: F(5; 5; 0.10) = 3.45; coefficient of determination: R² = 88.41%. CB: F(4; 6; 0.10) = 3.18; coefficient of determination: R² = 73.31%.
Figure 2. Response surface and contour curves graphs: (a) dry weight predictions; (b) crude biosurfactant predictions.
3.2. ANN-Based Modeling
The experiments used as input data for developing the ANN-based models are given in Table 5 through the CCRD combinations. The experiments were conducted in duplicate; thus, the total data set of 33 points was divided into a training set of 25 and a test set of 8 data points. The outputs of the models were dry weight and crude biosurfactant (Table 5), which express the functional relationship between the media components (waste of candy and glycerol) and biosurfactant production. The number of neurons in the hidden layer was fixed at 4 in every modeling case so that the number of effective parameters did not exceed the number of input vectors, reducing the risk of overfitting. All ANN topologies were 2-4-1. Different training algorithms were implemented, as seen in Figures 3-12 (prediction, dispersion and regression graphs). Figures 3-7 show all the model-prediction conditions for dry weight (g/L) using logsig as the activation function.
Although most modeling cases showed good correlation coefficients, the case in Figure 3 was chosen, with R² of 0.998 and MSE of 0.1579. The MSE was considered small and of comparable magnitude to the average prediction error (across all dry weight predictions), which suggests that the model has good approximation and generalization characteristics.
Figure 3. Predicted data and regression graph of test ANN using 2 × 4 × 1 topology and trainlm algorithm.
Figure 4. Predicted data and regression graph of test ANN using 2 × 4 × 1 topology and traingdx algorithm.
Figure 5. Predicted data and regression graph of test ANN using 2 × 4 × 1 topology and trainbr algorithm.
Figure 6. Predicted data and regression graph of test ANN using 2 × 4 × 1 topology and traincgb algorithm.
Figure 7. Predicted data and regression graph of test ANN using 2 × 4 × 1 topology and trainoss algorithm.
Figure 8. Dispersion and regression graph of test ANN using 2 × 4 × 1 topology and trainlm algorithm.
Figure 9. Dispersion and regression graph of test ANN using 2 × 4 × 1 topology and traingdx algorithm.
Figure 10. Dispersion and regression graph of test ANN using 2 × 4 × 1 topology and trainbr algorithm.
Figure 11. Dispersion and regression graph of test ANN using 2 × 4 × 1 topology and traincgb algorithm.
Figure 12. Dispersion and regression graph of test ANN using 2 × 4 × 1 topology and trainoss algorithm.
Figures 8-12 show all the model-prediction conditions for crude biosurfactant using tansig as the activation function.
The second model was chosen as before, by the best values of correlation coefficient and MSE. The case plotted in Figure 9, an ANN with 2 × 4 × 1 topology and the traingdx algorithm, was selected, with R² of 0.982 and MSE of 0.067.
The performance of both ANN models was consistent, giving similar predicted and observed values. These results are important because they clearly show that the waste of candy and glycerol concentrations (v/v) are sufficient and representative input variables for prediction. To demonstrate the steady-state prediction performance, the ANN and RSM predictions are shown together (Table 7).
The prediction performance of the ANN models on the experimental design data set confirms their superior generalization capacity compared with the RSM models. The analysis showed that neural modeling is a useful tool for accurately modeling the two dependent variables: the sums of errors were 2.30 and 88.48 for the dry weight and crude biosurfactant predictions, whereas for the RSM models they were 43.40 and 560.50, respectively.
A similar strategy was developed in a previous study to investigate bioethanol production, in which RSM and ANN models were used for bioethanol yield and volume fraction. The results showed that ANN fitted the data better than RSM, with correlation coefficients of 1 and 0.98 and absolute average deviations of 0.09% and 1.67%, respectively.
Table 7. Experimental values and model-predicted values of dry weight (DW) and crude biosurfactant (CB).
3.3. Validation in Optimal Points
The optimum values were found to be 3.2% (v/v) waste of candy and 16% (v/v) raw glycerol. The maximum dry weight and crude biosurfactant under these optimum conditions were 25.60 ± 5.0 g/L and 668 ± 40 mg/L, respectively. The models were then compared with the observed data. The RSM models predicted 33.36 g/L of dry weight and 731.24 mg/L of crude biosurfactant, while the ANN models predicted 27.45 g/L and 671.56 mg/L, respectively. The validation experiments confirm that ANN models are a powerful approach for predicting the steady-state behavior of biosurfactant production, because their predictions fall within the experimental error.
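The comparison with the experimental error can be checked with simple arithmetic on the values reported above. The sketch below assumes that the ± values are treated as a symmetric tolerance band around the observed mean; the function and variable names are illustrative only.

```python
def within_experimental_error(observed, tolerance, predicted):
    """True if the prediction lies inside observed +/- tolerance."""
    return abs(predicted - observed) <= tolerance

# Observed means, +/- values and model predictions as reported in the text.
validation = {
    "dry weight (g/L)": {"observed": 25.60, "tol": 5.0, "RSM": 33.36, "ANN": 27.45},
    "crude biosurfactant (mg/L)": {"observed": 668.0, "tol": 40.0, "RSM": 731.24, "ANN": 671.56},
}

results = {}
for response, v in validation.items():
    for model in ("RSM", "ANN"):
        deviation = abs(v[model] - v["observed"])
        inside = within_experimental_error(v["observed"], v["tol"], v[model])
        results[(response, model)] = inside
        print(f"{response:27s} {model}: deviation = {deviation:6.2f}, within error: {inside}")
```

Under this reading, the ANN deviations (1.85 g/L and 3.56 mg/L) are inside the reported ± ranges, while the RSM deviations (7.76 g/L and 63.24 mg/L) are not.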
Fermentation processes are very complex, especially when waste substrates are used. It is believed that the RSM models lacked statistical significance because of the large variation in experimental error and the high non-linearity. ANNs are known for the accuracy, generalization ability and robustness of their models, so their use is more appropriate in this type of study.
It is important to highlight that, in this study, biosurfactant production using only alternative sources (waste of candy and glycerol from the biodiesel process) gave results similar to those of other studies that used synthetic culture broth. In one such study, the authors evaluated biosurfactant production by Bacillus subtilis through response surface methodology, using glucose, K2HPO4 and urea as factors. The results showed a maximum predicted biosurfactant concentration of 2.93 g/L, and the experimental result was 3.1 g/L. Several works deal with biosurfactant production by incorporating waste into a synthetic culture broth. For example, biosurfactant has been produced by Bacillus subtilis LAMI005 using residual glycerol from biodiesel production as a carbon source. The culture medium was (in g/L): (NH4)2SO4 (1.0); Na2HPO4∙7H2O (7.2); KH2PO4 (3.0); NaCl (2.7); MgSO4∙7H2O (0.6); glycerol (20.0).
3.4. Application of Crude Biosurfactant in Oil Spreading
In order to confirm the presence of biosurfactant under the optimum condition, experiments simulating the recovery of oil spread on water were conducted (Figure 13).
The results indicate potential applications for the biosurfactant produced. There is little information in the literature about oil displacement areas produced by biosurfactants from Bacillus subtilis. Nevertheless, a larger clear zone was observed when biosurfactant was added than with the negative control. Biosurfactant produced by Bacillus subtilis has previously been tested in oil spreading applications, and the oil displacement area formed on addition of biosurfactant produced by Cunninghamella echinulata has also been examined.
Figure 13. Oil spreading test: (a) sample without negative control; (b) sample with negative control; (c) sample without crude biosurfactant; (d) sample with crude biosurfactant.
4. Conclusions
In order to identify biosurfactant production, an experimental central composite rotatable design (CCRD) was performed, evaluating the interactions between two alternative residues (waste of candy industry and glycerol from the biodiesel process) without supplementation. The responses were dry weight (g/L) and crude biosurfactant (mg/L) after 96 h of fermentation. RSM and ANN models were employed to predict these responses for the experimental matrix. ANN provided more accurate predictions than RSM, as shown by the higher R² and lower sum of errors of the predicted values. The validation results at the optimum points were similar to the values predicted by the ANN models. To our knowledge, this is the first study to report the combined use of two substrates, waste of candy and glycerol from biodiesel, for biosurfactant production, and to develop a multiple-criteria analysis based on statistical and intelligent modeling. An application in the remediation of oil spreading was simulated, and the crude biosurfactant was able to produce a clear zone. Additionally, all the results indicate that waste can be used successfully and in an environmentally sound way. However, several aspects remain to be investigated, such as scale-up assays under the best conditions, the addition of other wastes, and the influence of oxygen and kinetic parameters.
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq-National Council of Technological and Scientific Development).
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES-Coordination for the Improvement of Higher Education Personnel).