Laser welding (LW) is a technique for joining two or more metal or alloy parts by means of a laser beam. Widely used in high-volume production, such as in the automotive industry, LW offers several benefits, including deep penetration, a reduced heat affected zone, high welding rates and good precision. To exploit these benefits properly, it is necessary to develop a comprehensive strategy to control the process so as to produce suitable weld characteristics without resorting to traditional and tedious trial and error procedures. Traditional industrial practice consists of running a number of experiments in which one welding parameter is varied at a time in order to evaluate its effect. The parameters that have the greatest effects are then used to control the process. However, because the process parameters are interrelated through nonlinear relationships, this procedure cannot lead to convincing results despite the prohibitive number of experiments, which entails excessive time and cost. These problems can be avoided if an appropriate prediction model is designed. For this reason, many studies use finite element method (FEM) models to predict the weld pool and to develop a better understanding of the process behaviour by revealing what is happening inside the part. The multi-physical nature of the laser process is the main difficulty, as many phenomena are coupled and different scales of physics interact. LW models therefore become very sophisticated, with many phenomena to be considered, which inevitably leads to very long computation times. The most recent simulation models, which include fluid flow, plasma and vapour, have produced a good prediction of the keyhole shape.
Nevertheless, although the LW process has become better understood thanks to the many studies on related phenomena, many phenomena remain to be studied and many uncertainties remain to be overcome.
The use of artificial neural networks (ANN) in modelling LW has been the subject of several studies. Sathiya et al. developed an ANN based model to predict the weld geometry and tensile strength of laser welded butt joints of AISI 904L. The ANN was used to establish the relationship between the welding parameters (power, speed and focal position) and the weld geometry with three different shielding gases (argon, helium and nitrogen). The proposed model was then used to optimize the process parameters with a genetic algorithm. The modelling results were in good agreement with the experimental results. Olabi used an ANN to predict penetration depth, fused zone width and heat affected zone width when welding medium carbon steel with a CO2 laser. In this study, the ANN is used to provide additional data to complete an L9 Taguchi design with laser power, speed and focus position as parameters. The ANN model is then used to simulate the optimal solution established with the Taguchi method. Although the proposed network is trained with only 14 data points, the predictions are still in good agreement with the targeted results. Iskander et al. studied the use of ANN to predict the depth and width of the weld pool for pulsed Nd:YAG LW of aluminium. Welding speed, welding power, laser pulse energy and laser pulse duration are considered as process parameters. An estimation procedure has also been developed to convert the dimensions of the weld pool into a weld profile based on the actual experimental weld profile. In other studies, the capacity and adaptability of ANN based prediction models were evaluated on conduction mode and keyhole mode welds. However, the reliability of the ANN was relatively limited. An ANN model can accurately predict significant changes in weld pool profile, such as the change between conduction mode and keyhole mode, only if it is trained with good and appropriate data sets.
In his work, Jeng used an ANN with learning vector quantization to predict laser power and welding speed from the thickness and the gap. The results demonstrated that the developed model successfully predicted the welding parameters giving the intended thickness and gap. The model also assesses the weld quality by estimating the width, undercut and distortion. Chang applied a combined FEM and neural network model to predict the weld bead shape with gap for overlap Nd:YAG LW of 304 stainless steel. The FEM model is used to determine the bead dimensions of the part without gap, which are then used as inputs to the ANN to predict the bead dimensions with gap. Three different combinations of process parameters are considered as inputs for the ANN in order to select the appropriate variable configuration. Type 1 uses all the parameters as inputs (focal length, energy, pulse time, sheet thickness, gap size, and penetration depth and nugget size without gap), type 2 uses 4 inputs and type 3 uses only 3 of the 4 inputs of type 2. The network is trained with 100 experimental data points. The results show less than 10% error and indicate that the combination of FEM and ANN can be used to predict weld shape accurately. The type 2 and type 3 models achieve slightly better results, showing that more parameters is not always the best option. There are no other relevant studies evaluating the impact of each welding parameter on the accuracy and robustness of ANN based predictive modelling.
Consequently, when a prediction model is needed, ANN provides fast results and therefore offers many advantages, especially for computationally intensive predictions and real-time applications where FEM based models are very slow and poorly suited. ANN models have been used successfully to model many welding processes, including LW. However, the application of ANN to laser processes remains relatively limited. Producing an accurate ANN model requires a very large amount of data to ensure efficient ANN learning and validation. Generating the needed data by experiment is rather long and expensive. Therefore, once experimentally validated, 3D FEM based simulation models can be used to generate acceptable and cost-effective ANN learning data.
The objective of this paper is to present a structured and comprehensive approach developed to design an effective ANN based model for predicting weld shape and dimensions (WSD) in LW of galvanized steel in butt joint configurations using a 3 kW Nd:YAG LW system. The proposed approach examines LW parameters and conditions known to influence the geometric characteristics of the welds and builds a quality prediction model step by step. The modelling procedure is based on a structured experimental investigation and exhaustive 3D FEM simulation efforts in order to identify the possible relationships between LW parameters (laser power, welding speed, fibre diameter and gap) and the weld geometrical characteristics, such as depth of penetration (DOP) and bead width (BDW), and the sensitivity of these relationships to the welding process conditions. Using the results of experiments, 3D simulation and various statistical analyses, several prediction models are developed and evaluated. To carry out the model building procedure, an efficient modelling planning method combining neural networks, a multi-criteria assessment and various statistical analysis tools is adopted.
2. Proposed Modelling Strategy
Welding operations are dynamic processes with various nonlinearities and stochastic disturbances. The difficulty of building an effective prediction model lies in the selection of the appropriate modelling technique and of the variables to be included in the model. These choices represent the basic and crucial ingredients of any modelling methodology. Selecting the model form and the modelling technique is not sufficient to produce the best model. Since deterministic models are typically valid only for a limited range of welding conditions, ANNs present the best modelling alternative. While various neural techniques can be used in this approach, a multilayer network appears to be one of the most appropriate options for this type of application. In order to determine efficiently and economically the best combination of variables to be included in the model, a structured design of experiments is used as a basis for the modelling procedure. The selection of the best combination of variables is centred on comparing a complete model containing all variables with various models containing a reduced number of variables. This process can be achieved by: 1) building a sufficient number of models, each designed with a subset of specifically selected variables, 2) evaluating the modelling and prediction performance of these models according to specific criteria, and finally, 3) estimating the effect of each modelling variable on the performance of the designed models, in terms of its contribution to reducing the modelling, validation and prediction errors, by using appropriate statistical tools.
Many criteria can be used to assess whether a reduced model adequately represents the relationship between WSD and the LW parameters under various welding conditions. Measuring the performance of fitted models is based on the principle of reducing several statistical criteria. These include the residual sum of squared errors (SSE), the residual mean square error (MSE), the total squared error (Mallows' Cp), and the coefficient of determination (R2). For the majority of modelling techniques, the model is determined by minimizing the SSE. For a fixed number of variables, the criteria MSE, Cp and R2 are all linear functions of the SSE, so the combination of variables that minimizes the SSE also minimizes MSE and Cp and maximizes R2. Among these criteria, R2 does not have an extreme value and increases gradually as the number of variables in the model increases. Thus, the use of R2 as a criterion for the selection of variables leaves room for some subjectivity. If p variables among q variables are selected, the residual mean square is MSEp = SSEp/(n − p − 1), where n is the total number of observations. The terms SSEp and n − p − 1 both decrease as the number of independent variables p increases. Therefore, MSEp can exhibit an extreme value. In this study, the criteria used to evaluate the models consist of minimizing the training residual mean square error (MSEt), the validation residual mean square error (MSEv) and the total residual mean square error (MSEtot) for each WSD attribute.
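The criteria discussed above can be illustrated with a minimal sketch. The function name `fit_criteria`, its arguments and the sample values are hypothetical and are not taken from the study; Mallows' Cp is computed with the usual textbook form SSEp/s² − n + 2(p + 1), where s² is assumed to be the residual mean square of the full model.

```python
def fit_criteria(y_true, y_pred, p, mse_full=None):
    """Return SSE, MSEp, R2 and (optionally) Mallows' Cp for a model with p inputs.

    y_true, y_pred : measured and predicted values (same length n)
    p              : number of input variables in the reduced model
    mse_full       : residual mean square of the full model (assumed known);
                     Cp is only computed when it is supplied
    """
    n = len(y_true)
    mean_y = sum(y_true) / n
    sse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    sst = sum((yt - mean_y) ** 2 for yt in y_true)
    result = {
        "SSE": sse,
        "MSE": sse / (n - p - 1),   # MSEp = SSEp / (n - p - 1)
        "R2": 1.0 - sse / sst,
    }
    if mse_full is not None:
        result["Cp"] = sse / mse_full - n + 2 * (p + 1)
    return result


# Hypothetical values, for illustration only
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
criteria = fit_criteria(measured, predicted, p=2, mse_full=0.04)
```

Because n − p − 1 shrinks as p grows, adding a variable that barely reduces the SSE increases MSEp, which is why MSEp can show a minimum while R2 cannot.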
In order to rapidly extract a cost-effective and optimized combination of variables to be included in the WSD prediction model, an efficient experimental design method is used. Using a full factorial design, an appropriate model can be designed by selecting the most sensitive group of variables, that is, those that show a high correlation with WSD. The model building procedure can be summarized in the following steps: 1) Collect data to train and verify the models. All parameters and conditions that may influence the process must be identified and considered; 2) Select the modelling technique and the performance criteria; 3) Select the appropriate design matrix for the required number of models. Rows of the matrix correspond to models and columns represent the variables to be included in each model. Every entry in the matrix is a value of 1 or 0 indicating whether or not the variable is included in the model; 4) Train and test the generated models and evaluate their performance according to the selected criteria; 5) Determine the effect of each variable on every performance index. These effects can be considered as the rates of reduction of the MSE values when a variable is included in the fitted model compared with when it is not. Using these results, the variables that contribute significantly to improving the models by reducing the errors are selected, and the others are rejected; 6) Determine the final model configuration. Once the variables providing the best information on the WSD are identified, the models can be built.
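The design matrix of step 3 can be sketched as follows for the four candidate variables considered in this paper. This is an illustrative sketch only; the names `VARIABLES`, `full_factorial_design` and `selected_inputs` are hypothetical.

```python
from itertools import product

# Candidate input variables (from the paper); the identifiers are illustrative
VARIABLES = ("power", "speed", "diameter", "gap")


def full_factorial_design(variables):
    """Return one row per model; each entry is 1 (variable included) or 0 (excluded)."""
    return [dict(zip(variables, flags))
            for flags in product((0, 1), repeat=len(variables))]


def selected_inputs(row):
    """List the variables that a given design row includes in its model."""
    return [name for name, flag in row.items() if flag == 1]


design = full_factorial_design(VARIABLES)   # 2**4 = 16 rows
for row in design:
    print(tuple(row.values()), selected_inputs(row))
```

With two levels per variable, the number of models doubles with each additional candidate variable, so a full factorial design remains practical only for a small set of candidates such as the four used here.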
2.1. Artificial Neural Network Modeling
While various ANN models can be used in this approach, a multilayer feed-forward neural network seems to be one of the most appropriate choices because of its simplicity and flexibility. As shown in Figure 1, a neural network consists of N neurons, each of which is connected to the neurons of the adjacent layers. A threshold value θj,l is associated with each neuron. The output of each neuron is determined by the level of the input signal in relation to the threshold value. These signals are modified by the connection weights Wi,j,l (also called synaptic strengths) between the neurons.
Figure 1. Simple computational elements of the multilayer feed-forward neural network.
Let Ij,l be the input to the jth neuron on layer l. This input, which determines the output of the neuron, is given by:
$$I_{j,l} = \sum_{i=1}^{n_{l-1}} W_{i,j,l}\, O_{i,l-1}$$
where Oi,l−1 is the output of the ith processing neuron of layer l − 1, nl−1 is the number of neurons on layer l − 1, and Wi,j,l is the weight of the connection between neuron i on layer l − 1 and neuron j on layer l.
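The weighted-sum relation between layers can be sketched in a few lines. The text does not specify the activation function, so a logistic sigmoid applied to the input minus the threshold is assumed here purely for illustration; the function names are hypothetical.

```python
import math


def neuron_input(weights_j, prev_outputs):
    """Weighted sum of the outputs of layer l-1 arriving at neuron j of layer l."""
    return sum(w * o for w, o in zip(weights_j, prev_outputs))


def sigmoid(x):
    """Logistic activation (an assumption; the paper does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(weights, thresholds, prev_outputs):
    """Outputs of one layer.

    weights[j][i] : weight between neuron i on layer l-1 and neuron j on layer l
    thresholds[j] : threshold value associated with neuron j
    """
    return [sigmoid(neuron_input(w_j, prev_outputs) - theta_j)
            for w_j, theta_j in zip(weights, thresholds)]


# Hypothetical numbers: two neurons on layer l-1 feeding one neuron on layer l
outputs = layer_forward(weights=[[0.5, -1.0]], thresholds=[0.25],
                        prev_outputs=[1.0, 0.25])
```

In this example the weighted sum equals the threshold, so the assumed sigmoid returns its midpoint value.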
The ANN structure shown in Figure 1 provides a typical and useful example to illustrate the mechanism of the supervised learning process. In response to a pattern presented to the input layer, the ANN attempts to produce an associated pattern at its output layer. The hidden layers are employed to reject noise present in the input signals, so that the task of feature extraction can be performed effectively. The exemplar values presented to the network are linearly mapped to the range 0 to 1. The network outputs therefore take values between 0 and 1, which can be mapped back to the full range.
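The linear mapping to the 0 to 1 range and its inverse can be sketched as below. The function names are hypothetical, and the 2 to 3 kW power range is used only as an example of bounds.

```python
def scale_to_unit(values, lo, hi):
    """Linearly map values from the physical range [lo, hi] to [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]


def scale_back(scaled, lo, hi):
    """Map network outputs in [0, 1] back to the physical range [lo, hi]."""
    return [lo + s * (hi - lo) for s in scaled]


# Example: laser power in kW, with assumed bounds of 2 and 3 kW
power_kw = [2.0, 2.5, 3.0]
scaled = scale_to_unit(power_kw, lo=2.0, hi=3.0)
restored = scale_back(scaled, lo=2.0, hi=3.0)
```

The bounds must be fixed from the training data and reused unchanged for validation and prediction, otherwise the mapped-back outputs are not comparable.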
As far as the training of the multilayer feed-forward neural network is concerned, the most widely used algorithm is error back-propagation. ANN training by back-propagation involves three stages: the feed-forward of the input training pattern, the calculation and back-propagation of the associated error, and the adjustment of the weights. After training, application of the network involves only the computations of the feed-forward phase. The performance of the network is measured by the mean squared error: a lower MSE corresponds to better learning and prediction capability. In this study, the Levenberg-Marquardt algorithm is used as the training function for back-propagation. This method iteratively improves the weight values in order to minimize the MSE of the training data. The Levenberg-Marquardt algorithm can be seen as a combination of the gradient descent and Gauss-Newton minimization methods: it acts like a gradient descent method when the parameters are far from their optimal values and like the Gauss-Newton method when the parameters are close to their optimal values.
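For reference, the standard textbook form of the Levenberg-Marquardt weight update is recalled below; the symbols are introduced here for illustration and are not taken from the paper. With w the vector of weights, e the vector of prediction errors on the training data, J the Jacobian of e with respect to w, and μ a positive damping factor:

```latex
\Delta w = -\left(J^{\mathsf{T}} J + \mu I\right)^{-1} J^{\mathsf{T}} e ,
\qquad w \leftarrow w + \Delta w
% Large \mu :  \Delta w \approx -\tfrac{1}{\mu}\, J^{\mathsf{T}} e   (small gradient descent step)
% \mu \to 0 :  \Delta w \to -\left(J^{\mathsf{T}} J\right)^{-1} J^{\mathsf{T}} e   (Gauss-Newton step)
```

In typical implementations μ is decreased after a step that reduces the error and increased otherwise, which produces the switch between the two behaviours described above.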
2.2. Training and Validation Data
In any empirical modelling method, the quality of the resulting model depends mainly on the quality, abundance and richness of the data used in the modelling process. The best data are generally those obtained by experimentation and reflecting as closely as possible the real attributes of the physical phenomenon to be modelled. However, in many cases, experimentation can require prohibitive effort and excessive cost. The use of a mixture of data from 3D modelling, 3D simulation and experimentation can be considered an economical and reliable alternative. In the proposed approach, the data used are a mixture obtained from experimentation and 3D simulation. The 3D model of the welding process is based on heat transfer equations and metallurgical transformations, using temperature dependent material properties and the enthalpy method, and represents the conduction and keyhole modes using surface and volumetric heat sources, respectively. The transition between the heat sources is carried out according to the power density and interaction time. The simulations are carried out using a 3D finite element model implemented in commercial software. Experimental validation performed using low carbon galvanized steel in butt joint configurations with a 3 kW Nd:YAG laser source reveals that the 3D modelling approach can provide not only a consistent and accurate estimation of the weld characteristics under variable welding parameters and conditions but also a comprehensive and quantitative analysis of the effects of the process parameters. The factors and levels used to generate the data for training and validation are presented in Table 1. These factors and levels are chosen based on structured experimental investigations and exhaustive 3D modelling and simulation.
3. Application of the Proposed Strategy
To exploit the benefits offered by LW properly, it is necessary to develop a comprehensive strategy to control the process variables in order to produce the desired WSD without resorting to traditional and tedious trial and error procedures. A strategy for predicting the WSD is therefore indispensable. The success of building an effective prediction model depends on the careful choice of the appropriate modelling technique and of the variables to be included in the model.
Table 1. Factors and levels for training and validation data.
To illustrate the proposed modelling approach, laser power, welding speed, fibre diameter and gap are considered as variables and potential candidates to be included in the model to predict the depth of penetration and bead width of the weld. Before training the ANN models and executing the variable selection procedure, it is important to establish the size of the hidden layer and to optimize the training performance, especially as the number of variables varies from one model to another. The idea is to approximate the relationship between the size of the hidden layer, the number of input variables and the complexity of the output to be estimated. For all trained models, an average error of less than 1% is used, irrespective of the hidden layer size. Consequently, to avoid long training and overfitting that could affect the accuracy of the models, the [(i) × (2i + 1) × (o)] network structure is selected, where (i) and (o) are the numbers of inputs and outputs, respectively. In addition, the starting weights influence the optimal ANN configuration. Multiple random starting weights are therefore used to avoid getting stuck in a local minimum: ten networks with random sets of starting weights are trained, the best three are selected, and their average performance is used for further analysis in this study.
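The sizing rule and the best-three-of-ten averaging can be sketched as follows. The function names and the MSE values are hypothetical; in practice the list would hold the MSE of ten networks trained from different random starting weights.

```python
def network_structure(n_inputs, n_outputs):
    """Layer sizes following the (i) x (2i + 1) x (o) rule."""
    return (n_inputs, 2 * n_inputs + 1, n_outputs)


def average_of_best(mse_values, keep=3):
    """Average the lowest `keep` MSE values among the trained networks."""
    best = sorted(mse_values)[:keep]
    return sum(best) / len(best)


# Hypothetical MSE values of ten networks with random starting weights
mse_runs = [0.012, 0.009, 0.020, 0.011, 0.015, 0.010, 0.030, 0.013, 0.018, 0.014]

structure = network_structure(n_inputs=3, n_outputs=1)   # three inputs, one output
score = average_of_best(mse_runs)                        # mean of 0.009, 0.010, 0.011
```

Averaging the best three runs, rather than keeping only the single best, reduces the influence of one lucky initialization on the comparison between models.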
As illustrated in Table 2, a total of 16 networks with different input combinations are built following the full factorial design. The numbers (1) and (0) indicate whether or not the variables are used as inputs to the model, respectively. The data structure used to produce the designed models is shown in Table 3, and typical results representing the performance of the models as a function of the seven selection criteria are presented in Table 4.
Table 2. Proposed design of experiments.
Table 3. Typical training and testing data randomised sets for prediction model building.
Table 4. Typical modelling performances using MSE values.
Two statistical indices, derived from analysis of variance (ANOVA), are used to analyze the performance of the models: the percent (%) contributions and the average effects of the variables included in each model. The % contribution of a variable reflects the portion of the observed total variation attributed to this variable. Ideally, the % contributions of all considered variables add up to 100. The difference from 100 represents the contribution of other uncontrolled modelling variables and of experimental errors. The graph of average effects is a convenient way to visualize and estimate approximately the effect of each variable on the modelling performance. As the modelling procedure is built on a full factorial design, which is balanced, the estimate of the average effect of each variable is not influenced by the other variables. Both statistical indices are applied to all modelling performance criteria.
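A minimal sketch of how the two indices can be computed for a two-level full factorial design is given below. The function names and the response values are hypothetical, only main effects are considered, and the sum of squares of each variable uses the standard relation for balanced two-level designs, SS = n × effect² / 4.

```python
from itertools import product


def main_effects(design, response):
    """Average effect of each variable: mean response with it minus mean without it."""
    effects = []
    for col in range(len(design[0])):
        included = [y for row, y in zip(design, response) if row[col] == 1]
        excluded = [y for row, y in zip(design, response) if row[col] == 0]
        effects.append(sum(included) / len(included) - sum(excluded) / len(excluded))
    return effects


def percent_contributions(design, response):
    """Share of the total variation attributed to each variable (main effects only)."""
    n = len(response)
    mean_y = sum(response) / n
    ss_total = sum((y - mean_y) ** 2 for y in response)
    return [100.0 * (n * e ** 2 / 4.0) / ss_total
            for e in main_effects(design, response)]


# Hypothetical two-variable example: the response is an MSE-like value that
# drops by 4 when variable A is included and by 2 when variable B is included
design = list(product((0, 1), repeat=2))      # (A, B) = (0,0), (0,1), (1,0), (1,1)
response = [10.0, 8.0, 6.0, 4.0]

effects = main_effects(design, response)
contributions = percent_contributions(design, response)
```

A negative average effect on an MSE criterion means that including the variable lowers the error; whatever part of 100% is not covered by the listed contributions is left to interactions and error.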
The modelling design reveals that relatively accurate prediction models for DOP and BDW can be achieved using the selected ANN architecture and shows that all models fit the training and validation data relatively well, as quantified by the mean square error values. For the sake of comparison, all the MSE values were calculated using normalized data. The results indicate that the prediction errors are lower for DOP than for BDW for all models. Remarkable results are achieved using the model with gap, speed and power as inputs. Its performance is comparable to that of the model including the four variables. With only two inputs, the model with gap and speed produces good results. It can also be observed that, with a power between 2 and 3 kW and a fibre diameter between 0.34 mm and 0.52 mm, DOP and BDW can be estimated with an average error of less than 4%.
Using the modelling results, the average effects of each variable on the performance of the models are evaluated. Derived from ANOVA, Figure 2 shows the effects of the four variables on the training and validation MSE for DOP. These graphs demonstrate that three variables have positive effects on the designed models. The most influential factor is the speed, followed by the gap, then the power and finally the diameter. No interaction has been found for the DOP. The effect of diameter is negligible since it increases MSEv and at the same time decreases MSEt.
These results are confirmed by the average effect of each variable in terms of percent contribution to improving model accuracy. Table 5 reveals that the variable that most significantly reduces the MSE values is the welding speed, with a contribution of about 75%. Gap and power contribute about 14% and 9%, respectively. The F-values suggest that all the variables are significant. The results also show that the error contributions remain relatively low (under 1%). This implies that no important variable is omitted in the procedure. These results suggest that there are many options to consider in building an efficient prediction model for DOP. The contribution of power to decreasing MSEv is relatively low (about 2%). However, given the relationship between power and speed and the strong correlation between energy concentration and WSD, both power and speed clearly need to be considered in the proposed prediction model.
Figure 2. Effects of welding parameters on MSE variations in modelling DOP for training and validation data.
Figure 3 presents the effects of the four variables on the training and validation MSE values for BDW. These graphs show that the effects of gap and power are less significant for BDW than they were for DOP. The welding speed is the most influential factor for BDW. The gap and diameter have negative effects on MSEv, but these effects remain small and insignificant. These results are confirmed by the average effects in terms of percent contribution reported in Table 6. The welding speed is the dominant contributor to reducing the MSE values, with about 95%. Together, gap, power and diameter contribute less than 5%. Here again, the F-values reveal that all the variables are significant, and the error contribution remains relatively low (1%), indicating that no important variables are omitted in the modelling procedure. Regarding MSEtot, the contributions of power and diameter are negligible. Speed and gap can be considered as relevant variables for the BDW model.
Figure 4 represents the interactions found between power, diameter and gap for validation data when modelling BDW. The presence of power or diameter reduces the positive effect of the gap, and the presence of diameter reduces the positive effect of power. All the interactions of the diameter remain below 0.003. The interaction between gap and power is more significant.
Table 5. % contributions of modelling variables in the performance of the designed models for DOP.
Figure 3. Effects of welding parameters on MSE variations in modelling BDW for training and validation data.
Table 6. % contributions of modelling variables in the performance of the designed models for BDW.
Figure 4. Interaction plot of the MSEv for BDW between gap, power and diameter.
Assuming 5%, 2% and 1% as limit levels for the % contribution coefficients of the variables with respect to MSEt, MSEv and MSEtot suggests three configurations for the DOP and BDW prediction models, as presented in Table 7. These models are obtained by setting the variables at the levels that minimize the MSE values. Figure 5 and Figure 6 present the training and validation results for the selected models. Figure 5 shows that the validation and training data are relatively well distributed for the three DOP prediction models. MDOP1 presents the best results, as expected. MDOP2 shows nearly the same performance for training and validation data but is less accurate than MDOP1. The use of only three variables slightly affects the training performance. The prediction error of MDOP3 is higher than that of the two other models for both validation and training. MDOP1 is the best model, but MDOP2 is clearly a good compromise between the number of inputs and prediction performance.
Table 7. Variables selection for predictive modelling of DOP and BDW.
Figure 5. Comparison of predicted and measured depth of penetration for the 3 selected models.
Figure 6. Comparison of predicted and measured bead width for the 3 selected models.
The predicted and measured BDW for the selected models with different input combinations are presented in Figure 6. It can be seen that the validation and training data effectively cover a large part of the range of BDW variation. The best modelling result is achieved using MBDW1, but it is not as good as that of MDOP1. The three models show similar results for validation data. MBDW2 presents similar errors for training and validation. This model is certainly better suited to predicting BDW. MBDW3 gives an interesting approximation of BDW but, having a higher prediction error, it is less appropriate than the other models for predicting BDW accurately and effectively.
Table 8. Correlation between predicted and measured WSD attributes using various data sets.
These concluding observations are confirmed by the correlation analysis between the predicted and the measured WSD attributes. The correlation analysis results presented in Table 8 demonstrate the superiority of MDOP1 and MBDW1. Overall, these models show good agreement between measured and predicted DOP and BDW, with more than 99% and 95% in the training phase and more than 92% and 86% in the validation phase. When all the data are considered, the MDOP1 and MBDW1 models present correlation coefficients of more than 98% and 93%, respectively. Accordingly, the achieved results demonstrate that the ANN based prediction models perform very well and can effectively predict the weld shape and dimensions in LW of galvanized steel in butt joint configurations, with average errors of less than 2% and 7% for DOP and BDW, respectively. With average errors of 7% and 9%, MDOP2 and MBDW2 can be used as alternative prediction models. With average errors of more than 12%, MDOP3 and MBDW3 appear less appropriate as prediction models.
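The two kinds of figures quoted above can be computed with a short sketch. The paper does not state the exact definitions used, so the Pearson correlation coefficient and the mean absolute percentage error are assumed here; the function names and sample values are hypothetical.

```python
import math


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def mean_abs_percent_error(measured, predicted):
    """Average of |predicted - measured| / |measured|, expressed in percent."""
    ratios = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical depth of penetration values in mm, for illustration only
measured_dop = [1.0, 1.5, 2.0, 2.5]
predicted_dop = [1.05, 1.45, 2.1, 2.4]

r = pearson_r(measured_dop, predicted_dop)
error_pct = mean_abs_percent_error(measured_dop, predicted_dop)
```

The two measures are complementary: a high correlation only shows that predictions follow the trend of the measurements, while the percentage error also captures any systematic offset.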
4. Conclusions
An artificial neural network based model is developed to predict the weld shape and dimensions in laser welding of galvanized steel in butt joint configurations. The model building procedure is based on fused data provided by a structured experimental investigation and exhaustive 3D finite element method simulation. The possible relationships between welding parameters, such as laser power, welding speed, fibre diameter and gap, and the geometric characteristics of the welds, specifically depth of penetration and bead width, are analyzed, and their sensitivity to the welding conditions is evaluated using relevant statistical tools. Based on these results, various options for building the prediction model are established and evaluated using seven improved statistical criteria. The achieved results demonstrate that the resulting models perform very well and can effectively predict the weld shape and dimensions in laser welding with an average prediction error of less than 10%. These results demonstrate that the proposed ANN based prediction approach can lead to a consistent model able to predict weld bead geometry and shape accurately and reliably under variable welding parameters and conditions.