A Practical Solution to the Small Sample Size Bias and Uncertainty Problems of Model Selection Criteria in Two-Input Process Multiple Response Surface Methodology Datasets

ABSTRACT

Multiple response surface methodology (MRSM) most often involves the analysis of small sample size datasets which have associated inherent statistical modeling problems. Firstly, classical model selection criteria in use are very inefficient with small sample size datasets. Secondly, classical model selection criteria have an acknowledged selection uncertainty problem. Finally, there is a credibility problem associated with modeling small sample sizes of the order of most MRSM datasets. This work focuses on determination of a solution to these identified problems. The small sample model selection uncertainty problem is analysed using sixteen model selection criteria and a typical two-input MRSM dataset. Selection of candidate models, for the responses in consideration, is done based on response surface conformity to expectation to deliberately avoid selection of models using the problematic classical model selection criteria. A set of permutations of combinations of response models with conforming response surfaces is determined. Each combination is optimised and results are obtained using overlaying of data matrices. The permutation of results is then averaged to obtain credible results. Thus, a transparent multiple model approach is used to obtain the solution which gives some credibility to the small sample size results of the typical MRSM dataset. The conclusion is that, for a two-input process MRSM problem, conformity of response surfaces can be effectively used to select candidate models and thus the use of the problematic model selection criteria is avoidable.


KEYWORDS

Multiple Response Surface Methodology, All Possible Regressions, Model Selection Criteria, Data Matrices


1. Introduction and Literature Review

Multiple response surface methodology (MRSM) is the investigation of an industrial process with more than one response variable through the analysis of reliably generated response models and their corresponding response surfaces over some region of operability. Processes with two inputs are a special case of MRSM problems whose response surfaces can be constructed in three dimensions and can therefore be analysed for conformity.

Hill & Hunter [1] are credited with originally identifying the existence of MRSM in their review paper on response surface methodology. In their review of developments in response surface methodology research from 1966 to 1988, Myers et al. [2] conclude that most problems encountered in literature and practice are in fact MRSM problems as opposed to single response (univariate) problems. Khuri [3] single-handedly wrote a full review of MRSM. Mukhopadhyay & Khuri [2] and Khuri [4] devoted sections to MRSM in their latest response surface methodology reviews.

Myers et al. [5] emphasise that MRSM should always include canonical and response surface analysis before optimisation. Khuri [3] [4] clearly argues that traditional response surface techniques that apply to single response models, in general, are not adequate to analyse multiple response problems. Khuri [3] and Mukhopadhyay & Khuri [2] emphasize the use of multivariate statistical methods in every stage of MRSM processes so that the responses are considered simultaneously in every aspect, especially where correlation exists between the responses. Figure 1 shows the MRSM contextual framework as developed from literature.

The first of three problems with industrial MRSM work is the uncertainty inherent in the process of model selection (Wit et al. [7] , Steel [8] , Moral-Benito [9] ), which worsens when sample sizes are small. Hjort and Claeskens [10] state that the uncertainty associated with the model selection process can make inference based on the final model seriously misleading. Danilov and Magnus [11] add that standard errors estimated under such circumstances are known to underreport variability. This problem can be solved by avoiding the use of model selection criteria in the selection of best models from candidate models.

Figure 1. Multiple response surface methodology contextual framework.

The second problem concerns the small sample size bias of classical model selection criteria. MRSM is largely a regression modeling and model selection problem, and model selection is done through model selection criteria. Seghouane and Bekara [12] state that the derivations of most model selection criteria, such as Mallows's Cp [13] , Akaike's AIC [14] and Schwarz's SBC [15] , rely on asymptotic approximations which are not valid for small sample sizes. Most MRSM experimental datasets fall within the small sample size context, so the use of such criteria for these problems results in increased bias.

Attempts have been made to correct some of the classical model selection criteria for the small sample size context. Sugiura [16] corrected Akaike's information criterion (AIC) for small samples by assuming a finite true model and avoiding the asymptotic argument in his derivation of the corrected AIC (AICc). Hurvich and Tsai [17] , after some simulation tests, confirmed that AICc achieves dramatic reductions in bias in small sample sizes compared to AIC and even Schwarz's SBC. They recommend its use in place of AIC for small sample size problems. McQuarrie and Tsai [18] corrected Hannan and Quinn's [19] consistent criterion for small sample sizes to come up with the corrected HQ (HQc). Sawa [20] also made a small sample correction to Schwarz's SBC to come out with BIC. Seghouane and Bekara [12] made a small sample correction to their Kullback-Leibler symmetric divergence information criterion (KIC) to come up with the corrected KIC (KICc). The literature is largely silent on the application of such research findings in MRSM. Both Myers et al. [5] and Khuri [4] complain about the lack of uptake of research findings by MRSM practitioners. The extent to which the small sample size bias problem has been solved by the small sample size correction efforts can be analysed using an MRSM dataset. In 1999 Pan [21] proposed using bootstrapping for model selection with small samples. There are also proposals to look into model averaging (Yuan and Yang [22] , Xie [23] ).
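As a rough illustration of what a small sample size correction does, the sketch below computes AIC and AICc for a least-squares model. It assumes the common Gaussian form AIC = n ln(RSS/n) + 2k, with AICc = AIC + 2k(k + 1)/(n − k − 1). The function names and the numbers are illustrative assumptions; the exact formulae used in this study are those given in ANNEXURE B.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC for a Gaussian least-squares fit: n*ln(RSS/n) + 2k.

    k counts the estimated parameters (the counting convention is an
    assumption of this sketch).
    """
    return n * math.log(rss / n) + 2 * k

def aicc_least_squares(rss, n, k):
    """Small-sample corrected AIC: AIC + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return aic_least_squares(rss, n, k) + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical numbers: the same residual sum of squares at two sample sizes.
for n in (13, 130):
    extra_penalty = aicc_least_squares(10.0, n, 6) - aic_least_squares(10.0, n, 6)
    print(n, round(extra_penalty, 3))
```

With k = 6, the additional penalty is 14 when n = 13 but only about 0.68 when n = 130, which is why the correction matters mainly at sample sizes typical of MRSM experiments.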

In accordance with the information-theoretic approach, a “best model” for analysis of data depends on sample size as smaller effects can often only be revealed as sample sizes increase. The amount of information in large datasets greatly exceeds the information availed by small datasets.

The small sample size bias problem and the model selection uncertainty problem are related in that both arise from the use of model selection criteria in choosing the best model. If the use of model selection criteria is avoided in favour of an approach with more certainty, both problems are solved.

The third problem with industrial MRSM work is that most studies fall within small data analytics since they are based on analysing response models generated from datasets emanating from running designed experiments. Such experiments are designed to be cost effective and at the same time are expected to provide optimum information. It is true that most MRSM studies in industrial work use experimental designs which are within the small sample size context, sometimes below the order of (10 + k), to minimize cost. For example, most of the MRSM industrial examples used in Myers et al. [5] and [24] do not have experimental runs (n) > k * 40. This critical problem has not been dealt with in MRSM research studies so far. Khuri [4] does not even mention this problem in his proposals of areas of future research in MRSM. This problem is dealt with by using a solution methodology that has more rigour and transparency.

The three problems encountered by MRSM practitioners related to the small sample MRSM dataset sizes are shown in Figure 2.

This paper focuses on the problems encountered with the use of model selection criteria in selecting best models with particular interest in MRSM. In other words, the focus is on obtaining the best result instead of the best models to use in optimization. In this way,

1) Sets of candidate models are selected based on conformity of response surfaces for each response,

2) Permutations of response model combinations are made,

3) Simultaneous “optimisation” is performed of response model combinations, and

4) The best result is obtained by averaging the resulting permutation of results.

The use of conformity of response surfaces to select candidate models for optimisation instead of model selection criteria is investigated in this paper. The use of conformity of response surfaces enables avoidance of model selection criteria and thus indirectly solves the model selection criteria uncertainty problem. In the same manner, the avoidance of the use of model selection criteria in model selection indirectly solves the worry of the small sample size bias problem. The selection of response candidate models based on their conformity of response surfaces enables the determination of candidate model combinations for optimisation. The permutation of response model combinations converts to the permutation of results which can be averaged to obtain the best result. This multi-model approach maintains the rigour required in model selection to ensure convincing and credible solutions for the small sample size problems in MRSM. The proposed MRSM theoretical framework of the current study is shown in Figure 3.

Figure 2. Showing the small sample size MRSM problems.

Figure 3. Multiple response surface methodology theoretical framework.

The remainder of this paper is organized as follows. Section 2 introduces the typical small sample MRSM dataset, which will be subjected to analysis in this study. Section 3 applies all possible regressions modeling to the dataset. Section 4 analyses the resulting models with sixteen model selection criteria taken from literature, exposing the selection uncertainty problem in MRSM small sample sizes. Section 5 examines the effect of small sample size correction of model selection criteria. In Sections 6 and 7, the MRSM small sample size problem is resolved by a multi-model approach and the results are analysed. Section 8 looks at the validation of the results in line with the original problem of obtaining cure times of rubber covered mining conveyor belts for a Southern African manufacturer. Section 9 discusses findings and Section 10 concludes and proposes direction of future research.

2. The Dataset

The MRSM experimental design and results used for the current study are shown in Table 1. The experimental design is a two-factor central composite design [25] [26] . The experimental runs were done to determine the best cure times for different industrial and mining conveyor belt thicknesses for a Southern African rubber covered mining conveyor belts manufacturing company. The two input parameters into the belt curing process are cure time (T) and rubber thickness (RT). The measured quality responses are adhesion of belt components (in N/mm) and hardness of cured rubber cover (in Shore A).

This dataset is chosen for use in this investigation because, in addition to being a typical small sample MRSM dataset, two factor experiments produce response models that have response surfaces that can be analysed. Where factors are more than two, it is difficult to construct response surfaces of models in three dimensions. So two factor processes are a special case of MRSM where it is possible to check response model fitness to data, prediction performance and analyse response surface model conformity to expectations before selection of “best” model for consideration in multi-objective optimisation. More complex problems require more complex ways to solve them.

3. All Possible Regressions Modeling

All possible regressions modeling is applied to the dataset in Table 1 to obtain thirty-one (2^{p} − 1, where p = 5 is the total number of regressors) models for each of the two responses, that is, adhesion and hardness. ANNEXURE A shows the thirty-one models of each response in detail.
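The count of thirty-one can be checked by enumerating every non-empty subset of the five regressors of a full second-order model in T and RT. The sketch below only lists the candidate term sets; it does not fit any model, and the names used are illustrative.

```python
from itertools import combinations

# The five candidate regressors of a full second-order model in T and RT.
REGRESSORS = ["T", "RT", "T*RT", "T^2", "RT^2"]

def all_possible_models(regressors):
    """Return every non-empty subset of regressors, smallest models first."""
    models = []
    for size in range(1, len(regressors) + 1):
        models.extend(combinations(regressors, size))
    return models

models = all_possible_models(REGRESSORS)
print(len(models))  # 2**5 - 1 = 31
```

Each tuple in `models` corresponds to one candidate response model, for example `("T", "RT", "T*RT")`.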

4. The Model Selection Uncertainty

This section analyses the problem of uncertainty that characterises the model selection criteria of response models. The multi-selection criteria approach is able to bring the between-criteria uncertainty problem into the limelight.

Table 1. Experimental design and averaged results from the MRSM experiment.

The thirty-one all possible regressions response models generated from the results of the MRSM experiment are analysed using sixteen different model selection criteria. The sixteen criteria are used to select the best model for adhesion and for hardness from among the thirty-one models. Ten of the sixteen model selection criteria are corrected for small sample size inefficiency; three are not corrected and choose the best model fitting the MRSM dataset, while three choose the best model for prediction. The formulae and relevant details of the selection criteria are given in ANNEXURE B.

Model Selection Uncertainty 1: Model selection criteria do not necessarily agree on the best model

• Adhesion

Table 2 shows the models chosen as best by each criterion. The model selected as best by each criterion is marked by X.

There are two adhesion response models with the most selections in Table 2: one with six, [T*RT, RT^{2}], and the other with five, [T, T*RT, T^{2}, RT^{2}]. Another three response models have lower numbers of selections. In total the sixteen model selection criteria chose five different response models as “best”. The mere dispersion of the “best” selections all over Table 2 shows that there is disagreement on what is best. This is further buttressed by the fact that five models are selected as “best”. Model selection using a single criterion would not have highlighted the selection uncertainty problem in this way as it is clearly evident in a multi-criteria selection scenario.

Table 2. Showing selection criteria and the selected adhesion “best” models.

• Hardness

The model selection uncertainty problem is again evident with hardness as shown in Table 3. One model, [T, RT, T*RT, T^{2}, RT^{2}], is selected by eleven different criteria whilst another two are selected for the remaining five times. Even the ten small sample size corrected criteria do not agree with five choosing the full model, three choosing [T, T^{2}] and two [T, T^{2}, RT^{2}]. This uncertainty is characteristic of the model selection process which selects one model as best.

Model Selection Uncertainty 2: Good fitness to data and/or good prediction performance does not imply conforming response surface

• Analysis of Response Surfaces

Response surfaces for each of the thirty one models are obtained and analysed as to whether they conform to expectations. Data matrices are constructed to confirm.

Only four adhesion response models have conforming response surfaces: [T, RT, T*RT]; [T, RT, T*RT, T^{2}]; [T, RT, T*RT, RT^{2}]; and [T, RT, T*RT, T^{2}, RT^{2}]. Only one hardness model has a conforming response surface, namely [T, RT, T*RT, T^{2}]. The response surfaces are shown in ANNEXURE C. Table 4 summarises the results of the analysis of response surfaces.

From Table 4, the adhesion models with the conforming response surfaces are not necessarily the ones selected for best fit to data or for best prediction. There is no single model that has best fit, best prediction and a conforming response surface at the same time. Best fit and/or best prediction does not ensure a conforming response surface. This is the second uncertainty problem.

Table 3. Showing selection criteria and the selected hardness best model.

Table 4. Summarising model selection results split between fit to data, prediction and conforming Response Surface (RS).

Model Selection Uncertainty 3: Good fitness to data does not necessarily imply good prediction performance

The other uncertainty problem is that a model with good fitness to data is not necessarily good at prediction performance. This is evident from Table 4. This is the third model selection uncertainty problem.

Model Selection Uncertainty 4: The uncertainty of positioning/ranking by the individual criteria

Figure 4 shows how each of the four adhesion models with conforming response surfaces is ranked by each selection criterion. The only hardness model with a conforming response surface that is selected as best fitting is the full model which is selected by R^{2} only.

In addition to the uncertainty of selection as “best” model, as previously noted, there is also the uncertainty of positioning/ranking by the individual criteria. One cannot predict the positioning/ranking of models by the individual criteria.

The best selection position of the only hardness model with the conforming response surface is number three, as seen in Figure 5. Seven model selection criteria put this model in position three in their rankings. However, the same response model is ranked twenty-five by four other criteria. There is, therefore, no known formula to predict the ranking of the models with conforming response surfaces, as different selection criteria rank them differently. The only way that models with conforming response surfaces can be recognised is by inspecting response surfaces. This is the fourth model selection uncertainty problem.

Figure 4. Showing how the 4 adhesion models with conforming response surfaces are ranked by different criteria.

Figure 5. Showing how the hardness model with conforming response surface is positioned by each criterion.

5. Effect of Correction of Model Selection Criteria

Twenty eight model selection criteria are used to select the best model and the frequency of selection per number of regressor variables (p) is determined. This is done for each of the response variables of adhesion and hardness.

Adhesion

Table 5 is for the adhesion response. Row 2 of Table 5, titled “ALL 28”, summarises the findings. Row 3, titled “Corrected”, has the frequency details of the ten small sample size corrected criteria.

A visual picture of Table 5 is shown in Figure 6.

The mere fact that there is a selection for each p-value when all the twenty eight model selection criteria are used indicates model selection uncertainty.

Table 6 shows that when only the ten small sample size corrected criteria are considered, two facts emerge: 1) there are zero selections for the three-regressor position and the one-regressor position, and 2) there are seven selections to the left of three regressors (median term) and only three to the right.

Figure 6. Showing the model selection results per number of regressors (p) for the adhesion response.

Table 5. Summarising model selection results per number of regressors (p) for the adhesion response.

Table 6. Showing model selection results for pooled regressors before and after the median for the adhesion response.

Hardness

Table 7 is for the hardness response. Row 2 of Table 7, titled “ALL 28”, summarises the findings. Row 3, titled “Corrected”, has the frequency details of the ten small sample size corrected criteria.

Table 7 gives the following line graph shown in Figure 7.

Uncertainty in selection is obvious in both the adhesion and hardness cases where twenty eight model selection criteria are used and when only small sample size corrected criteria are used.

Table 8 again queries the principle of parsimony as the small sample size corrected criteria select no model with three regressors (median term) but select four models to the left hand and six models to the right hand of the median term. It appears the major achievement of small sample size correction is the zeroing of the middle term.

This raises a question about how well parsimony is achieved in balancing bias (lack of fit) against penalty, especially for the small sample size corrected criteria. In trying to correct small sample size inefficiency, there could be over-correction, which results in criteria selecting models with fewer regressors, that is, underfitting.

Figure 7. Showing the model selection results per number of regressors (p) for the hardness response.

Table 7. Summarising model selection results per number of regressors (p) for the hardness response.

Table 8. Showing model selection results for pooled regressors before and after the median for the hardness response.

6. Optimisation

MRSM is based on the use of response surfaces of a selected combination of response models to simultaneously determine the region with the desired results. There being no certain method of using a classical model selection criterion to predict models with conforming response surfaces implies the simplest way to deal with the problems of model selection uncertainty and small sample size bias is to avoid choosing candidate “best” models with model selection criteria. In two-input MRSM processes this is possible as response surfaces can be used to select candidate models. Therefore, in this section the four possible results from the permutations of the four adhesion response surfaces and one hardness response surface are analysed. The four results are obtained by simultaneously solving the tabled four pairs of models. The mean square errors (MSE) of the conforming models are also determined.

6.1. The Permutations of Models with Conforming Response Surfaces

The permutations of adhesion-hardness response model pairs with conforming response surfaces are summarised in Table 9. The four pairs form the set of candidate pairs for optimisation or determination of desired results.
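The four candidate pairs follow directly from crossing the four conforming adhesion models with the single conforming hardness model. A minimal sketch of that pairing, using the term sets listed in Section 4 and illustrative variable names:

```python
from itertools import product

# Models with conforming response surfaces, written as tuples of regressors.
adhesion_models = [
    ("T", "RT", "T*RT"),
    ("T", "RT", "T*RT", "T^2"),
    ("T", "RT", "T*RT", "RT^2"),
    ("T", "RT", "T*RT", "T^2", "RT^2"),
]
hardness_models = [("T", "RT", "T*RT", "T^2")]

# Every adhesion model is paired with every hardness model: 4 x 1 = 4 pairs.
pairs = list(product(adhesion_models, hardness_models))
for number, (adhesion, hardness) in enumerate(pairs, start=1):
    print(f"Pair {number}: {list(adhesion)} vs. {list(hardness)}")
```

If more than one hardness model had a conforming response surface, the same call would produce the larger set of pairs without further changes.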

6.2. The Determination of Desired Results

The desired results are obtained by constructing data matrices for each response in a candidate pair using Microsoft Excel and overlaying them to determine the region where the two response models simultaneously achieve the customer expected results of adhesion ≥ 12 N/mm and hardness ≥ 60 Shore A. The detailed process of determination of desired results for each combination is shown in ANNEXURE D. Pair 2 is used here as an example to demonstrate the process.
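The overlay step can be sketched in a few lines of code. The study itself builds the data matrices in Microsoft Excel; the Python below is only an illustration of the same idea. The coefficients, grid ranges and function names are hypothetical and are NOT the fitted models of this study; only the two customer limits come from the text.

```python
def predict(coefs, t, rt):
    """Evaluate a second-order response model at cure time t and thickness rt."""
    terms = {"1": 1.0, "T": t, "RT": rt, "T*RT": t * rt,
             "T^2": t * t, "RT^2": rt * rt}
    return sum(value * terms[name] for name, value in coefs.items())

def data_matrix(coefs, times, thicknesses):
    """Rows are rubber thicknesses, columns are cure times."""
    return [[predict(coefs, t, rt) for t in times] for rt in thicknesses]

def overlay(adhesion, hardness, min_adhesion=12.0, min_hardness=60.0):
    """True in each cell where both responses meet the customer limits."""
    return [[a >= min_adhesion and h >= min_hardness
             for a, h in zip(row_a, row_h)]
            for row_a, row_h in zip(adhesion, hardness)]

# Hypothetical coefficients for illustration only ("1" is the intercept).
ADHESION = {"1": 2.0, "T": 0.6, "RT": -0.2, "T*RT": 0.005, "T^2": -0.004}
HARDNESS = {"1": 40.0, "T": 1.2, "RT": -0.3, "T*RT": 0.004, "T^2": -0.01}

times = list(range(10, 41, 2))        # cure time, minutes (assumed grid)
thicknesses = list(range(10, 31, 5))  # rubber thickness, mm (assumed grid)

region = overlay(data_matrix(ADHESION, times, thicknesses),
                 data_matrix(HARDNESS, times, thicknesses))
for rt, row in zip(thicknesses, region):
    feasible = [t for t, ok in zip(times, row) if ok]
    # Shortest cure time meeting both limits, or None if no cell qualifies.
    print(rt, feasible[0] if feasible else None)
```

Reporting the shortest feasible cure time per thickness is one possible reading of balancing customer expectations against productivity; the choice actually made for work instructions is the one described in the text.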

Pair 2: [T, RT, T*RT, T^{2}] vs. [T, RT, T*RT, T^{2}]

The response surfaces and corresponding data matrices of Pair 2 are presented as an example. The data matrices are overlaid to obtain the desired region from which results of cure time per rubber thickness are obtained. The boxed figure 9 in the data matrix in Table 10 is obtained by setting a time of 22 minutes for a belt with rubber thickness 20 mm.

Table 9. Showing the four pairs of four adhesion models and the one hardness model with conforming response surfaces.

Table 10. Showing the data matrix of the adhesion model [T, RT, T*RT, T^{2}].

• Adhesion [T, RT, T*RT, T^{2}] Data Matrix

The data matrix for the adhesion model [T, RT, T*RT, T^{2}] if overlaid with a hardness model data matrix from a conforming response surface gives the desired region with optimum cure time per rubber thickness.

• Hardness [T, RT, T*RT, T^{2}] Data Matrix

The hardness data matrix of Table 11 for model [T, RT, T*RT, T^{2}] is overlaid with the data matrix for the adhesion model [T, RT, T*RT, T^{2}] to give the desired region with cure time per rubber thickness meeting customer expectations. Table 12 shows the region in red in which both the adhesion and hardness are within the customer specified region of adhesion ≥ 12 N/mm and hardness ≥ 60 Shore A.

Cure times per belt rubber thickness are selected to ensure customer expectations are met and right levels of productivity maintained. The region in red has both adhesion and hardness results in levels acceptable to the customer expectations. The boxed figures in the desired region indicate the cure time per belt rubber thickness combinations considered for work instructions.

6.3. MSE’s of Conforming Response Models

Mean squared errors (MSEs) for the models with conforming response surfaces are determined using Equation (9) to compare the accuracy of response models. The formula for MSE is given below for a sample size n.

$\text{MSE}=\sqrt{\frac{\sum_{i=1}^{n}\left(Y_{i}-\hat{Y}_{i}\right)^{2}}{n}}$ (9)

where $Y_{i}$ is the i^{th} measured response and $\hat{Y}_{i}$ is the i^{th} predicted response. Because of the square root, the quantity in Equation (9) is expressed in the units of the response.
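Equation (9) can be computed directly from paired lists of measured and predicted values. The sketch below follows the equation as printed, including the square root; the function name and the data are hypothetical.

```python
import math

def mse_eq9(measured, predicted):
    """Equation (9) as printed: square root of the mean squared residual."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    return math.sqrt(total / len(measured))

# Hypothetical adhesion values in N/mm, for illustration only.
measured = [12.0, 13.5, 11.0, 14.0]
predicted = [11.5, 13.0, 11.5, 14.5]
print(mse_eq9(measured, predicted))
```

Every residual in this example has magnitude 0.5 N/mm, so the result is 0.5 N/mm. The same function applies to forecasted values from validation runs when they are passed as the second argument.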

Table 11. Showing the data matrix of the hardness model [T, RT, T*RT, T^{2}].

Table 12. Showing the region of overlay with the wanted results.

7. Results

This section shows the results obtained by optimizing each of the four pairs with the methodology outlined in Section 6. The computed MSE results of the individual models with conforming response surfaces are also shown.

7.1. Tables of Rubber Thickness vs. Cure Time

Using overlaying of data matrices, tables of rubber thickness-cure time combinations are determined for the four pairs of models. The tables are shown as Tables 13-16.

If all the result tables are averaged, a result equivalent to Table 14 and Table 15 is obtained. This result is both the median and mode of the tables and is therefore the best to adopt for use.
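The averaging step can be illustrated with a small sketch that summarises, for each rubber thickness, the cure times proposed by the four pairs. The numbers below are placeholders chosen so that two pairs agree; they are NOT the values in Tables 13-16.

```python
from statistics import mean, median, mode

# Hypothetical cure times (minutes) per rubber thickness (mm) for four pairs.
results = {
    "Pair 1": {10: 18, 15: 20, 20: 21},
    "Pair 2": {10: 19, 15: 21, 20: 22},
    "Pair 3": {10: 19, 15: 21, 20: 22},
    "Pair 4": {10: 20, 15: 22, 20: 23},
}

def summarise(results):
    """Per thickness, return (mean, median, mode) cure time over all pairs."""
    thicknesses = sorted(next(iter(results.values())))
    summary = {}
    for rt in thicknesses:
        times = [table[rt] for table in results.values()]
        summary[rt] = (mean(times), median(times), mode(times))
    return summary

for rt, stats in summarise(results).items():
    print(rt, stats)
```

In this constructed example the mean, median and mode coincide at every thickness and match Pairs 2 and 3, which mirrors the pattern reported for Table 14 and Table 15.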

7.2. MSE’s of Models with Conforming Response Surfaces

Response model accuracy is checked by the size of the MSE. Table 17 shows the computed MSE values of models with conforming response surfaces.

The MSE values for adhesion models show that the full model, [T, RT, T*RT, T^{2}, RT^{2}], has the best fit-to-data accuracy compared to the other three models with conforming response surfaces.

Table 13. Pair 1 [T, RT, T*RT] vs. [T, RT, T*RT, T^{2}].

Table 14. Pair 2 [T, RT, T*RT, T^{2}] vs. [T, RT, T*RT, T^{2}].

Table 15. Pair 3 [T, RT, T*RT, RT^{2}] vs. [T, RT, T*RT, T^{2}].

Table 16. Pair 4 [T, RT, T*RT, T^{2}, RT^{2}] vs. [T, RT, T*RT, T^{2}].

Table 17. Showing MSE results for adhesion and hardness.

8. Validation

The results from the first seven conveyor belts to be run from the process, shown in Table 18, are compared with model forecasted results by a paired t-test, and the mean squared forecast errors (MSFE) of the response models are computed.

Validation by T-Tests

The first validation test shows no statistically significant difference between the forecasted results and the results obtained from belts run from the normal belt production process. The results from the first seven conveyor belts to be run from the process are compared with model forecasted results using a paired t-test as shown in Figure 8.
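For readers who want to reproduce this kind of check, the sketch below computes a paired t statistic with the Python standard library and compares it with the two-sided 5% critical value for six degrees of freedom (seven pairs). The data are hypothetical and are not those of Table 18; the 5% significance level is an assumption of this sketch.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(observed, forecast):
    """Return the paired t statistic and its degrees of freedom."""
    if len(observed) != len(forecast) or len(observed) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    differences = [o - f for o, f in zip(observed, forecast)]
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error, len(differences) - 1

# Hypothetical adhesion results (N/mm) for seven belts.
observed = [12.4, 13.1, 12.8, 13.6, 12.2, 13.0, 12.9]
forecast = [12.6, 12.9, 13.0, 13.3, 12.5, 12.8, 13.1]

t_value, df = paired_t_statistic(observed, forecast)
T_CRITICAL = 2.447  # two-sided 5% critical value of Student's t for df = 6
print(round(t_value, 3), df, abs(t_value) < T_CRITICAL)
```

When the absolute t statistic is below the critical value, the test fails to reject the hypothesis of zero mean difference, which is the sense in which forecasted and observed results are said not to differ significantly.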

Validation by MSFE

In this section, the results are validated by checking how effectively the models reproduce the same, or better, rubber thickness-cure time combinations as those obtained in the first MRSM experiment. The MSFE is computed as

$\text{MSFE}=\sqrt{\frac{\sum_{i=1}^{n}\left(Y_{i}-Y_{Fi}\right)^{2}}{n}}$ (10)

where
$Y_{i}$ is the i^{th} measured response,
$Y_{Fi}$ is the i^{th} forecasted response.

The MSFE values for adhesion models shown in Table 19 reveal that the full model, [T, RT, T*RT, T^{2}, RT^{2}], has the best prediction accuracy compared to the other three models with conforming response surfaces. The best combination of adhesion versus hardness models for the best accuracy results is therefore Pair 4: [T, RT, T*RT, T^{2}, RT^{2}] vs. [T, RT, T*RT, T^{2}].

Figure 8. Showing the paired t-test results.

Table 18. Validation results.

Table 19. Showing MSFE results for adhesion and hardness.

Averaging all the results tables gives a table similar to Table 14 and Table 15. The table represented by Table 14 and Table 15 is therefore both the mean and mode of the permutation of results. This shows that it is a robust result. This result would be the best to use in Work Instructions in this case.

9. Discussion

This section discusses the small sample size MRSM datasets problems of using classical model selection criteria to select “best” models for use in determining desired solutions and how they are solved.

The Small Sample Size Bias Problem

The small sample size model selection criteria bias problem is solved by avoiding the use of model selection criteria for selecting “best” models in this study. In fact where response surfaces can be used to select candidate models, classical model selection criteria become irrelevant with all their problems.

The Model Selection Uncertainty Problem

The model selection uncertainty problem is related to the use of classical model selection criteria for selecting best models. Even the use of small sample size corrected criteria is shown to have the problem of model selection uncertainty. It is difficult to predict a model with a conforming response surface with classical best model selection criteria whether small sample size corrected or not. MRSM depends a lot on the use of conforming response surfaces in simultaneously determining the desired results. Research on design of experiments for MRSM should focus more on determining conforming response surfaces than just models with good fit to available data or prediction. The conformance of the response surface within the region of interest should be the focus of MRSM experimental designing research.

Using Response Surfaces to Select Response Models

According to Moral-Benito [9] , standard practice in empirical research is based on two steps: 1) researchers select a model from the space of all possible models and 2) proceed as if the selected model had generated the data. Uncertainty is, therefore, typically ignored in the model selection step.

The use of response surfaces to select conforming models is very possible with two-factor multiple regression models although it is not that easy with processes of more than two factors. However, whilst still dealing with two factor processes, selecting models based on conformity of response surfaces avoids use of classical model selection criteria in choosing the best models and hence the problems of model selection criteria uncertainty and small sample size datasets bias. In MRSM that is very important. Using response models with conforming response surfaces within the region of interest is the fundamental focus of MRSM. The selection of candidate models with conforming response surfaces within the region of interest also opens the door for multiple model solution methodology which introduces rigour, transparency and therefore credibility into the MRSM results. The focus of MRSM, therefore, shifts from obtaining the “best” model to obtaining the best results within the region of interest.

10. Conclusion and Future Research

Selecting candidate response models based on conformity of response surfaces avoids the uncertainty and small sample size bias problems associated with using classical model selection criteria to select best models. In two-input process problems, where response surfaces can easily be constructed and analysed, it is therefore better for practitioners to use response surfaces to select the candidate models from which the permutations of response model sets are determined for simultaneous optimisation.

The multiple model MRSM approach ensures credibility because rigour is maintained up to the final result. Model selection uncertainty is kept from affecting the final result in a very transparent way.

However, for best results, the proposed multiple model MRSM approach, which is based on candidate models with conforming response surfaces, requires prior knowledge of the expected ideal response surface.

Future Research

1) More datasets from two-factor processes need to be studied to ensure generalizability of findings to all other two-factor processes.

2) A usability study of the multivariate approach (Fujikoshi and Satoh [6]) will be done to investigate applicability, simplification, effectiveness and accuracy.

3) Model averaging (Yuan and Yang [27], Xie [22]) will be examined to investigate applicability, accuracy, effectiveness and simplification.

4) The applicability of envelope models and methods to MRSM will be investigated.

5) Research on generalising beyond the two-factor process to three and higher factor processes is necessary.

Annexure A

In this section, the MRSM dataset is modeled to produce adhesion and hardness response models studied in this univariate investigation. Response modeling is done using Minitab 17. The model selection uncertainty of this MRSM small sample dataset is analysed. The general multivariate multiple regression model is

$y_{ui} = f_i(X_u, \beta) + \epsilon_{ui}, \quad u = 1, 2, \cdots, n \text{ and } i = 1, 2, \cdots, r$ (1)

where $X_u$ is the vector of settings of the k design variables at the $u^{th}$ experimental run, $\beta$ is a vector of unknown parameters, $f_i$ is a function of known form for the $i^{th}$ response, and $\epsilon_{ui}$ is a random error associated with the $i^{th}$ response for experimental run u. It is assumed that the $\epsilon_{ui}$'s are normally distributed such that $E(\epsilon_{ui}) = 0$, $E(\epsilon_{ui} \times \epsilon_{vj}) = 0$ for all $i \ne j$, $u \ne v$, and $Var(\epsilon_{ui}) = \sigma_{ui}$.

There are two important mathematical models that are commonly used in multi-response surface methodology which are special cases of model (1).

The first-degree model (d = 1),

$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \epsilon$ (2)

and the second-degree model (d = 2),

$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \epsilon$ (3)

In this study, y is either adhesion or hardness, and ${x}_{i}$ is cure time (T) or total rubber thickness (RT). The parameters (β’s) are estimated using statistical software.

The all possible regressions methodology is employed to produce thirty-one response models for each of the two responses, adhesion and hardness, from the MRSM dataset.
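The count of thirty-one follows from taking every non-empty subset of the five regressors of the belt curing model (T, RT, T*RT, T^{2}, RT^{2}), that is, 2^5 − 1 = 31. A minimal Python sketch of that enumeration is given here; the function name `all_possible_models` is chosen for illustration and is not part of Minitab 17 or of the study.

```python
from itertools import combinations

# The five regressors of the full second-degree belt curing model.
REGRESSORS = ["T", "RT", "T*RT", "T^2", "RT^2"]


def all_possible_models(regressors):
    """Return every non-empty subset of the regressors, smallest models first."""
    models = []
    for size in range(1, len(regressors) + 1):
        models.extend(combinations(regressors, size))
    return models


models = all_possible_models(REGRESSORS)
print(len(models))  # 31
print(models[:2])   # [('T',), ('RT',)]
```

Each tuple names the regressors of one candidate model; the intercept is assumed to be present in all of them.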

The general second degree model of Equation (3) is expanded into the following belt curing model:

$\begin{array}{l}\text{Adhesion}/\text{Hardness}\\ ={\beta}_{0}+{\beta}_{1}\ast \text{T}+{\beta}_{2}\ast \text{RT}+{\beta}_{12}\ast \text{T}\ast \text{RT}+{\beta}_{11}\ast {\text{T}}^{\text{2}}+{\beta}_{22}\ast {\text{RT}}^{\text{2}}+\text{error}\end{array}$ (4)

where T is cure time in minutes, RT is rubber thickness in millimeters, ${\beta}_{0}$ is the intercept, and ${\beta}_{1}$, ${\beta}_{2}$, ${\beta}_{12}$, ${\beta}_{11}$ and ${\beta}_{22}$ are the parameters to be estimated.
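As an illustration of how the parameters of Equation (4) can be estimated, the following Python sketch fits the full second-degree model by ordinary least squares with NumPy. It is not the Minitab 17 procedure used in the study. The data are synthetic and noise-free, and the factor levels and coefficient values are made up purely to show the mechanics.

```python
import numpy as np


def design_matrix(T, RT):
    """Columns 1, T, RT, T*RT, T^2, RT^2: the terms of Equation (4)."""
    T = np.asarray(T, dtype=float)
    RT = np.asarray(RT, dtype=float)
    return np.column_stack([np.ones_like(T), T, RT, T * RT, T**2, RT**2])


def fit_quadratic(T, RT, y):
    """Least squares estimates ordered as (b0, b1, b2, b12, b11, b22)."""
    X = design_matrix(T, RT)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta


# Synthetic 3 x 3 layout; levels and coefficients are assumptions, not study data.
T_levels = [20.0, 30.0, 40.0]   # cure time, minutes
RT_levels = [5.0, 10.0, 15.0]   # rubber thickness, mm
T, RT = np.meshgrid(T_levels, RT_levels)
T, RT = T.ravel(), RT.ravel()

true_beta = np.array([2.0, 0.4, -0.3, 0.01, -0.002, 0.005])
y = design_matrix(T, RT) @ true_beta   # noise-free response

beta_hat = fit_quadratic(T, RT, y)
print(np.round(beta_hat, 4))
```

Because the synthetic response contains no error term, the estimates reproduce the made-up coefficients; with real measurements the same call returns the least squares fit. Any of the reduced models is fitted by dropping the corresponding columns of the design matrix.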

1) Adhesion Response Models

Table A1 shows the summary information of the thirty-one all possible regressions models generated by Minitab 17 for the adhesion response. Each response model is shown with its regressors and parameter values. For example, the first and second models, which are represented by T and RT, expand to:

$\text{Adhesion}=3.26+0.3244\ast \text{T}$ (5)

and

$\text{Adhesion}=15.41-0.3127\ast \text{RT}$ (6)

The remaining twenty-nine models are expanded in a similar manner.

Using the classical model selection process, any one of the thirty-one models shown in Table A1 is a potential “best” model, depending on the selection criterion in use. In this study, this set of models is subjected to multiple criteria analysis to avoid the risks of the one-criterion model selection approach.

Table A1. Summary of adhesion models.

2) Hardness Response Models

Table A2 shows the summary information of the thirty-one all possible regressions models generated by Minitab 17 for the hardness response. Each response model is shown with its regressors and parameter values. The first two summarised models can be expanded to:

$\text{Hardness}=45.75+0.513\ast \text{T}$ (7)

and

$\text{Hardness}=60.25-0.18\ast \text{RT}$ (8)

The remaining twenty-nine models can be similarly expanded.

Table A2. Summary of hardness models.

Annexure B

Annexure C

The four adhesion and one hardness conforming response surfaces, constructed using Excel, are shown.

1) Adhesion [T, RT, T*RT] (Figure C1)

Figure C1. Showing the adhesion response surface for the model [T, RT, T*RT].

2) Adhesion [T, RT, T*RT, T^{2}] Response Surface (Figure C2)

Figure C2. Showing the adhesion response surface for the model [T, RT, T*RT, T^{2}].

3) Adhesion [T, RT, T*RT, RT^{2}] Response Surface (Figure C3)

Figure C3. Showing the response surface of the adhesion model [T, RT, T*RT, RT^{2}].

4) Adhesion [T, RT, T*RT, T^{2}, RT^{2}] Response Surface (Figure C4)

Figure C4. Showing the conforming response surface for the adhesion model [T, RT, T*RT, T^{2}, RT^{2}].

The four adhesion response surfaces show the expected continuously rising modulus behaviour of rubber skim compounds whose specification is synthetic rubber based.
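In the study, conformity is judged by inspecting the plotted surfaces. One possible way to screen candidates numerically, sketched below in Python, is to evaluate a model on a grid and check that the predicted response never decreases as cure time increases. Taking "continuously rising" to mean non-decreasing along the T axis is an assumption made for this sketch. The grid ranges and the `falling` model are also assumptions. Equation (5) is used only because its coefficients are printed in Annexure A, and passing this check alone does not make a model conforming in the sense used here.

```python
import numpy as np


def rises_with_cure_time(model, T_grid, RT_grid):
    """True if the predicted surface never decreases as T increases, at every RT."""
    T, RT = np.meshgrid(T_grid, RT_grid)   # rows follow RT, columns follow T
    surface = np.asarray(model(T, RT), dtype=float)
    return bool(np.all(np.diff(surface, axis=1) >= 0.0))


# Hypothetical region of interest (not taken from the dataset).
T_grid = np.linspace(20.0, 40.0, 21)   # cure time, minutes
RT_grid = np.linspace(5.0, 15.0, 11)   # rubber thickness, mm


def rising(T, RT):
    """Adhesion model of Equation (5)."""
    return 3.26 + 0.3244 * T


def falling(T, RT):
    """Made-up model that decreases with cure time, for contrast."""
    return 20.0 - 0.1 * T + 0.0 * RT


print(rises_with_cure_time(rising, T_grid, RT_grid))   # True
print(rises_with_cure_time(falling, T_grid, RT_grid))  # False
```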

5) Hardness [T, RT, T*RT, T^{2}]

The response surface for [T, RT, T*RT, T^{2}] is the only conforming response surface for hardness among the thirty one hardness models and is shown in Figure C5.

Figure C5. Showing the hardness response surface from the single conforming response surface model.

The hardness behaviour of the cover compound is as expected from a natural rubber based compound designed for hardness.
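With four conforming adhesion models and one conforming hardness model, the set of model pairs to be optimised is their Cartesian product, which gives four pairs. A short Python sketch of that bookkeeping follows; the variable names are illustrative only.

```python
from itertools import product

# Conforming models listed in this annexure, identified by their regressors.
adhesion_models = [
    ("T", "RT", "T*RT"),
    ("T", "RT", "T*RT", "T^2"),
    ("T", "RT", "T*RT", "RT^2"),
    ("T", "RT", "T*RT", "T^2", "RT^2"),
]
hardness_models = [("T", "RT", "T*RT", "T^2")]

# Every adhesion model is paired with every hardness model.
pairs = list(product(adhesion_models, hardness_models))

for number, (adhesion, hardness) in enumerate(pairs, start=1):
    print(f"Pair {number}: [{', '.join(adhesion)}] vs. [{', '.join(hardness)}]")
```

Had more than one hardness model conformed, the same product would grow accordingly, for example to eight pairs with two hardness models.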

Annexure D

1) Response surfaces, data matrices and determination of results

a) Data Matrices

i) Adhesion [T, RT, T*RT] (Table D1)

Table D1. Showing the data matrix of the adhesion model [T, RT, T*RT].

When the data matrix for the adhesion model [T, RT, T*RT] is overlaid on the data matrix of a hardness model with a conforming response surface, the overlap gives the wanted region with the optimum cure time per rubber thickness results.

ii) Hardness [T, RT, T*RT, T^{2}] (Table D2)

Table D2. Showing the data matrix of the hardness model [T, RT, T*RT, T^{2}].

The hardness data matrix for the model [T, RT, T*RT, T^{2}] is overlaid on the data matrix of any conforming adhesion response model to give the optimum region with the cure time per rubber thickness results.

Table D3 shows the region in red in which both the adhesion and hardness are within the customer specified region of adhesion ≥ 12 N/mm and hardness ≥ 60 Shore A.

Table D3. Showing the customer specified region in red.
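The overlaying of two data matrices can be sketched in Python as an element-wise test of both customer specifications on a common grid. The sketch below is an illustration only. It uses the single-regressor models of Equations (5) and (7) because their coefficients are printed in Annexure A, not the conforming models of this pair. The grid is hypothetical, and taking the optimum as the shortest feasible cure time for each rubber thickness is an assumption of the sketch.

```python
import numpy as np


def overlay(adhesion_model, hardness_model, T_grid, RT_grid,
            min_adhesion=12.0, min_hardness=60.0):
    """Boolean matrix (rows: RT, columns: T) that is True where both specs hold."""
    T, RT = np.meshgrid(T_grid, RT_grid)
    adhesion = adhesion_model(T, RT)   # adhesion data matrix, N/mm
    hardness = hardness_model(T, RT)   # hardness data matrix, Shore A
    return (adhesion >= min_adhesion) & (hardness >= min_hardness)


def shortest_cure_time(region, T_grid, RT_grid):
    """Smallest feasible T for each RT, or None when no grid point qualifies."""
    times = {}
    for row, rt in zip(region, RT_grid):
        feasible = np.asarray(T_grid)[row]
        times[float(rt)] = float(feasible.min()) if feasible.size else None
    return times


def adhesion_eq5(T, RT):
    return 3.26 + 0.3244 * T    # Equation (5)


def hardness_eq7(T, RT):
    return 45.75 + 0.513 * T    # Equation (7)


# Hypothetical grid: cure time 20 to 35 min in 1 min steps, three thicknesses.
T_grid = np.arange(20, 36)
RT_grid = np.array([5.0, 10.0, 15.0])

region = overlay(adhesion_eq5, hardness_eq7, T_grid, RT_grid)
print(shortest_cure_time(region, T_grid, RT_grid))
# {5.0: 28.0, 10.0: 28.0, 15.0: 28.0}
```

The `region` matrix plays the role of the cells marked in red in the overlaid table. With models that depend on RT, the shortest feasible cure time would generally differ from one thickness to the next.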

2) Pair 2: [T, RT, T*RT, T^{2}] vs. [T, RT, T*RT, T^{2}]

a) Response Surfaces

i) Adhesion [T, RT, T*RT, T^{2}] (Figure D1)

Figure D1. Showing the adhesion response surface for the model [T, RT, T*RT, T^{2}].

ii) Hardness [T, RT, T*RT, T^{2}] (Figure D2)

Figure D2. Showing the hardness response surface for [T, RT, T*RT, T^{2}].

b) Data Matrices

i) Adhesion [T, RT, T*RT, T^{2}] (Table D4)

Table D4. Summarising model selection results split between fit to data, prediction and conforming response surface.

ii) Hardness [T, RT, T*RT, T^{2}] (Table D5 and Table D6)

Table D5. Showing the data matrix of the hardness model [T, RT, T*RT, T^{2}].

Table D6. Showing the customer specified region of Pair 2 in red.

3) Pair 3: [T, RT, T*RT, RT^{2}] vs. [T, RT, T*RT, T^{2}]

a) Response Surfaces

i) Adhesion [T, RT, T*RT, RT^{2}] (Figure D3)

Figure D3. Showing the response surface of the adhesion model [T, RT, T*RT, RT^{2}].

ii) Hardness [T, RT, T*RT, T^{2}]

The hardness response surface is shown already in Figure D2 above.

b) Data Matrices

i) Adhesion [T, RT, T*RT, RT^{2}] (Table D7)

Table D7. Summarising model selection results split between fit to data, prediction and conforming response surface.

ii) Hardness [T, RT, T*RT, T^{2}] (Table D8 and Table D9)

Table D8. Showing the data matrix of the hardness model [T, RT, T*RT, T^{2}].

Table D9. Showing the customer specified region of Pair 3 in red.

4) Pair 4: [T, RT, T*RT, T^{2}, RT^{2}] vs. [T, RT, T*RT, T^{2}]

a) Response Surfaces

i) Adhesion [T, RT, T*RT, T^{2}, RT^{2}] (Figure D4)

Figure D4. Showing the response surface of the adhesion model [T, RT, T*RT, T^{2}, RT^{2}].

ii) Hardness [T, RT, T*RT, T^{2}]

The hardness response surface is shown in Figure D2 above.

b) Data Matrices

i) Adhesion [T, RT, T*RT, T^{2}, RT^{2}] (Tables D10-D12)

Table D10. Showing the data matrix of the adhesion model [T, RT, T*RT, T^{2}, RT^{2}].

Table D11. Showing the data matrix of the hardness model [T, RT, T*RT, T^{2}].

Table D12. Showing the customer specified region of Pair 4 in red.

2) Results

See Table D13.

Table D13. Showing rubber thickness-cure time optimum combinations for the four pairs of adhesion vs. hardness models. (a) Pair 1: [T, RT, T*RT] vs. [T, RT, T*RT, T^{2}]; (b) Pair 2: [T, RT, T*RT, T^{2}] vs. [T, RT, T*RT, T^{2}]; (c) Pair 3: [T, RT, T*RT, RT^{2}] vs. [T, RT, T*RT, T^{2}]; (d) Pair 4: [T, RT, T*RT, T^{2}, RT^{2}] vs. [T, RT, T*RT, T^{2}].
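The final step of the multiple model approach, averaging the results of the four pairs, can be sketched in Python as follows. The cure times in the example are placeholders invented for the illustration and are not the values of Table D13; the function name `average_over_pairs` is likewise hypothetical.

```python
def average_over_pairs(results_by_pair):
    """Average the optimum cure time per rubber thickness across all pairs.

    Only thicknesses for which every pair reports a result are averaged.
    """
    per_pair = list(results_by_pair.values())
    common = set(per_pair[0])
    for result in per_pair[1:]:
        common &= set(result)
    return {
        rt: sum(result[rt] for result in per_pair) / len(per_pair)
        for rt in sorted(common)
    }


# Placeholder optimum cure times (minutes) keyed by rubber thickness (mm).
results_by_pair = {
    "Pair 1": {5.0: 26.0, 10.0: 28.0},
    "Pair 2": {5.0: 27.0, 10.0: 29.0},
    "Pair 3": {5.0: 26.0, 10.0: 30.0},
    "Pair 4": {5.0: 25.0, 10.0: 29.0},
}

print(average_over_pairs(results_by_pair))  # {5.0: 26.0, 10.0: 29.0}
```

Keeping the per-pair results alongside the average preserves the transparency of the approach, since the spread between pairs shows how much the choice of model affects the final answer.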

Cite this paper

Pavolo, D. and Chikobvu, D. (2019) A Practical Solution to the Small Sample Size Bias and Uncertainty Problems of Model Selection Criteria in Two-Input Process Multiple Response Surface Methodology Datasets. Open Journal of Statistics, 9, 109-142. doi: 10.4236/ojs.2019.91010.

References

[1] Hill, W.J. and Hunter, W.G. (1966) A Review of Response Surface Methodology: A Literature Survey. Technometrics, 8, 571-590.

https://doi.org/10.2307/1266632

[2] Mukhopadhyay, S. and Khuri, A.I. (2010) Response Surface Methodology. WIREs Computational Statistics, 2, 128-149.

https://doi.org/10.1002/wics.73

[3] Khuri, A.I. (1988) The Analysis of Multiresponse Experiments: A Review. Technical Report 302, Department of Statistics, University of Florida, Gainesville, FL32611.

[4] Khuri, A.I. (2017) A General Overview of Response Surface Methodology. Biometrics & Biostatistics International Journal, 5, Article ID: 00133.

https://doi.org/10.15406/bbij.2017.05.00133

[5] Myers, R.H., Khuri, A.I. and Carter Jr., W.H. (1989) Response Surface Methodology: 1966-1988. Technometrics, 31, 137-157.

[6] Fujikoshi, Y. and Satoh, K. (1997) Modified AIC and Cp in Multivariate Linear Regression. Biometrika, 84, 707-716.

https://doi.org/10.1093/biomet/84.3.707

[7] Rawlings, J.O., Pantula, S.G. and Dickey, D.A. (1998) Applied Regression Analysis: A Research Tool. 2nd Edition, Springer-Verlag, New York Inc., New York.

https://doi.org/10.1007/b98890

[8] Sugiura, N. (1978) Further Analysis of the Data by Akaike’s Information Criterion and the Finite Corrections. Communications in Statistics A, 7, 13-26.

https://doi.org/10.1080/03610927808827599

[9] Moral-Benito, E. (2015) Model Averaging in Economics: An Overview. Journal of Economic Surveys, 29, 46-75.

https://doi.org/10.1111/joes.12044

[10] Hjort, N. and Claeskens, G. (2003) Frequentist Model Average Estimators. Journal of the American Statistical Association, 98, 879-899.

https://doi.org/10.1198/016214503000000828

[11] Danilov, D. and Magnus, J.R. (2004) On the Harm That Ignoring Pretesting Can Cause. Journal of Econometrics, 122, 27-46.

https://doi.org/10.1016/j.jeconom.2003.10.018

[12] Seghouane, A.K. and Bekara, M. (2004) A Small Sample Model Selection Criterion Based on the Kullback Symmetric Divergence. IEEE Transactions on Signal Processing, 52, 3314-3323.

https://doi.org/10.1109/TSP.2004.837416

[13] Mallows, C.L. (1973) Some Comments on Cp. Technometrics, 15, 661-675.

[14] Akaike, H. (1974) A New Look at Statistical Model Identification. IEEE Transactions on Automatic Control, 19, 716-723.

https://doi.org/10.1109/TAC.1974.1100705

[15] Schwarz, G. (1978) Estimating the Dimension of a Model. The Annals of Statistics, 6, 461-464.

https://doi.org/10.1214/aos/1176344136

[16] Steel, M.F.J. (2018) Model Averaging and Its Use in Economics. arXiv:1709.88221v2[stat.AP]

[17] Hurvich, C. and Tsai, C. (1989) Regression and Time Series Model Selection in Small Samples. Biometrika, 76, 297-307.

https://doi.org/10.1093/biomet/76.2.297

[18] McQuarrie, A.D.R. and Tsai, C.L. (1998) Regression and Time Series Model Selection. World Scientific, Singapore.

[19] Hannan, E.J. and Quinn, B.G. (1979) The Determination of the Order of an Autoregression. Journal of the Royal Statistical Society, Series B, 41, 190-195.

[20] Sawa, T. (1978) Information Criteria for Discriminating among Alternative Regression Models. Econometrica, 46, 1273-1291.

https://doi.org/10.2307/1913828

[21] Pan, W. (1999) Bootstrapping Likelihood for Model Selection with Small Samples. Journal of Computational and Graphical Statistics, 8, 687-698.

[22] Xie, T. (2015) Prediction Model Averaging Estimator. Economics Letters, 131, 5-8.

[23] Wit, E., Van den Heuvel, E. and Romeijn, J.-W. (2012) ‘All Models Are Wrong …’: An Introduction to Model Uncertainty. Statistica Neerlandica, 66, 217-236.

https://doi.org/10.1111/j.1467-9574.2012.00530.x

[24] Myers, R.H., Montgomery, D.C. and Anderson-Cook, C.M. (2009) Response Surface Methodology: Process and Product Optimization. 3rd Edition, John Wiley & Sons, Hoboken, New Jersey.

[25] Pavolo, D. and Chikobvu, D. (2018) Optimising Vulcanisation Time of Rubber Covered Mining Conveyor Belts Using Multi-Response Surface Methodology.

[26] Pavolo, D. and Chikobvu, D. (2018) Dealing with the Small Sample Size Credibility Problem of Two-Factor Multiple Response Surface Methodology Problems.

[27] Yuan, Z. and Yang, Y. (2005) Combining Linear Regression Models: When and How? Journal of the American Statistical Association, 100, 1202-1214.

[1] Hill, W.J. and Hunter, W.G. (1966) A Review of Response Surface Methodology: A Literature Review. Technometrics, 8, 571-590.

https://doi.org/10.2307/1266632

[2] Mukhopadhyay, S. and Khuri, A.I. (2010) Response Surface Methodology. Wires Computational Statistics, 2, 128-149.

https://doi.org/10.1002/wics.73

[3] Khuri, A.I. (1988) The Analysis of Multiresponse Experiments: A Review. Technical Report 302, Department of Statistics, University of Florida, Gainesville, FL32611.

[4] Khuri, A.I. (2017) A General Overview of Response Surface Methodology. Biometrics & Biostatistics International Journal, 5, Article ID: 00133.

https://doi.org/10.15406/bbij.2017.05.00133

[5] Myers, R.H., Khuri, A.I. and Carter Jr., W.H. (1989) Response Surface Methodology: 1966-1988. Technometrics, 31, 137-157.

[6] Fujikoshi, Y. and Satoh, K. (1997) Modified AIC and Cp in Multivariate Linear Regression. Biometrika, 84, 707-716.

https://doi.org/10.1093/biomet/84.3.707

[7] Rawlings, J.O., Pantula, G.S. and Dickey, A.D. (1998) Applied Regression Analysis: A Research Tool. 2nd Edition, Springer-Verlag, New York Inc., New York.

https://doi.org/10.1007/b98890

[8] Suguira, N. (1978) Further Analysis of the Data by Akaike’s Information Criterion and the Finite Corrections. Communications in Statistics A, 7, 13-26.

https://doi.org/10.1080/03610927808827599

[9] Moral-Benito, E. (2015) Model Averaging in Economics: An Overview. Journal of Economic Surveys, 29, 46-75.

https://doi.org/10.1111/joes.12044

[10] Hjort, N. and Claeskens, G. (2003), Frequentist Model Average Estimators. Journal of the American Statistical Association, 98, 879-899.

https://doi.org/10.1198/016214503000000828

[11] Danilov, D. and Magnus, J.R. (2004) On the Harm That Ignoring Pretesting Can Cause. Journal of Econometrics, 122, 27-46.

https://doi.org/10.1016/j.jeconom.2003.10.018

[12] Seghouane, A.K. and Bekara, M. (2004) A Small Sample Model Selection Criterion Based on the Kullback Symmetric Divergence. IEEE Transactions Signal Processing, 52, 3314-3323.

https://doi.org/10.1109/TSP.2004.837416

[13] Mallows, C.L. (1973) Some Comments on Cp. Technometrics, 15, 661-675.

[14] Akaike, H. (1974) A New Look at Statistical Model Identification. IEEE Transactions on Automatic Control, 19, 716-723.

https://doi.org/10.1109/TAC.1974.1100705

[15] Schwarz, G. (1978) Estimating the Dimension of a Model. The Annals of Statistics, 6, 461-464.

https://doi.org/10.1214/aos/1176344136

[16] Steel, M.F.J. (2018) Model Averaging and Its Use in Economics. arXiv:1709.88221v2[stat.AP]

[17] Hurvich, C. and Tsai, C. (1989) Regression and Time Series Model Selection in Small Samples. Biometrika, 76, 297-307.

https://doi.org/10.1093/biomet/76.2.297

[18] MaQuarrie, A.D.R. and Tsai, C.L. (1998) Regression and Time Series Model Selection. World Scientific, Singapore.

[19] Hannan, E.J. and Quinn, B.G. (1979) The Determination of the Order of an Autoregression. Journal of the Royal Statistical Society, Series B, 41, 190-195.

[20] Sawa, T. (1978) Information Criteria for Discriminating among Alternative Regression Models. Econometrica, 46, 1273-1291.

https://doi.org/10.2307/1913828

[21] Pan, W. (1999) Bootstrapping Likelihood for Model Selection with Small Samples. Journal of Computational and Graphical Statistics, 8, 687-698.

[22] Xie, T. (2015) Prediction Model Averaging Estimator. Economics Letters, 131, 5-8.

[23] Wit, E., Van den Heuvel, E. and Romeeijn, J.-W. (2012) ‘All Models Are Wrong …’: An Introduction to Model Uncertainty. Statistical Neerlandia, 66, 217-236.

https://doi.org/10.1111/j.1467-9574.2012.00530.x

[24] Myers, R.H., Montgomery, D.C. and Anderson-Cook, C.M. (2009) Response Surface Methodology: Process and Product Optimization. 3rd Edition, John Wiley & Sons, New Jersey, Hoboken.

[25] Pavolo, D. and Chikobvu, D. (2018) Optimising Vulcanisation Time of Rubber Covered Mining Conveyor Belts Using Multi-Response Surface Methodology.

[26] Pavolo, D. and Chikobvu, D. (2018) Dealing with the Small Sample Size Credibility Problem of Two-Factor Multiple Response Surface Methodology Problems.

[27] Yuan, Z. and Yang, Y. (2005) Combining Linear Regression Models: When and How? Journal of the American Statistical Association, 100, 1202-1214.