1. Article Structure
The article begins with three background sections: first a short general background on the University context that led to the course being developed; second a background on the course learning objectives, structure and the topics students chose freely for their individual research; and third a more traditional literature review of the inclusive pedagogy employed for the course. The research method used in this case study was direct observation of the students’ enthusiasm and success, made possible because the teacher was also an experienced educational researcher and the number of students was small.
The article proper then showcases the stages in the investigation of cupcake baking, starting with setting up the experiment and how to judge the cupcakes (measurement system analysis) which are shown in Figure 1.
Figure 1. Example cupcakes.
Once the measurement system was suitable, the process screened the significant cupcake baking factors (e.g. oven temperature) from all the possible factors, and then involved a detailed modelling test and optimization of settings for the best cupcake baking. The article finishes with two conclusions, one for cupcake baking and the other for the benefits of inclusive curriculum design through student-centred learning, especially in STEM subjects. The work is only offered as an encouraging example of curricular techniques to try for greater inclusion.
2. General Background
The University of NSW has developed one of Australia’s largest and most popular higher degree programs in systems engineering and project management, leveraging a close relationship with the Australian Department of Defence (DoD) that formed initially around the delivery of undergraduate programs to the Australian Defence Force Academy in Canberra, but which has since evolved to all aspects of normal university offerings. Most of the higher-degree coursework students are Defence Force members or public servants studying part-time. As a consequence, the technological contexts used are often skewed heavily towards ships, aircraft, missiles, land vehicles and other warfare fields like cybersecurity. Also, the STEM trends in the U.S. DoD are always closely monitored and followed wherever possible, as these fields are relevant to DoD students because of the close alliance between the two countries and the fact that much of Australia’s DoD materiel is procured from the U.S. (Defence, 2014). Notwithstanding these general contextual tendencies, the University has worked to broaden the appeal of its degrees, so as to help attract and retain non-traditional parts of society to STEM subjects and careers. Such appeal is considered critical because Australia has a low population and yet an enormous area to defend, and it has an aging population, meaning that the DoD must compete for a smaller work-ready population, especially the school-leaver population (Defence, 2016: 150-152).
The University of NSW’s Master of Systems Engineering and Master of Project Management are popular coursework higher degrees that have as one of their core subjects an introduction to test and evaluation (T&E). Despite many years of growth in these courses, they did not have any advanced elective subjects in T&E techniques. A recent comparison of Australia’s rigour in T&E with that of the U.S. found that the U.S. DoD had prescribed and implemented new advanced scientific techniques and competencies. As such, UNSW Australia partnered with Air Academy Associates, LLC (AAA) in the U.S., a leading research and training firm in these advanced T&E techniques, to adapt one of their courses to the University’s Master programs. The resulting elective subject, known as ZEIT 8034 Advanced T&E Techniques, was developed between October 2015 and February 2016 and trialed for the first time in Semester One of 2016 (February to June) with 15 volunteer students (4 female, 11 male). The subject has successfully run twice since the inaugural trial course.
3. Background to the Course Objectives
The Advanced T&E Techniques course aims to provide advanced test design and analysis techniques that perform the following three major functions:
screen which factors significantly influence system responses;
model precisely enough the effects of those significant factors to optimize use of a system, particularly limiting variation within quality boundaries; and
validate that both the model and system in the optimized state perform acceptably in representative conditions.
The methods used originate from the design of experiments work by Fisher (1971), who provided much of the structural processes for applying modern statistics. Modern software packages and practical handbooks make these techniques realistic for the average researcher or test practitioner to use. In this course the main textbook for the practical theory is Schmidt and Launsby (2005), while the main software packages used are two add-ons to Microsoft Excel® (XL) known as SPC XL™ and DOE PRO XL™. A second text by Reagan and Kiemele (2008) provides much of the practical instruction on using the software packages to screen, model, optimize and validate performance. A third software package by Phadke Associates Inc., known as rdExpert Lite™, is used for test design of large numbers of factors of varying types having varying levels for each of the factors, mainly because it has advanced algorithms in combinatorial mathematics.
Students do a one-week intensive course period where they are given the theory and use the new techniques in about eight scripted workshops of increasing complexity. Students follow along with the test design and analysis examples to reach the solutions provided. They also work in collaborative groups over this week for about three hours each day to do very practical screening, modelling and performance validation on a toy instructional system known as a “statapult”. The Statapult® catapult is a small wooden catapult that fires rubber balls over a short distance within the classroom as shown in Figure 2.
This learning device has many different factors that affect the distance the balls travel and how that varies between like firings, such that the system has a fairly significant performance variation (i.e. it is difficult to use accurately) unless it is carefully understood and controlled. The teaching device therefore reinforces many of the key themes of the course, such as:
focus on understanding your measurement system before testing and ensure it is adequate;
do a simplistic screening test to focus the later more detailed tests that will give accurate modelling;
validating performance usually requires new test measures (e.g., timeliness and supply can become as important as accuracy); and
there are nearly always multiple solutions in complex systems, and understanding how to optimize from these with multiple input factors and multiple output responses requires multiple constraints and weighted solutions.
Figure 2. Illustration of the Statapult® Catapult system shown in use by one of the authors.
There is a 15-minute video compilation of the course at the following link that shows the Statapult® in use with student testimonials about what they learnt in various parts of the course:
At the end of the intensive week, students propose to the teacher and their peers a system that they will have access to and that they want to screen and model using their new test design and analysis techniques. Topics chosen by the students in this inaugural class are listed in Table 1, where a third of students did work-related topics, just over half did hobbies or interests and the remaining two students (13 percent) chose topics for easy testing.
Such context choices are intended to meet the challenge of making tertiary education internationally inclusive by “developing curricular contexts that extend themselves meaningfully into the personal life-worlds (i.e. environment from the perspective of an individual) of students” (Rasi, Hautakangas, & Väyrynen, 2015: 134). Oral presentation and group discussion of those personal choices to their peers allowed students to “connect the theories, concepts and issues being taught to their life-worlds” (p. 139) and thus be more inclusive in many different ways.
Students undertook their personal research over the following months at work or home with some mentoring at key times by the teacher. Students also undertook a knowledge quiz on the key concepts. Assessment was divided into: 35 percent for the collaborative assignment report concerning the Statapult® system, 15 percent for the computerized knowledge quiz and 50 percent for the individual research and report.
Table 1. Student-centred research topics from the inaugural course.
4. Background to Curriculum Design
The challenge in any new university course is articulated as follows by Tait (2009: 193-194):
“It is fundamental to the tertiary educator’s role to foster the development of a vibrant and supportive learning community through their relationship with students. Students need to know that university lecturers care about them and their learning. This can be evidenced through well-organized courses and materials; interesting, exciting, and fun activities for diverse learners; deep seated knowledge of the unit concepts; and flexibility to accommodate emergent student learning needs.”
One of the most fun and exciting aspects of the collaborative part of the course was validating the performance of groups’ models and systems to penetrate an enemy castle within an acceptable overall time and number of rounds (i.e. balls fired to get a set number of hits). Foam play blocks were taped together to represent walls and under time pressure, students delighted in trying to hit the walls as often as possible, albeit all underpinned by some serious scientific results. A ball having just impacted a castle wall is shown in Figure 3.
Figure 3. A fun collaborative activity was the capstone timing of accurate hits on castle walls.
The course lecturer had significant educational research experience twenty years ago [i] that explored the benefits in tertiary education of structured collaborative learning, multiple teaching and learning styles and variety in contexts. He had remained interested in inclusiveness through texts like Hyter and Turnock (2006) and Hyde, Carpenter and Conway (2013) and, when given this curriculum development opportunity, ensured the pedagogy had the crucial social elements:
“learning is primarily a social, cultural, and interpersonal process that is influenced as much by social, emotional, and cultural factors as by cognitive ones. … This concern for the social context of learning clearly needs to be added to the suggestion that the meaningful learning of complex material (in contrast to the acquisition of isolated information, which in certain cases is still necessary) may be characterized as being active, constructive, cumulative, self-regulated, and goal oriented …The learner-centered orientation inherent in modern views of learning has important implications for instruction at the tertiary level …” [p. 193]
Inclusive education principles were applied to this course design, which for example uses the diverse pedagogy of structured lectures, structured workshops, collaborative and unstructured workshops (i.e., Statapult®), a knowledge quiz, student presentation, and student-centered individual research with mentoring. The benefits of diversity in pedagogy can include more robust conception, greater enjoyment, higher grades than normal, and better inclusiveness. In his research on inclusive curriculum for Australian universities, Ashman (2010) refers to this notion as curriculum differentiation:
“Curriculum differentiation refers to a flexible approach to teaching that addresses the different learning needs of students including learning interests, styles and rates within specific learning contexts. In general terms, the curriculum can target a range of outcomes by concentrating on content mastery (e.g. learning ideas and skills), concept mastery (e.g. systems of knowledge) and process mastery (e.g. research and information management skills).” [p. 670]
Ashman (2010: 677) found, inter alia, significant inclusive benefits in creating “work units that accommodated levels of skill, preferences, and interests”. Udvari-Solner and Thousand as early as 1996 (p. 182) found that “sound theoretical foundations and the use of learner-centered, process-oriented, and communication-based instructional approaches are … promising practices for designing a curriculum that is responsive to the needs of diverse learners.”
The research method used in this case study was direct observation of the students’ enthusiasm and success, made possible because the teacher was also an experienced educational researcher and the number of students was small. There was no baseline comparator for this course; however, the teacher and most students saw examples of all of the aforementioned benefits during this course, when compared to other postgraduate subjects they had taught or taken. Such benefits can be particularly potent when computer-based learning speeds the ability of students to explore concepts and multiple contexts in student-centered ways, and these software packages do this well.
Of all of the possible benefits of diverse pedagogy, student inclusiveness is the most rewarding, because students of lower ability often interact more, sharing their conceptions in less confronting ways than a lecture, while students bring in gender-inclusive and culturally-inclusive examples that are well beyond what the teacher could envisage alone. Simply the act of collaborative learning and presenting back to class has been shown to make STEM subjects more gender-inclusive, as outlined by Wistedt (1998):
“From a gender perspective, variations in ways of knowing a subject are considered to be crucial to the learning process, as are the many styles of understanding and ascribing meaning to the course content. The notion of negotiation (Voigt, 1994; Burton, 1995) directs attention towards the reciprocal nature of knowledge formation. Negotiation calls for two-way communication, for reflection upon the foundations of statements put forward, for the trying out and testing of assumptions, for exploration and synthesis.” [p. 144]
Just why the social aspects of learning are important to women’s learning has always been less important in education than using this knowledge to positive advantage. However, a study by Robichaud et al. (2003) into four contributing factors to worry among university students found that “Women in the sample reported significantly higher levels of thought suppression and negative problem orientation” than men. Collaborative learning helps create interactions around problems and compensate for any one student’s lack of confidence and, as such, might counteract these tendencies. Surprisingly, no studies could be found linking the work on gender differences in worry with collaborative learning.
The gender-inclusive benefit of pedagogy that supports negotiation and interaction is supported by a more recent study by Koppi et al. (2010, especially their Table 1) into Australian university ICT graduates from 21 different universities. Koppi et al. (2010), inter alia, conclude the following:
“A pedagogical approach that is inclusive would include the value and meaning of the technology in the broad context of its human application. Without lessening the technological content, the inclusive approach would relate the technology to everyday usage in society and the benefits afforded. … These considerations also relate to the greater call for relevancy and work-integrated learning that the great majority of survey respondents requested.” [pp. 278-279]
Allowing students to select topics important to them from work or hobbies fundamentally relates what they learn to everyday usage, where they can engage with work colleagues, fellow hobbyists or families to relate at least the benefit of what they have learnt.
The greatest risk in such pedagogy is a potential loss of teaching structure, where some collaborative groups can become inefficient, really weak students can limit their classmates, teachers can become limited by “needy” students, student presentations can either run too long or collapse through a lack of confidence, and sometimes students will have to change topic or re-do large portions of work because they took a wrong direction and it is missed in the freedom of a task. All of these examples occurred in this course but were managed through either intervention or simply accepted by the participants as a reasonable tax for the freedom given to explore and express. Mentoring part-time students on their own work can require phone calls and e-mails on the only weekend they have set aside for their research, and because they chose the topic, the teacher has to ask more questions than they do. Such a reversal of knowledge provided the opportunity to show genuine teaching interest in individual student learning as outlined earlier from Tait (2009: 193-194).
Having the Statapult® learning device for collaborative groups to explore fundamentally deconstructs the classroom pedagogy from lecture-centered to student-centered, invoking more active participation. In collaboration students will represent their group’s views in ways they would not do alone, and will volunteer ideas and ask questions amongst a group of four students that they might never do in front of the whole class. What makes such interaction inclusive is captured by Wistedt (1998: 152) following the examination of inclusive assessment methods applied to postgraduate mathematics, physics and statistics classes:
“Inclusive education appreciates such a variation in perspectives, individual as well as social and cultural. Furthermore, if the students’ appreciation of what is worth knowing, what counts as knowing and what characterizes knowing in an academic setting is inextricably linked to the norms expressed within the social setting in which learning takes place, critical reflection upon this setting is a necessary prerequisite for successful learning and teaching. Opening a dialogue with students who vary in backgrounds, interests and experiences, and with teachers who vary in perspectives and expertise, is one way of realizing an inclusive education.”
Students gain a lot of confidence to analyze a system of their own choice from the structured collaborative learning groups, as well as trust that the teacher is genuine in mentoring without direct control. The trust in mentoring is key, because students do their testing and analysis of a system of their own choice part-time once they return home and they need to feel confident to call the teacher and disclose knowledge forgotten or not understood, possible mistakes and so on. Students who are reluctant to take over their learning at such times first do so in collaborative groups, building confidence from their peers and from the students who are keener on this type of exploratory learning. What breaks down such barriers more than anything else, once they are doing their own research, is that they are the expert on their chosen system, and they enjoy teaching the teacher about it in the process of getting any help they might need.
5. Setting up the Cupcake Experiment
Experimental design for a system process like cupcake baking begins with examining the system through process flows, cause and effect diagrams and classification of factors in accordance with the methods in the course texts (Schmidt & Launsby, 2005; Reagan & Kiemele, 2008). Figure 4 shows the process flow for baking each cupcake. This indicates how each cupcake was baked under the different scenarios; for example, using the same recipe for the batter but altering the variables according to the different levels for each factor during the cooking stages.
In the process of cooking a cupcake there are many factors that could contribute to the success of the baking process. This research only investigated the factors that affect the cooking process and did not look into the effects that the ingredients or the method of batter preparation could have on baking the perfect cupcake. Figure 5 shows the cause and effect diagram for cupcake baking, where the factor designations are:
“X” shows the factors that were deliberately altered as part of screening,
“C” are those factors kept constant, and
“N” are factors treated as experimental noise.
Figure 4. Process flow for baking cupcakes (source: Microsoft Excel™).
Figure 5. Cause and effect diagram for cupcake baking (source: SPC XL™ (iv)).
The experimental design model in Figure 6 shows the eight input factors and three response outputs for the system ready to do a screening test. Shown for the input factors are the high and low settings (i.e., two-level) such as ten to twenty minutes for cooking time and 140 to 180 degrees Celsius for oven temperature. The output responses chosen needed to be measured consistently by judges. The grading index shown in Table 2 was used to ensure all judges were aware of what qualified for each rating, in an attempt to reduce the variation within the measured results. Using this Likert scale approach allows the qualitative data arising from human perspective and preference to be treated somewhat as quantitative data. The “moisture” and “how-well cooked” output scales have an optimum in the centre (3) whereas the optimum for the “appearance” output scale is five.
Figure 6. Experimental Design Diagram for Cupcake Baking (source: Microsoft PowerPoint™).
Standard operating procedures (SOPs) were developed for each of the controlled variables so as to minimize unwanted variation; principal among these was the all-important recipe from Australian Good Taste (2016) and the one home oven as given in Figure 7.
6. How to Judge the Cupcakes: Measurement System Analysis
Figure 7. Oven and recipe used.
Table 2. Grading index for cupcake output response judging.
The repeatability and reproducibility of the cupcake measurement were assessed using the measurement system analysis (MSA) techniques of the course texts to ensure these were accurate, precise and stable. Because the judging criteria are binary data, according to the course texts the number of people multiplied by the number of parts must ideally be greater than or equal to 60. As such, the MSA was conducted with 6 judges (operators) and 10 different bake settings (parts), giving 6 × 10 = 60, and replicated twice to determine the variation. The raw MSA data and ANOVA analysis of the MSA results for “appearance” are shown as an example in Table 3 and Table 4.
The precision-to-total of the ANOVA for the MSA of the “appearance” grading lies between 0.10 and 0.30 and is therefore sufficient to proceed to testing in accordance with the course texts. Also, the resolution (6.8) is greater than five and is therefore adequate to proceed. The operator-to-part interaction is high (96%), showing some bias by judges towards certain cupcakes depending on personal preference. The consistency in judges’ ratings is shown in Figure 8 across the ten bakes, illustrating that judges mainly rated differently in Baking 1 and Baking 2 but consistently thereafter.
Figure 8. Variation in six judges’ ratings of cupcake “Appearance” for ten bakings (source: DOE XL™ (iv)).
Table 3. MSA data for the “Appearance” scale for the six judges (source: SPC XL™ (iv)).
Table 4. ANOVA for MSA for “Appearance” (source: SPC XL™ (iv)).
The other rating scales were also found to be sufficiently accurate and consistent to proceed to the screening test:
“moisture” had a precision-to-total of 0.18 (<0.3) and resolution of 7.6 (>5), and
“how-well cooked” had a precision-to-total of 0.26 (<0.3) and resolution of 5.3 (>5).
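The two MSA acceptance figures quoted above can be related to each other with a short sketch. The article does not state the formulas used by the course texts or by SPC XL, so the definitions below are assumptions: precision-to-total is taken as the ratio of measurement standard deviation to total standard deviation, and resolution as the square root of two times the ratio of product to measurement standard deviation. Under those assumed definitions the reported pairs roughly agree with each other, allowing for rounding.

```python
import math


def resolution_from_precision_to_total(p_tot: float) -> float:
    """Resolution implied by a precision-to-total ratio.

    Assumed definitions (not stated in the article):
      precision-to-total = sigma_measurement / sigma_total
      resolution         = sqrt(2) * sigma_product / sigma_measurement
      sigma_total**2     = sigma_product**2 + sigma_measurement**2
    """
    if not 0.0 < p_tot < 1.0:
        raise ValueError("precision-to-total must lie strictly between 0 and 1")
    product_to_measurement = math.sqrt(1.0 - p_tot ** 2) / p_tot
    return math.sqrt(2.0) * product_to_measurement


def msa_adequate(p_tot: float, resolution: float) -> bool:
    """Apply the two thresholds quoted in the text: P/TOT < 0.3 and resolution > 5."""
    return p_tot < 0.3 and resolution > 5.0


# Values reported in the article for the two remaining rating scales.
reported = {"moisture": (0.18, 7.6), "how-well cooked": (0.26, 5.3)}

for scale, (p_tot, resolution) in reported.items():
    implied = resolution_from_precision_to_total(p_tot)
    verdict = msa_adequate(p_tot, resolution)
    print(f"{scale}: reported {resolution}, implied {implied:.1f}, adequate={verdict}")
```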
7. Screening for Significant Factors in Baking
Using the mixed-method flowchart of the course text, the screening design selected is the Taguchi L12 test design for evaluating eight factors, each at two levels (i.e., high and low values only) with four or more repetitions. This screening method is capable of dealing with up to 11 factors without increasing the number of test runs. The final test design is shown in Table 5 where results are populated three times, once for each output response: “moisture”, “appearance” and “how-well cooked”. Hence a total of 12 × 4 = 48 bakes were necessary, with three responses for each giving 144 data points, each involving six judges for a total of 864 ratings. Yes, a lot of cupcakes were consumed!
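As a rough illustration of why a 12-run array can carry up to 11 two-level factors, the sketch below builds a 12-run Plackett-Burman array, which is commonly described as equivalent to the Taguchi L12 up to relabelling of rows and columns. It is not the layout of Table 5: the run order, the column-to-factor assignment and the function names are assumptions made for illustration. The sketch also repeats the test-count arithmetic from the paragraph above.

```python
from itertools import combinations

# Standard 12-run Plackett-Burman generator row (+1 = high level, -1 = low level).
GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def twelve_run_design(n_factors: int) -> list[list[int]]:
    """Return a 12-run, two-level screening array for up to 11 factors."""
    if not 1 <= n_factors <= 11:
        raise ValueError("a 12-run array supports between 1 and 11 factors")
    # Eleven cyclic shifts of the generator row, then a final all-low run.
    rows = [GENERATOR[-shift:] + GENERATOR[:-shift] for shift in range(11)]
    rows.append([-1] * 11)
    return [row[:n_factors] for row in rows]


def is_orthogonal(design: list[list[int]]) -> bool:
    """True when every column is balanced and every pair of columns is uncorrelated."""
    columns = list(zip(*design))
    balanced = all(sum(column) == 0 for column in columns)
    uncorrelated = all(
        sum(a * b for a, b in zip(first, second)) == 0
        for first, second in combinations(columns, 2)
    )
    return balanced and uncorrelated


design = twelve_run_design(8)  # eight cupcake baking factors
print(len(design), "runs, orthogonal:", is_orthogonal(design))

# Test-count arithmetic from the text.
repetitions, responses, judges = 4, 3, 6
bakes = len(design) * repetitions
data_points = bakes * responses
ratings = data_points * judges
print(bakes, data_points, ratings)
```

Dropping unused columns keeps the remaining columns orthogonal, which is why eight factors cost no more runs than eleven.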
The two-level design for screening captures the linear effects of each factor independent of the other factors. This is adequate for a screening test design since the investigation is to determine the factors that have a significant effect on the perfect cupcake. The insignificant factors can then be screened out and the significant factors can be investigated more thoroughly using a three-level test design to determine any quadratic and interaction effects. Most of the factors for this test are discrete and qualitative except for temperature, fill-of-pan and cooking time which are continuous.
Table 5. Taguchi L12 screening design used for cupcake screening (source: DOE XL™ (iv)).
Four tools are predominantly used from DOE PRO XL® to screen from the results: marginal means plots, multiple response regression, Pareto of regression coefficients and a multiple response optimizer. Since there are three output responses and both the average and variation in each distribution have to be considered, some 24 analyses occurred (four tools by three responses by two statistics); hence only an example of each is shown here.
The marginal means plot of the absolute rating value of “moisture” produced by the eight input factors at their high and low values is shown in Figure 9. It shows that Preheat Oven, Oven Setting, Size-of-Pan and Cooking Time affect the perfect cupcake, with Size-of-Pan having the greatest effect, whereby the smallest pan drives a higher moisture rating while the largest pan drives a lower moisture rating. The marginal means plots for moisture variation are not shown, but six of the eight input variables affect the variation in moisture of each cupcake, with Oven Setting and Fill-of-Pan having the least influence.
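A marginal means plot is built from a simple calculation: the average response over all runs at a factor's low level compared with the average over all runs at its high level. The sketch below shows that calculation for one factor. The ratings are hypothetical numbers invented for illustration, not data from Table 5, and the coding of the small pan as -1 is likewise an assumption.

```python
from statistics import mean


def marginal_means(levels: list[int], responses: list[float]) -> tuple[float, float]:
    """Mean response at the low (-1) and high (+1) level of one factor."""
    low = [r for level, r in zip(levels, responses) if level == -1]
    high = [r for level, r in zip(levels, responses) if level == 1]
    return mean(low), mean(high)


def main_effect(levels: list[int], responses: list[float]) -> float:
    """Change in mean response when the factor moves from low to high."""
    low_mean, high_mean = marginal_means(levels, responses)
    return high_mean - low_mean


# Hypothetical example: -1 = small pan, +1 = large pan (assumed coding).
size_of_pan = [-1, -1, -1, 1, 1, 1]
moisture = [3.5, 4.0, 3.5, 2.5, 2.0, 3.0]  # invented ratings, not article data

low_mean, high_mean = marginal_means(size_of_pan, moisture)
print(f"small pan mean {low_mean:.2f}, large pan mean {high_mean:.2f}")
print(f"main effect {main_effect(size_of_pan, moisture):.2f}")
```

A steep line between the two means on the plot corresponds to a large main effect; a flat line suggests the factor can be screened out.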
The multiple response regression analysis for the absolute values of each output cupcake rating is shown in part in Table 6, where the two-tailed significance of each factor is shown with anything of significance (p < 0.05) shown in red and anything likely to be significant (0.05 < p < 0.1) shown in blue. The size of the non-dimensional coded coefficients directly illustrates the linear size of each factor’s effect relative to one another, and this is shown graphically for the cupcake “appearance” rating in Figure 10; again colour coding shows significance, and the oven setting and size-of-pan have the greatest effect.
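Coded coefficients are comparable across factors because each factor is rescaled so that its low setting is -1 and its high setting is +1. The sketch below shows that conventional coding using the two ranges given earlier (140 to 180 degrees Celsius and 10 to 20 minutes). The function names and the example coefficient are hypothetical, and the article does not show how DOE PRO XL decodes internally, so this is only the usual textbook transformation.

```python
def to_coded(value: float, low: float, high: float) -> float:
    """Map a natural setting onto the coded scale where low = -1 and high = +1."""
    centre = (high + low) / 2
    half_range = (high - low) / 2
    return (value - centre) / half_range


def to_natural(coded: float, low: float, high: float) -> float:
    """Inverse of to_coded: map a coded value back to natural units."""
    centre = (high + low) / 2
    half_range = (high - low) / 2
    return centre + coded * half_range


def natural_slope(coded_coefficient: float, low: float, high: float) -> float:
    """Convert a coded linear coefficient into a slope per natural unit."""
    return coded_coefficient / ((high - low) / 2)


OVEN_TEMP = (140.0, 180.0)  # degrees Celsius, from the screening settings
COOK_TIME = (10.0, 20.0)    # minutes, from the screening settings

print(to_coded(160.0, *OVEN_TEMP))   # centre of the temperature range
print(to_coded(12.5, *COOK_TIME))    # a quarter of the way up the time range
print(to_natural(1.0, *OVEN_TEMP))   # high temperature setting

# Hypothetical coded coefficient of 0.4 rating points for oven temperature.
print(natural_slope(0.4, *OVEN_TEMP))  # rating points per degree Celsius
```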
The coded regression table (Table 6) can be decoded by DOE PRO XL® to provide dimensional equations for both the rating absolute value and the variation. As an example, the simplistic linear equations for the three ratings are as follows, where each factor is abbreviated to a letter (A to H) shown in Table 6:
Figure 9. Marginal means plots for the effect of each input factor on absolute value of cupcake “Moisture” grading (source: DOE XL™ (iv)).
Table 6. Multiple response regression analysis in screening for the absolute values of the three cupcake gradings (source: DOE XL™ (iv)).
Figure 10. Pareto Plot of Input Factor Effects on Absolute Value of Cupcake “Appearance” Grading (source: DOE XL™ (iv)).
“How-well cooked” Rating:
As shown in Table 6, the coefficients of determination for the three equations (i.e., R²) and the values adjusted for sample size (i.e., Adjusted R²) are suitably close to each other (<0.9 rule-of-thumb) and reasonably high (lowest is 0.88), such that the regression models are good fits (>0.7 rule-of-thumb). The tolerance (Tol) in Table 6 is one for all factors, showing the test was orthogonal.
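For readers unfamiliar with the adjustment for sample size, the sketch below applies the standard adjusted R² formula. Only the R² value of 0.88 comes from the text; the number of observations (48) and the number of model terms (8) are assumptions chosen for illustration, since the article does not say which counts the software used.

```python
def adjusted_r_squared(r_squared: float, n_observations: int, n_terms: int) -> float:
    """Standard adjusted R-squared for a model with n_terms predictors plus an intercept."""
    if n_observations - n_terms - 1 <= 0:
        raise ValueError("need more observations than model terms plus one")
    penalty = (n_observations - 1) / (n_observations - n_terms - 1)
    return 1.0 - (1.0 - r_squared) * penalty


# R-squared of 0.88 is the lowest value quoted in the text.
# n_observations=48 and n_terms=8 are assumed, not taken from Table 6.
r2 = 0.88
adj = adjusted_r_squared(r2, n_observations=48, n_terms=8)
print(f"R2 = {r2:.2f}, adjusted R2 = {adj:.3f}, ratio = {adj / r2:.3f}")
```

The adjusted value falls further below R² as terms are added without a matching gain in fit, which is why a large gap between the two is a warning sign.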
The multiple response regression table for variation in each of the responses is not shown but revealed the following regarding the most likely “spread shifters”:
Oven Temperature is likely significant on variation of “Moisture” rating.
Preheat Oven and Fill-of-Pan are significant on variation of “Appearance” rating.
Oven Temperature is significant and Oven Position is likely significant on variation of “How-well-cooked” rating.
The final tool used was the DOE PRO XL® optimizer tool, which involves setting desired constraints, weighted as necessary, to examine ideal settings, noting that at this stage the model is only linear. The optimizer allows both the ideal ratings and a minimization of variance to be targeted; however, for simplicity, only the optimization results for the optimum ratings are shown in Table 7. In this table there are two optimizations, one with all three ratings weighted equally at their ideal rating values and one optimizing “appearance” only. Clearly, if a cupcake just has to look good, a higher baking temperature and longer cook time are likely to give better results. These investigations appear to confirm that there is an optimum temperature between 140 and 180 degrees Celsius and an optimal baking time between 10 and 20 minutes.
Table 7. Example optimized settings following screening test.
There were significant shifts if consistency in the ratings was desirable:
When variance is not as important as the rating values themselves, a lower cooking temperature is the preferred setting; however, when variance is controlled (set to zero) a higher temperature is recommended.
When the baker is only interested in the appearance of the cupcake then a mid-range and an 80% fill is the recommended method; however, if all variances are to be minimized then the fill should be lower (approximately 60%).
Collectively it was determined that the factors required for detailed modelling are Oven Temperature, Pan Fill and Cook Time, while the Position in the Oven, Oven Setting, and Size of Pan can be set to a constant based on their best value. The Preheat Oven and Preheat Pan results were conflicting; therefore these factors are required for modelling in order to appreciate the system accurately. It was decided to combine the Preheat Pan and Preheat Oven into one factor, “Preheat”: if Preheat = 1, the oven and pan are preheated at the cooking temperature for 10 minutes prior to baking; if Preheat = 0.5, the pan and oven are heated at 90 degrees Celsius for 10 minutes before baking; and if Preheat = 0, baking starts with the oven and pan cold. These four factors will be investigated further in modelling.
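The three levels of the combined Preheat factor can be written as a small lookup, which makes explicit that the top level depends on the cooking temperature chosen for that run. The function name and return shape below are hypothetical; only the three level definitions come from the text.

```python
from typing import Optional


def preheat_plan(preheat: float, cooking_temp_c: float) -> tuple[Optional[float], int]:
    """Return (preheat temperature in Celsius or None, preheat minutes) for a Preheat level.

    Levels as described in the text:
      1   -> oven and pan preheated at the cooking temperature for 10 minutes
      0.5 -> oven and pan heated at 90 degrees Celsius for 10 minutes
      0   -> oven and pan start cold
    """
    if preheat == 1:
        return cooking_temp_c, 10
    if preheat == 0.5:
        return 90.0, 10
    if preheat == 0:
        return None, 0
    raise ValueError("Preheat must be 0, 0.5 or 1")


for level in (0, 0.5, 1):
    print(level, preheat_plan(level, cooking_temp_c=160.0))
```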
The settings of the constants were determined from the marginal means plots and the consistent responses in the optimizations to be:
Position in oven: top;
Oven setting: fan forced;
Size of pan: small.
8. Test Design for Detailed Modelling: Just How Many More Cupcakes Have to Be Baked?
A four-factor, three-level modelling design was required to model the system, which according to the course text leads to the Box-Behnken modelling design. The benefit of the Box-Behnken design is a reduced test demand compared to a full factorial test while still providing information on the main effects, two-way interactions and quadratic effects. A disadvantage of this method is that it is unable to show the three-way interactions. The setup of the four-factor Box-Behnken design can be seen in Table 8 and was used to measure all three outputs (“moisture”, “appearance” and “how-well cooked”) with three repetitions of each test case, as recommended by the text.
As recommended by the texts, the origin point is repeated three times to help with orthogonality, at test cases 9, 18 and 27, and the repeats are used to help check the consistency of the testing. Once again, similar to the screening, five judges were used to taste the cupcakes and record their outputs for the “appearance”, “moisture” and “how-well cooked” for each repetition of each cupcake. This time a 0.5 mark was included between each grading to allow for better distinction between similar cupcakes.
Table 8. Four-factor, three-level, 27-test Box Behnken test design used to model in detail the cupcake baking (source: DOE XL™ (iv)).
So in total the number of cupcakes baked for the detailed modelling was 27 by three or 81 bakes, with five scores for each bake making 405 test points―once again a lot of free tasting and advertising of the test rigour process within the workplace!
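The structure and test counts of this design can be reproduced with a minimal Python sketch. It assumes the textbook Box-Behnken construction in coded units (each pair of factors run at -1 and +1 while the others sit at 0, plus repeated origin points). The function name `box_behnken` is hypothetical, and the run order here places the three origin points last rather than at test cases 9, 18 and 27 as in Table 8.

```python
from itertools import combinations, product

def box_behnken(n_factors, n_center):
    """Coded Box-Behnken runs: each factor pair at +/-1, all other factors at 0."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(run)
    # Repeated origin (centre) points, appended at the end in this sketch.
    runs.extend([0] * n_factors for _ in range(n_center))
    return runs

factors = ["Oven Temperature", "Pan Fill", "Cook Time", "Preheat"]
design = box_behnken(len(factors), n_center=3)

n_runs = len(design)                  # test cases in the design
n_bakes = n_runs * 3                  # three repetitions of each test case
n_scores = n_bakes * 5                # five judges score every bake
n_full_factorial = 3 ** len(factors)  # three-level full factorial, for comparison

print(n_runs, n_bakes, n_scores, n_full_factorial)
```

Under these assumptions the design has 27 test cases against 81 for a three-level full factorial without repeats, which is the reduced test demand noted above.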
9. Model of the Cupcake Baking
The multiple-response regression model from the test results is shown in… with the insignificant factors progressively removed for each response output.
The tolerance in Table 9 is either a value of “1” or close to it for each factor and interaction, such that the testing is nearly and sufficiently orthogonal.
The uncoded equations for the three output responses are as follows:
Figure 11. Test consistency at the test space origin for each cupcake rating (source: Microsoft Excel™).
Table 9. Multiple-response regression analysis from test results with insignificant factors and interactions removed (source: DOE XL™ (iv)).
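The tolerance idea can be illustrated with a short Python sketch. It assumes the common definition of tolerance as 1 minus the R-squared obtained when one model term is regressed on all the others, and it is applied to a full second-order model on a coded four-factor Box-Behnken design with three origin points. The factor labels T, F, C and P are hypothetical shorthand, and the values are properties of the design alone, not the figures reported in Table 9 (where insignificant terms were removed).

```python
from itertools import combinations, product
import numpy as np

def box_behnken(n_factors, n_center):
    """Coded Box-Behnken runs: each factor pair at +/-1, all other factors at 0."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1.0, 1.0), repeat=2):
            run = [0.0] * n_factors
            run[i], run[j] = a, b
            runs.append(run)
    runs.extend([0.0] * n_factors for _ in range(n_center))
    return np.array(runs)

X = box_behnken(4, n_center=3)
names = ["T", "F", "C", "P"]  # hypothetical labels: temperature, fill, cook time, preheat

# Columns of a full second-order model: linear, two-way interaction and squared terms.
terms = {n: X[:, i] for i, n in enumerate(names)}
for i, j in combinations(range(4), 2):
    terms[names[i] + "*" + names[j]] = X[:, i] * X[:, j]
for i, n in enumerate(names):
    terms[n + "^2"] = X[:, i] ** 2

def tolerance(target):
    """1 - R^2 from regressing one term on an intercept plus all other terms."""
    y = terms[target]
    others = [col for name, col in terms.items() if name != target]
    A = np.column_stack([np.ones(len(y))] + others)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid / ((y - y.mean()) ** 2).sum())

tolerances = {name: tolerance(name) for name in terms}
for name, tol in tolerances.items():
    print(f"{name:>4s}  {tol:.2f}")
```

In this sketch the linear and interaction terms are fully orthogonal (tolerance of 1), while the squared terms fall somewhat below 1 because they are mildly correlated with one another.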
These equations show three different four-dimensional spaces that are hard to envisage. Fortunately DOE PRO XL™ provides some impressive tools to explore the effect space that has been modelled. By this stage in their course students usually know their test domain and the tools, and they go competently to the optimizer and run “what if” cases. Before showing that though, it is worth showing two graphs that illustrate two aspects of the cupcake baking model. The first in Figure 12 holds constant a cooking time of 15 minutes and a preheated pan in order to show there is a fairly wide optimum of cooking temperature but a narrow optimum in pan fill, whereby the pan needs about a 15 percent air gap from the pan lip.
A second graph in Figure 13 tries to illustrate the complex effect of preheating the pan by holding cooking temperature and cooking time constant. Both the formulae and this “uneven saddle” graph show that preheating is the most complex factor, being the most prevalent quadratic term and interaction term. Preheating was almost discarded in screening to try to get to a three-factor test, and the resultant model justifies the decision to keep it, as it is complex across all three cupcake ratings.
Figure 12. “Appearance” rating for cooking time of 15 minutes and a preheated pan (source: DOE XL™ (iv)).
Figure 13. “Appearance” rating for a fixed cooking time of 15 minutes and cooking temperature of 170 degrees (source: DOE XL™ (iv)).
10. Optimizing the Best Settings for the Cupcake
The optimization tool of DOE PRO XL™ was used to determine the optimal settings for cupcake baking for five cases as shown in Table 10.
From Table 10 we can determine:
When variance is not important (Case 1), Preheat and a short Cook Time are best; however, when variance is to be minimized as well (Case 2), no Preheat, a slightly lower Pan Fill and a long Cook Time are better.
If the baker is only concerned with the “appearance” of the cupcake, for example for a window display, then higher Oven Settings (Cases 4 & 5) are better than when “moisture” and “how-well cooked” matter as well (Cases 1-3).
When variance is not important for “appearance” (Case 4), the same trend identified earlier of Preheat and a short Cook Time is evident, compared to when variance in appearance is to be minimized as well (Case 5), where no Preheat, a slightly lower Pan Fill and a long Cook Time are better.
Case 3 demonstrates the settings to enable the best compromise between all three outputs while also considering the consistency (variance) of all of the outputs. The predicted multiple-response confidence intervals for the Case 3 settings are in Table 11. Looking at “appearance” shows these are a compromise set of settings because the “appearance” is not a perfect “5”.
If the cupcakes are to be iced then “appearance” is not important and only “moisture” and “how-well cooked”, plus the consistency of those two measures, need to be optimized. The ideal settings are shown in Case 6, which are much the same as Case 3 but with a much lower Pan Fill and very good confidence intervals as shown in Table 12.
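The trade-off between the best average rating and the most consistent rating can be sketched in Python as a simple grid search over coded settings. Everything numeric here is an assumption for illustration: the functions `predict_mean` and `predict_sd` and their coefficients are hypothetical stand-ins, not the fitted equations behind Tables 9 and 10, and the penalty weight on standard deviation is only one simple way an optimizer might combine the two goals.

```python
from itertools import product

# Coded factor settings from -1 to +1 in steps of 0.25.
LEVELS = [x / 4 for x in range(-4, 5)]

def predict_mean(temp, fill, cook_time, preheat):
    """HYPOTHETICAL second-order model of one rating (made-up coefficients)."""
    return (4.2 - 0.5 * temp ** 2 - 0.8 * (fill - 0.5) ** 2
            - 0.3 * cook_time + 0.4 * preheat)

def predict_sd(temp, fill, cook_time, preheat):
    """HYPOTHETICAL model of the rating's standard deviation (made-up coefficients)."""
    return 0.5 + 0.3 * preheat - 0.2 * cook_time

def best_settings(weight_on_sd):
    """Grid-search the settings that maximise mean rating minus a variability penalty."""
    def score(settings):
        return predict_mean(*settings) - weight_on_sd * predict_sd(*settings)
    return max(product(LEVELS, repeat=4), key=score)

ignore_variance = best_settings(weight_on_sd=0.0)    # analogous to Cases 1 and 4
minimise_variance = best_settings(weight_on_sd=2.0)  # analogous to Cases 2 and 5

print("mean only:        ", ignore_variance)
print("mean and variance:", minimise_variance)
```

With these made-up coefficients the preferred Preheat and Cook Time settings switch ends of their ranges once variability is penalised, which is the kind of reversal described for the cases above. A real optimizer would work from the fitted models and all three responses.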
Figure 14 shows the different cupcakes baked during modelling to show some of the variety in the different baking settings. From general observations, the mushroom-like muffin top was a result of a full Pan Fill (i.e., 1) and a low-temperature oven. The cupcakes with the burnt edges were a result of the longer cook time, and the spherical top on some was often a result of a hot, preheated cooking temperature. Note also that there is some variance between cupcakes in each row despite the cupcakes in each row being baked under the same settings.
Table 10. Optimal settings for five different cases of what a cook might want.
Table 11. Confidence intervals for all-round best cupcake settings (Case 3) (source: DOE XL™ (iv)).
Table 12. Confidence intervals for the best iced cupcake settings (Case 6) (source: DOE XL™ (iv)).
Figure 14. Cupcake variation.
A new post-graduate tertiary course in advanced test and evaluation techniques (experimental design) provided an opportunity for a more inclusive curriculum through structured collaborative learning on a fun learning device, followed by students having an open choice of a system from their work or personal interests to analyze themselves over the following months with mentoring from their teacher. This extended the curriculum to students’ “life-worlds” as proposed by Rasi et al. (2015). Presenting on that choice to their peers and instructor further empowered students to share their “life-worlds” in ways that leveraged and enhanced the social aspect of learning and formed greater trust with the teacher for the part-time mentoring phase. Because students go on to share their analyses with their work colleagues, hobby friends and family, their new knowledge is reinforced in personal ways entirely consistent with both Constructivist and Vygotsky educational theory (Udvari-Solner & Thousand, 1996). The research method used in this case study was direct observation of the students’ enthusiasm and success, made possible because the teacher was also an experienced educational researcher and the number of students was small. The work is only offered as an encouraging example of curricular techniques to try for greater inclusion.
The showcased student, a female electronics engineer, was able after the fun collaborative learning to bring her passion for cooking into the class and then, over the following months, to conduct 139 individual bakings and 753 judgings, obtaining 2259 judge ratings from among her work colleagues and friends. This enabled her to share her new knowledge of advanced test techniques in a very personal way, which will undoubtedly give her robust and enduring conceptions that she can use to benefit her future test work. Hers was not the only example: another female engineer shared her passion for toy slot cars, and another aspiring female researcher brought her commercial business knowledge into the learning, helping break down difficulties with English as a second language. This case study has reinforced that if STEM subjects are to appeal to non-traditional sources of students, then such structured fun learning and open contextualization are key. In this case, a common cooking effort has been analyzed with advanced test techniques and this should appeal to several non-traditional STEM markets. The social aspect of learning was not only beneficial for females; several fairly reclusive male students blossomed when bringing their hobbies into the class and then their classwork to their hobbies.
There are also other educational aspects at work in the new course as showcased in this article. The ability to explore complex systems with relatively easy-to-learn statistical and experimental design packages involving multiple visual analysis tools is highly effective computer-assisted learning for engineers and project managers, closely analogous to the burgeoning use of finite-element modelling packages in research and teaching in the 1980s and 1990s. As such, the inclusivity of the course is likely to extend to students of lower ability or who are more visual learners.
This case study in new curriculum for a complex STEM subject found the student-centred learning of collaboration, computer-based analysis, and an open student choice of personal research interests to be highly inclusive in the ways proposed by the literature reviewed (Tait, 2009; Ashman, 2010; Koppi et al., 2010), especially for gender (Wistedt, 1998).
4. SPC XL™ and DOE PRO XL™ are copyright Air Academy Associates, LLC, and SigmaZone.com.
5. rdExpert Lite™ is copyright Phadke Associates Inc.
6. Statapult® catapult is a registered trademark of Air Academy Associates, LLC.
 Aldis, G. K., Sidhu, H. S., & Joiner, K. F. (1999). Trial of Calculus and Maple with Heterogeneous Student Groups at the Australian Defence Academy. International Journal of Computer Algebra in Mathematics Education, 6, 167-190.
 Australian Good Taste (2016). Red Velvet cupcakes.
 Churchill, B., Denny, L., & Jackson, N. (2014). Thank God You’re Here: The Coming Generation and Their Role in Future Proofing Australia from the Challenge of Population Ageing. Australian Journal of Social Issues, 49, 373-392.
 Defence Department Australia (2016). Defence White Paper 2016.
 Johnson, R. T., Hutto, G. T., Simpson, J. R., & Montgomery, D. C. (2012). Designed Experiments for the Defense Community. Quality Engineering, 24, 60-79.
 Joiner, K. F., Malone, J., & Haimes, D. (2002). Assessment of Classroom Environments in Reformed Calculus Education. Learning Environments Research, 5, 51-76.
 Koppi, T., Sheard, J., Naghdy, F., Edwards, S. L., & Brookes, W. (2010). Towards a Gender Inclusive Information and Communications Technology Curriculum: A Perspective from Graduates in the Workforce. Computer Science Education, 20, 265-282.
 Lednicky, E. J., & Silvestrini, R. T. (2013). Quantifying Gains Using the Capability-Based Test and Evaluation Method. Quality and Reliability Engineering International, 29, 139-156.
 Rasi, P., Hautakangas, M., & Vayrynen, S. (2015). Designing Culturally Inclusive Affordance Networks into the Curriculum. Teaching in Higher Education, 20, 131-142.
 Tait, K. (2009). Reflecting on How to Optimize Tertiary Student Learning through the Use of Work Based Learning within Inclusive Education Courses. International Journal of Teaching and Learning in Higher Education, 20, 192-197.