Proper management of risks is an important part of ensuring the competitiveness of manufacturing businesses. The rapid pace of change in this economic sector is increasing the complexity of the business environment and creating all kinds of emergent risks that must be taken into account. Digitally organized production based on new technologies is also a challenge, mainly because it changes the working environment. To manage risks adequately, a business must be able to analyze and assess them. Risk analysis is a systematic process of identifying occurrences that could affect an organization’s chances of attaining its goals. This process is essential for all managerial functions of a business, including occupational health and safety (OHS).
Over the past few decades, several methods of risk analysis have been developed for use in manufacturing settings. All have the same aim, namely to identify the risks in the organization so that they can be managed to prevent workplace accidents. These methods have evolved and have contributed substantially to advancing knowledge in the field of OHS. However, despite much effort, the goal of achieving substantial reductions in the frequency of workplace accidents remains elusive. It is increasingly observed that the risk analysis methods and accident models in current use are not in tune with current knowledge in the field of OHS management. To improve workplace safety, other types of factors need to be taken into consideration. The evolution of accident models reflects this, as emphasis has shifted from technical factors to human factors. In addition, organizational factors now need to be considered as emergent risks within businesses.
Most of the conventional methods of risk analysis used in the manufacturing sector take technical elements and the human factors into account. However, the complexity of manufacturing businesses is increasing, and there is a growing need for a method that also takes new realities into account. Conventional methods are not necessarily up to the task of identifying and analyzing risks in complex sociotechnical systems   . Most of the time, these methods are sufficient when dealing with simple risks but seem to be less effective when dealing with organizational or systemic risks. According to current thought, analysis must take into account potentially emergent risks inherent in the organization itself  or in its supervision and management or in governmental legislation and policy  .
Several paths to improved understanding of the challenges associated with risk analysis in complex systems have been proposed in the literature  . In recent years, new models of systemic risk analysis have been developed to consider the sociotechnical system as a whole. Among these, the system-theoretic accident model and processes or STAMP method  , the functional resonance analysis method or FRAM  and the risk management framework model  have been the subject of numerous articles. Although these models differ in several ways, they share a common vision that considers safety to be influenced by all levels of the system.
This situation raises questions regarding the risk analysis methods used in manufacturing settings, in particular whether or not conventional methods alone are adequate for analyzing the emergent risks that manufacturing businesses are now facing. In such companies, the dangers inherent in these risks might be going unnoticed  . It appears that it would be dangerous to rely on these methods alone or to assume that they still effectively take into account the current complexity of businesses  .
In this article, we compare one conventional and one relatively new method of risk analysis and assess the complementarity of the two. We consider resilience engineering as a new paradigm for defining a framework for safety management. We begin by examining how failure mode effect and criticality analysis (FMECA) is applied as a conventional risk analysis method to a specific sector of a manufacturing business. We then examine the application of the systemic method known as FRAM, which is representative of resilience engineering. In the final section, we discuss the relevance of using systemic methods to safety management in the manufacturing sector as well as how the complexity of business nowadays is requiring managers to use a variety of methods to analyze risks effectively.
2. Resilience Engineering as a New Safety Management Paradigm
In recent years, the scientific community has been examining the implications of a new paradigm in OHS management, namely resilience engineering. From a sociotechnical system perspective, resilience means the ability of a system or organization to respond to and recover from a disruptive event while minimizing its loss of dynamic stability  . Resilience needs robustness and flexibility when facing predictable changes and agility to cope with uncertainties  .
Resilience engineering is based on several fundamental principles. To begin with, OHS management is both reactive and proactive, meaning that it takes past events into account but also attempts to identify and even anticipate OHS risks. Next, it is understood that a given event or condition may or may not lead to an accident, depending on performance within the system. OHS management is an integral part of the fundamental preoccupations of an organization, just like the activities that underlie production and profitability. Safety is a prerequisite for productivity and profitability and vice versa. Safety is therefore viewed as a continuous improvement factor and not as a constraint on productivity.
Most conventional approaches to safety management focus on past events. Risk analysis methods are based on identifying the root cause in order to prevent future events, which suggests that risks are predictable. From this perspective, safety management is based on what went wrong. To prevent an occurrence, the focus remains on safety barriers such as formal procedures and on standardizing the way things are done. This might have the effect of decreasing the flexibility and leeway of employees. This is called Safety I.
In contrast, Safety II recognizes the inevitability of performance variability within the system and attempts to take advantage of it, considering it necessary for greater flexibility and better adaptation to the various disruptions to which a business may be subjected. A comparison of Safety I and Safety II reveals their complementarity  .
From the resilience engineering perspective, safety is considered an emergent property of the interactions within the system. Safety is about how the system performs. This highlights the need to focus on the connections between the components of the system and not only on individual agents. Resilience engineering is “about the characteristics of resilient performance per se, how we can recognize it, how we can assess (or measure) it, how we can improve it”.
The ever-increasing complexity of manufacturing environments combined with the emergence of organizational and human-related risks raises several questions about how risks are identified and managed.
If resilience engineering is to fill certain gaps in safety management, the risk analysis method should satisfy the following requirements  :
1) It must be systemic. Analysis of OHS risks cannot be confined to a single element of the sociotechnical system. The method must consider the organization as a whole rather than treat it as an assembly of several components.
2) It must model the normal functioning of the sociotechnical system. Most models focus on the possibility of system failure or of an accident. From a resilience engineering perspective, normal functioning of the system may be the source of success or of failure.
3) It must view accidents and safety as an emergent phenomenon resulting from the normal functioning of the sociotechnical system.
4) It must be able to take human performance into account not in nominal or standardized terms, but in a way that describes performance variability as a manifestation of constant adjustment made by workers as they perform their daily tasks. For this purpose, the positive contribution of workers to the safety of the organization must be known.
In the section that follows, a new method of risk analysis based on the principles of resilience engineering and satisfying the four criteria listed above is presented.
3. The Functional Resonance Analysis Method (FRAM)
Functional resonance analysis is a method of analyzing risk in complex sociotechnical systems  . It uses the principle of resonance to reveal how the variability of one function can affect functions downstream. The daily functions necessary for adequate performance of the system are identified, the performance variability at each of the functions is characterized, the potential variability is defined and interpreted, and finally ways of monitoring and reducing this variability are proposed. The particularity of FRAM is that it is based on the work as it is actually carried out on a daily basis (work-as-done) and not on written procedures or established production sequences (work-as-imagined).
Developing a FRAM model requires following four main steps:
1) Identify and describe functions.
2) Characterize the variability of the functions.
3) Aggregate the variability.
4) Propose means of managing the variability.
Since its development, FRAM has been applied in a variety of ways in various sectors, for example in air-traffic control in Europe and in the pharmaceutical sector. It is also gaining ground in European hospitals, particularly in southern Denmark, where its use began several years ago.
Various studies of the use of FRAM demonstrate the emergent property of accidents when a plurality of factors such as human behavior, technology and organizational characteristics interact within a system  . The very existence of this interaction shows that accidents cannot be the result of a single factor  . Conventional approaches such as “fault tree analysis (FTA), event tree analysis (ETA) and failure mode, effects and criticality analysis (FMECA)” are less and less helpful not only for explaining the composite causes that lead to an accident  but also for understanding how an accident develops.
The advantages and disadvantages of FRAM have been examined in earlier comparisons with other systemic methods. One of these is the sequentially timed events plotting or STEP method, which was found inadequate for the identification of certain causes of accidents  . FRAM appears to be better suited for the analysis and understanding of dynamic and non-linear systems such as sociotechnical systems.
Methods used in human reliability analysis (HRA), including FRAM, have been reviewed in recent years. Since then, FRAM has been used as a method of qualitative analysis in support of quantitative risk analysis, notably in a study of human reliability that showed the importance of factoring qualitative information into risk analysis when calculating human error or reliability.
In the field of railway traffic supervision, FRAM has been compared to FTA in a study of the speed with which train operators detect potential incidents  . In his conclusion, the author emphasized the pitfalls of relying on sequential models alone for the analysis of accidents.
Most of these studies show that the use of FRAM as a complementary method in risk analysis provides insight into the impact of human and organizational factors. The combination of one or more conventional methods with FRAM in a systemic approach is therefore a step towards better understanding of what workers are actually doing on the job  .
Although used mainly in environments with high levels of risk, FRAM has been extended to other sectors, such as manufacturing. Zheng et al. proposed extending FRAM with model checking to improve manufacturing processes. The proposal aimed to identify the paths that may lead to manufacturing risks in terms of product quality. Patriarca et al. also proposed an evolution of FRAM with a semi-quantitative method based on Monte Carlo simulation. Instead of describing the functions with linguistic definitions, a numerical score is assigned to each performance variability state. The results indicate that the method enhances traditional safety assessments.
The methodology used in the present study is described in the section below.
The research methodology used is a case study  carried out on the premises of a manufacturing activity. The two methods selected are failure mode effect and criticality analysis (FMECA, used widely in safety management in this sector) and FRAM, a systemic method representing application of the principles of resilience engineering as just described in Section 3.
The organization that participated in the study has been well established for many years in North America and has over a thousand employees. It is thus a large company, typical of motor vehicle manufacturing. Among the numerous activities of the participant, the corporate managers chose chassis assembly because it represents a critical step in the manufacturing chain. This operation is subject to much variability mainly because it is carried out manually and involves welders who must use a variety of welding methods and processes. When the welds are completed, the chassis is sent to an assembly line where the remaining parts are installed. The studied setting comprised 26 different workstations to which up to 95 welders may be assigned, depending on demand.
The study was carried out in two stages. We began by gathering the data contained in the company’s FMECA. We then developed the FRAM model for this sector of activity.
4.1. Failure Mode Effect and Criticality Analysis (FMECA)
FMECA is a method of risk analysis developed in the USA during the 1960s. Towards the end of the 1970s, it was brought into the automobile industry, notably at Toyota, Nissan, Ford, BMW, Peugeot, Volvo and Chrysler. It subsequently became a practiced and proven method in several other industries around the world. It is described as an inductive method for carrying out qualitative and quantitative analyses of system reliability and safety  . The method comprises examining the causes and consequences of each of the potential failures of a system. It ranks potential failures in terms of the estimated level of risk associated with each (criticality).
There are several types of FMECA, such as process, system and design FMECA. The method is also used in the field of occupational health and safety. The safety FMECA is a method for identifying hazards and analyzing the risks to which the worker might be exposed. Its evaluation criteria are therefore based on the probability that a hazardous situation could arise, the frequency and duration of the operator’s exposure, and the severity of the potential injuries should the risk materialize. The safety FMECA might assist designers in performing a risk assessment at the beginning of the design process. The risk priority number (RPN) is obtained by multiplying the severity of the injury should the failure mode occur, the probability that the failure mode occurs as a result of a specific cause, and the frequency of exposure to the hazard.
We chose FMECA for this case study because it is used widely for OHS risk analysis in the manufacturing sector and the participating company had been using it for this purpose for several years. For the first phase of this study, the company kindly provided us with the results of the analyses conducted by in-house analysts during the period 2013-2015. Although these analyses were carried out for each of the departments and workstations in the production facility, only the data concerning the operation under study were provided. These represent about a hundred risks, all entered into a grid in spreadsheet format.
4.2. Developing FRAM for the Chassis Assembly Department
The FRAM model for chassis assembly had been developed previously in the course of our study on the applicability of FRAM to the manufacturing sector. The information used came from field observations as well as semi-directed interviews conducted with 10 individuals (workers, professionals and managers) who work in this sector. A thorough understanding of the system as a whole is necessary for the development of such a model. For this reason, a variety of individuals were interviewed. The interviews focused on production activities, in order to analyse and understand the organisational environment, and on the description of each aspect of the FRAM: input, output, preconditions, control, time and resources.
The results obtained using the model are presented in the section that follows.
The occupational health and safety department of the participating company began using FMECA in 2013 to document the risks present within the facility and its operations. The results thus obtained were used to identify the various risks within the operation under study. The risks were compiled first by department and then by the tasks performed by welders. A list of risks was thus identified for each task.
The figure below (Figure 1) represents the grid used to perform the analysis. Each of the functions is described in terms of potential failure mode, potential causes of failure and potential effects of failure. Following the description of the function or task, each of the potential failure modes is evaluated using three criteria as described below (Figure 1). The possible failure modes for each enumerated risk are then listed, for example the different circumstances that could lead to a worker falling: an improper manoeuvre, working at heights without a proper anti-fall device, tripping on objects on the ground, and so on. The failure modes may be numerous. Elimination of the risk or control measures might include a program to ensure the use of anti-fall devices, for example wearing a safety harness.
Figure 1. Safety FMECA grid.
To rank the risks thus identified, Safety-FMECA uses a quantitative method of evaluation based on three criteria: Severity, Occurrence and Frequency. Each criterion is defined according to a level and a quantitative factor is assigned to it. Figure 2 below represents the three criteria used by the company under review to perform its risk assessment. Each of the criteria is defined on a ranking scale (Figure 2).
The Risk Priority Number (RPN) is obtained by multiplying the three criteria (Figure 3). As an example, if the severity of a risk is catastrophic (25), the frequency of exposure is low (2) and the probability of occurrence is possible (3), the RPN would be 150 (25 × 2 × 3). The RPN provides a ranking of all the risks identified and allows managers to define priorities for the implementation of control measures. For example, an index greater than 250 indicates an intolerable risk.
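The ranking arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration and not the company's spreadsheet: the function names `risk_priority_number` and `is_intolerable` are hypothetical, and only the worked values (25 × 2 × 3) and the threshold of 250 are taken from the text. The full ranking scales are those of Figure 2 and are not reproduced here.

```python
# Threshold taken from the text: an index greater than 250 is intolerable.
INTOLERABLE_THRESHOLD = 250


def risk_priority_number(severity: int, frequency: int, occurrence: int) -> int:
    """Multiply the three Safety-FMECA criteria to obtain the RPN."""
    return severity * frequency * occurrence


def is_intolerable(rpn: int) -> bool:
    """Return True when the RPN exceeds the intolerable threshold."""
    return rpn > INTOLERABLE_THRESHOLD


# Worked example from the text: 25 x 2 x 3.
rpn = risk_priority_number(severity=25, frequency=2, occurrence=3)
print(rpn, is_intolerable(rpn))  # 150 False
```

Sorting the risks of a task by this number, in descending order, would give the priority list that managers use to schedule control measures.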
The development of the FRAM model required identification of the principal functions necessary for the desired functioning of the system  . Each of these functions is then described in terms of the six aspects of FRAM (input, output, resources, preconditions, time, and control). Table 1 below describes one of the system functions, namely planning the production sequence.
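One way to record such a description is as a small data structure with one field per aspect. The Python sketch below is an assumption made for illustration only: the class name `FramFunction`, its method and the example entries are hypothetical placeholders and do not reproduce the content of Table 1.

```python
from dataclasses import dataclass, field


@dataclass
class FramFunction:
    """A FRAM function described by its six aspects."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    preconditions: list[str] = field(default_factory=list)
    resources: list[str] = field(default_factory=list)
    time: list[str] = field(default_factory=list)
    control: list[str] = field(default_factory=list)

    def undescribed_aspects(self) -> list[str]:
        """List the aspects that have not been filled in yet."""
        aspects = ("inputs", "outputs", "preconditions",
                   "resources", "time", "control")
        return [a for a in aspects if not getattr(self, a)]


# Hypothetical placeholder entries, not the actual content of Table 1.
planning = FramFunction(
    name="Planning the production sequence",
    inputs=["customer orders (placeholder)"],
    outputs=["production sequence (placeholder)"],
)
print(planning.undescribed_aspects())
# ['preconditions', 'resources', 'time', 'control']
```

A helper of this kind could serve as a checklist during interviews, showing which aspects of a function still need to be discussed with the people who perform it.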
Figure 2. Ranking scales.
Figure 3. Risk Priority Number (RPN) scale.
Table 1. Description of the function planning the production sequence.
Since FRAM allows analysis of different scenarios to understand how the system functions and to define its range of variation, various potential scenarios may be analysed to understand the impact that variable inputs to a function would have downstream.
The scenario presented here is that of a wrong part being taken and inserted in the wrong position in the sub-assembly. In this situation, the worker does not realise that the right part has not been used since they are not all identified on the parts trolley. The worker completes the assembly, which is then moved to the next workstation. Table 2 illustrates how variable performance by the worker will have an impact on the system.
In the above case, the function assembly of the roof structure receives an incorrect input, one that does not meet assembly requirements, meaning that the quality of the input per se is inadequate. This non-compliancy might not be detected at the subsequent workstation (i.e. the function finishing assembly of the chassis), since there is no formal monitoring mechanism in place. Not being able to detect this variability could have the following consequences:
・ Increased pressure on workers because of the time required to correct the error. This company operates with TAKT time, which reduces workers’ margin for manoeuvre;
・ Decreased productivity due to the reworking that will be necessary to correct the error;
・ Increased risks associated with the use of tools such as grinders (to cut and remove the part) and welders to redo the work;
・ Increased costs associated with time and parts;
・ Decreased product quality.
Table 2. Potential variability of the function assembly of the roof structure.
The consequences for the rest of the system can be multiple, as much for OHS as for productivity and quality. In this case, rapid detection of the error will allow a reduction of its negative impact on the assembly process; in other words, the effects of performance variability will be buffered. Such variability will be greater where new employees are involved. Based on the results of different scenarios, the business will be able to devise and implement means of reducing the impact of performance variability.
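The downstream reasoning used in this scenario can be sketched as a walk along the couplings between functions. The Python sketch below is a simplified illustration under stated assumptions, not the FRAM procedure itself: the coupling map, the function `affected_downstream` and the function names "picking parts from the trolley" and "installing remaining parts" are hypothetical. Only the names of the functions assembly of the roof structure and finishing assembly of the chassis come from the text. Treating a monitoring check as something that stops the spread completely is a deliberate simplification.

```python
from collections import deque

# Hypothetical couplings: the output of each key is an input of the listed functions.
COUPLINGS = {
    "picking parts from the trolley": ["assembly of the roof structure"],
    "assembly of the roof structure": ["finishing assembly of the chassis"],
    "finishing assembly of the chassis": ["installing remaining parts"],
}


def affected_downstream(source: str, monitored: set[str]) -> list[str]:
    """Return the functions reached by variability from `source`, in traversal order.

    A function in `monitored` detects the variability: it is still reached,
    but the variability is not passed on to the functions downstream of it.
    """
    reached: list[str] = []
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in COUPLINGS.get(current, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            reached.append(nxt)
            if nxt not in monitored:
                queue.append(nxt)
    return reached


# No formal monitoring anywhere: the wrong part travels down the line.
no_check = affected_downstream("picking parts from the trolley", monitored=set())
# With a check at the finishing function, the spread stops there.
with_check = affected_downstream(
    "picking parts from the trolley",
    monitored={"finishing assembly of the chassis"},
)
print(len(no_check), len(with_check))  # 3 2
```

Comparing the two results gives a rough indication of how much a monitoring mechanism at a given workstation would buffer the variability, which is the kind of question the scenario analysis is meant to raise.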
Although the results of this case study demonstrate the applicability of FRAM in the manufacturing sector, the participation of only one business constitutes a major limitation on their interpretation. Our findings cannot be generalised to the manufacturing sector as a whole. In addition, the study was focused on a single operation within the company. It needs to be extended to the whole organisation to gain greater insight not only into the activity examined but also into the interactions between the various operations within the company.
The FMECA method is relatively easy to understand and apply. It provides a chronological overview of the tasks carried out by the workers. However, it has the considerable shortcoming of not being designed to consider human and organizational factors. The very basis of this method is the detection of the possible failure modes of a system without understanding their source. It nevertheless offers the possibility of ranking the risks as a function of specific criteria defined by the business and allowing managers to prioritise their actions.
The results reveal certain advantages of one method over the other. Each method identifies risks that the other overlooks. As an example, installing a wrong part when assembling the chassis will without doubt affect product quality, but will also increase OHS risks considerably. The workers will have to correct the error using procedures that may be improvised. The risks increase if the error is not detected quickly, that is, if assembly does not include organised monitoring and corrective actions have to be carried out on the assembly line. In addition, in the present case, the time constraint is identified in the FRAM analysis but not in the FMECA analysis. The variability of the output of a function due to time could have major repercussions on the rest of the production line. Time constraints come from an organizational decision based on various factors specific to the organization. In the design of the system, the tasks were defined in a given order, taking into account the nominal time required to perform each of them. Modeling the actual operation of the system with FRAM makes it easier to highlight the trade-offs that workers make to achieve the goals. These compromises may be due to time constraints or to a lack of resources and information. In these situations, employees have to adjust the way they do things within these constraints so that organizational goals are met. This corroborates other research on FRAM stating that safety is an emergent property of the system.
These elements cannot be identified using FMECA alone since this method has not been developed to analyze organizational and human factors. On the other hand, FRAM is recognised as a systemic method that allows this type of analysis. FRAM is, however, less effective for detecting risks of a more technical nature, such as those associated with machine operation. The analysis of the various scenarios allows detection of elements that contribute to OHS, quality, productivity and processes weaknesses.
The two methods used for this study come from different approaches. While FMECA comes from the system safety approach, FRAM is derived from the human factors approach. The logic behind each method is different, and the two are complementary rather than alternatives. FMECA is a proven method for technical risk assessment, but it is limited in assessing organizational risks and in considering the actual functioning of the system. System failures can emerge from a technical failure or from the spread of variability throughout the system.
This study thus corroborates the findings of several other studies. It is recognised that FRAM offers advantages over other methods of risk analysis in terms of identifying factors that are not taken into account when using more conventional methods. The usefulness of FRAM has been demonstrated in several fields, including the hospital sector, the petrochemical industry and the aeronautical industry. This study also corroborates several studies on FMECA and the need to address the limits of this method.
The main contribution of the present study is to confirm that FRAM can add value to risk management in the manufacturing environment, which has not yet been the subject of case studies in this respect. Although it has been demonstrated previously that FRAM could be of use in the manufacturing sector, it also appears to offer an advantage over conventional methods in some instances, by considering contextual and operational factors that affect work as it is carried out on a daily basis. Including these organisational factors in risk management is a key element for improving OHS performance since they play a causative role in the occurrence of accidents. In addition, the study was conducted in a manufacturing company that has adopted a LEAN production approach. LEAN is based on the elimination of waste to reduce activities without added value for the customer. One of the objectives of LEAN is to reduce any source of internal or external variability by standardizing processes. Although LEAN and resilience may seem contradictory, studies have demonstrated their synergistic character. The integration of resilience engineering principles into LEAN would allow manufacturing companies to better consider the issues that emerging risks represent from OHS, productivity, quality and logistics perspectives. Our study makes an additional contribution in this direction by addressing OHS risks for employees. By integrating a method such as FRAM into safety management and risk analysis, the company will be able to make the most of this synergy to improve its performance in a sustainable way. Indeed, FRAM makes it possible to take better account of the leeway of employees and of the need for variability in their performance for the maintenance and safety of the system.
In the light of the above, it must be concluded that the applications of FRAM are not limited to OHS management. In this study, elements of productivity and quality were identified, suggesting that FRAM could be used to reveal weaknesses in process design. Variability in human performance also has a major impact on production quality and productivity. Better understanding of the normal functioning of the system can enable managers to identify zones of process fragility. In addition, FRAM allows the design of a wide variety of scenarios, current, potential or hypothetical, as well as the winning conditions. Therefore, it can be an effective tool for improving the robustness of anticipated processes as well as for improving the resilience of a business in the face of unforeseen developments.
7. Conclusion and Future Work
The objective of this study was to compare two methods of risk analysis applied in a manufacturing setting and to assess their complementarity. The comparison of FMECA and FRAM provided insight into the importance of considering organisational factors. Although the validity of many conventional methods of risk analysis has been proven in recent years, they do not by themselves reveal the emergent risks with which many manufacturing businesses are now confronted. These businesses need to consider novel and more systemic approaches that take into account the increasing complexity of their activities and social environment. Indeed, they need to view their operations as sociotechnical systems. The results of this study demonstrate that FRAM can be helpful for this purpose. Manufacturing businesses could use systemic methods as a complement to conventional methods to gain insight into the most determinant aspects of their production systems and into the implications of these in terms of OHS.
The use of FRAM in the manufacturing context warrants more in-depth study to obtain results that can be generalized. In addition, more studies could be conducted to integrate FRAM in lean management as a way to improve the resilience of the system.
The authors thank the Natural Sciences and Engineering Research Council (NSERC), École de technologie supérieure (ETS) and the Association Québécoise pour l’Hygiène, la Santé et la Sécurité (AQHSST) for financial support, and Professor Bryan Boudreau-Trudel for his support and comments during the preparation of this article.