Solving Potential Problems Using Effective Process Failure Mode and Effects Analysis
(February 12, 2019)
Various Indian automotive organizations have adopted Failure Mode and Effects Analysis (FMEA) as an integral part of their new product development process. In spite of this, several obvious problems are noticed only in manufacturing and have to be solved afterwards. This indicates that FMEAs are not utilized effectively.
If FMEAs are effective, the new or unknown defect would not occur, all known defects would be well within the target and no defect would pass to the customer. The Quality of FMEA is determined by the “depth of identification of causes” and “extent of new controls identified through FMEA”.
Failure is a success if we learn from it. Things that can go wrong will go wrong if we don’t prevent them from occurring. But success lies in learning from these failures and preventing them in the future. Every customer complaint, with its cause analysis and problem-solving, presents an opportunity to learn lessons. This paper describes how to draw inputs for FMEA from the “Corrective Action Process” and the “Past Trouble Data Base” (PTDB) document, which contains ‘lessons learned’ and ‘technical know-how’, to ensure that there is systemic action.
A unique way of linking the Process Flow Diagram, Cause and Effect Diagrams, Process FMEA, Error Prevention methodology, Control Plan (aka QCPC), and SOP is also presented in this paper. This paper aims to enhance the method of doing process FMEA to solve potential quality problems.
2. Defect Prevention Philosophy and tools
Table 1. Defect Prevention Philosophy and Tools
Defects are generated if something goes wrong in the process. Unless we control the process it is not possible to achieve a defect-free product. Process control could be error proofing, process audit, preventive maintenance, start-up checklists, operator training and skill evaluation, fixture design etc.
The philosophy of arriving at process controls by creating process understanding and identifying failure modes is shown in Table-1 above.
3. Process Flow Diagram
Conventionally, a Process Flow Diagram (PFD) indicates the sequence of operations, flow symbols, primary and alternate paths, usage of multiple machines, etc. A PFD used for making an FMEA additionally includes the Product Characteristics, and all the Incoming Sources of Variation and Process Characteristics that can affect them. Other information such as cycle time, distance (for material movement), etc. can also be added as needed. The PFD should also include non-manufacturing operations such as storage, inspection, and transportation, which can affect product quality. An example of a PFD is shown in Figure-1.
Foam pasting is one of the operations done manually by an operator. The team has identified the following product characteristics: location of foam, peel-off strength, plastic part free from damage and scratches, free from gaps, presence of all the foam, and right foam usage. Only two of them are shown in the example below.
Figure 1. Example of the Process Flow Diagram (PFD)
Product characteristics are identified by asking “What is the expected outcome of the process?” and “What is not an expected outcome?” The first question is answered by referring to all technical documents for dimensional, functional, engineering, visual, and safety and regulatory characteristics. For example, for the foam pasting operation, the peel-off strength specified in the drawing may be an expected outcome. The second question is answered by referring to all past troubles and technical knowledge. For example, for foam pasting, an outcome that is not expected could be scratches on the plastic part, which gives the product characteristic “plastic part free from scratches”.
3.2 Process Characteristics & Incoming Sources of Variation (ISV)
For each product characteristic, identify the process characteristics and incoming sources of variation (ISVs) separately, as shown in the above example for peel-off strength. Process characteristics can include parameters related to man, method, machine, measurement system, environment, etc. Breaking up the operation into activities helps to identify process characteristics more comprehensively.
3.3 Linkage of product characteristics with subsequent operation or customer
Product characteristics of one station may be considered as Incoming Sources of Variation (ISV) in subsequent operations. For example, in car manufacturing, the windshield aperture dimension is a product characteristic at the body fabrication stage, whereas the same dimension becomes an ISV for the windshield glass fitment stage in the TRIM shop. While auditing the PFDs of previous operations and suppliers, check whether all ISVs of downstream operations are addressed in the respective processes.
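The audit of downstream ISVs described above can be sketched as a simple set comparison. This is only an illustration under stated assumptions: the `Operation` class, its field names, and the sample entries "glass seating" and "sealant viscosity" are hypothetical and do not come from the PFD format used in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    """One PFD step (hypothetical structure, for illustration only)."""
    name: str
    product_characteristics: set = field(default_factory=set)
    incoming_sources_of_variation: set = field(default_factory=set)


def unaddressed_isvs(upstream, downstream):
    """Return the ISVs of `downstream` that no upstream operation
    lists as one of its product characteristics."""
    covered = set()
    for op in upstream:
        covered |= op.product_characteristics
    return downstream.incoming_sources_of_variation - covered


# Windshield example from the text; the second ISV is invented sample data.
body_shop = Operation(
    name="Body fabrication",
    product_characteristics={"windshield aperture dimension"},
)
glass_fitment = Operation(
    name="Windshield glass fitment",
    product_characteristics={"glass seating"},
    incoming_sources_of_variation={
        "windshield aperture dimension",
        "sealant viscosity",
    },
)

gaps = unaddressed_isvs([body_shop], glass_fitment)
print(sorted(gaps))
```

Any ISV reported by such a check would point to an upstream or supplier PFD that still needs the corresponding product characteristic.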
3.4 Flow diagram
The flow diagram clearly indicates the primary path, the secondary path, any outsourced process, and any pre-defined rework process. Symbols should show whether an operation is manual, semi-automatic, or fully automatic, and should cover storage, inspection, delays, multiple machines, etc.
4. Preparation of FMEA
Figure 2. Example of Process FMEA for foam pasting (only a few columns and rows are shown).
4.1 Approach for failure mode identification
4.1.1 Refer to product characteristics
The negative of a product characteristic is considered to be a failure mode. For example, peel-off strength is a product characteristic, and less strength is the failure mode; presence of all foam is a product characteristic, and foam missing is the failure mode; free from gaps is a product characteristic, and a gap is the failure mode. Some parameters considered in DFMEA as a cause of functional failure are considered as failure modes in Process FMEA.
4.1.2 Refer to past trouble database and corrective actions
All past troubles that have generated failure modes should be addressed, including internal and customer complaints. The past trouble database document is updated with every customer complaint. This document includes details such as customer, part name, defect details, temporary actions, validated cause, corrective actions, potential problems that can arise from the action taken, preventive action on the potential problem, lessons learned, systemic actions, etc.
Address failure modes based on corrective actions. For example, suppose there is a leakage problem after final assembly, and problem-solving finds that variation in the roundness of the part is creating the leakage. In the machining operation, roundness variation is then added as a failure mode; the cause for it could be vibration in the machine.
4.2 Approach for Identification of Multiple Effects
For each failure mode, the cross-functional team (CFT) brainstorms the effects on Subsequent Operations (SO), the Customer (CE), the End User (EU), Operator Safety (OS), and the Machine or tool (MC). Each effect is then rated for severity according to the severity ranking standard, and the highest rating is taken as the severity of the failure mode. The Process FMEA team can also refer to the DFMEA to see the impact of the parameter on the function and on the end user. The failure mode of a particular station may be addressed as an effect in previous or supplier operations.
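The rule of rating every effect and carrying the highest number forward can be sketched as follows. The function name and the sample ratings are assumptions made for illustration; the actual ratings come from the severity ranking standard the team follows.

```python
# Effect categories named in section 4.2.
EFFECT_CATEGORIES = ("SO", "CE", "EU", "OS", "MC")


def failure_mode_severity(effect_ratings):
    """Return the severity of a failure mode as the highest rating
    among its rated effects (keys are effect category codes)."""
    if not effect_ratings:
        raise ValueError("at least one effect must be rated")
    unknown = set(effect_ratings) - set(EFFECT_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown effect categories: {sorted(unknown)}")
    return max(effect_ratings.values())


# Hypothetical ratings for the "less peel-off strength" failure mode.
ratings = {"SO": 4, "CE": 7, "EU": 6}
severity = failure_mode_severity(ratings)
print(severity)
```

Keeping the individual ratings, and not only the maximum, preserves the reasoning for later reviews of the FMEA.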
4.3 Approach for cause identification
4.3.1 Refer to Process Characteristics
Refer to the process characteristics column of the PFD to ensure that all these characteristics are considered as causes. These are generally primary-level causes. Incoming sources of variation (ISVs) are generally not considered as causes in FMEA, because FMEA assumes that the incoming material is OK. Exceptions can be made based on historical data or the team's experience during the development stage.
4.3.2 Prepare Cause and Effect Diagrams
Conduct brainstorming and prepare a cause and effect diagram for every failure mode. This helps to ensure that all possible causes related to man, machine, method, measurement system, environment, etc. are considered. Conduct a why-why analysis to identify next-level causes.
4.3.3 Refer to Corrective Actions taken
Address the causes identified while taking corrective actions. For example, in the machining operation, roundness variation is the failure mode and vibration is the cause. Further investigation during problem-solving found that the bearing in the spindle of the machine was worn out. Now vibration, which is the primary cause, and the worn-out spindle bearing, which is the secondary cause, are both added in the cause column. This helps to identify a detection control for spindle vibration, namely regular vibration measurement. It also helps to identify the bearing replacement frequency as a control to keep vibration within limits, which will help to reduce the occurrence of roundness variation.
4.3.4 Refer to Error Identification Checklist
Each operation is divided into activities, as shown in Figure-3 for the foam pasting operation. Activity-wise error identification using the checklist helps to identify several possible errors. The checklist of 16 errors is generic and applicable to almost all human operations, so using it maximizes the identification of errors.
For example, in an operation with a 12-minute cycle, the team identified a total of 138 possible errors using this checklist. Include all the resulting causes, such as the operator forgetting a step or a delay in pasting after removal of the sticker from the adhesive.
Figure 3. Activity-wise Error Identification
4.4 Occurrence Ranking
A distinctive feature of this approach is that the occurrence of the failure mode and the occurrence of each cause are ranked separately. As you can see in the example, the first row is crossed and causes are mentioned in subsequent rows. The first row is dedicated to rating the occurrence of the failure mode. The occurrence number 10 given in the first row is for the failure mode, i.e. less strength. This is justified by data from development trials, where more than 12% of parts were found to have less strength. Standard tables are available for occurrence ranking.
For an existing product, the failure mode ranking is based on actual defect data. Where cause-wise data collection is practically possible, a cause-wise occurrence ranking is given. Otherwise, the occurrence number for a cause is assigned by team judgment, considering current prevention controls.
For products in the development (pre-launch) stage, occurrence ranking is based on process stability and capability studies. Before the pre-launch stage, occurrence ranking is estimated based on similar product/process data and the team's judgment on the planned controls.
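Ranking occurrence from defect data amounts to a lookup in the occurrence table of the ranking standard in use. The sketch below shows only the lookup mechanics. The table values are placeholders assumed for illustration and must be replaced with the team's standard table; only the top band is consistent with the example in the text, where more than 12% defective parts is ranked 10.

```python
def occurrence_rank(defect_rate, table):
    """Look up an occurrence ranking for a defect rate (fraction of parts).

    `table` is a list of (minimum_defect_rate, rank) pairs sorted from the
    highest threshold to the lowest; the first band that fits is returned.
    """
    for minimum_rate, rank in table:
        if defect_rate >= minimum_rate:
            return rank
    raise ValueError("defect rate is below every band in the table")


# PLACEHOLDER bands, assumed for this sketch only.
# Replace with the occurrence table of the ranking standard actually used.
PLACEHOLDER_TABLE = [
    (0.10, 10),
    (0.05, 9),
    (0.02, 8),
    (0.01, 7),
    (0.0, 6),  # catch-all band so that the sketch always returns a rank
]

# Development trial data from the text: more than 12% of parts with less strength.
failure_mode_occ = occurrence_rank(0.12, PLACEHOLDER_TABLE)
print(failure_mode_occ)
```

The same lookup can be applied to cause-wise defect data where such data can be collected; otherwise the cause ranking remains a team judgment, as stated above.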
4.5 Types of controls and their impact
Controls are identified depending on the associated risk. If a parameter affects safety, the associated risk is high. Detailed analysis of the causes of such parameters, followed by identification and implementation of controls, will reduce the occurrence of defects. Detection control is about checking and then taking action. Controls that reduce occurrence are prevention controls. One cannot prevent failure modes directly; it is only possible to detect them, or to detect or prevent their causes. For example, training the operator to follow the SOP is a cause prevention control. A process audit to check the operator's adherence to the SOP is a cause detection control. The three types of controls (detection of failure mode, detection of causes, and prevention of causes) are explained below.
4.5.1 Detection of failure mode
In order to protect the customer from receiving defects, we ensure that failure modes are detected at the same stage or at the earliest possible stage, using subjective inspection, gauges, automatic controls, etc. For example, a tube is checked for cracks by non-destructive testing, and if there is any crack, the tube automatically falls into the rejection bin.
Improving the detection of failure mode reduces the chance of defective part going to the customer. By improving detection of failure mode, DET ranking of failure mode reduces thereby reducing Risk Priority Number (RPN) for the failure mode. If there are multiple detections for single failure mode, best detection is considered for evaluating risk. So whenever we improve detection of failure mode it is important to re-examine the impact on (a) DET ranking of failure mode (b) RPN of failure mode.
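The effect of an added detection control on the RPN can be checked with a short calculation. The sketch assumes the conventional definition RPN = SEV × OCC × DET and applies the rule above that the best (lowest) detection ranking is used when several detections exist. The function name and the sample rankings are hypothetical.

```python
def rpn(severity, occurrence, detection_rankings):
    """Risk Priority Number, using the best (lowest) of the detection
    rankings available for the failure mode."""
    if not detection_rankings:
        raise ValueError("at least one detection ranking is required")
    best_detection = min(detection_rankings)
    return severity * occurrence * best_detection


# Hypothetical rankings: visual check only (DET 8), then a gauge added (DET 4).
before = rpn(severity=7, occurrence=10, detection_rankings=[8])
after = rpn(severity=7, occurrence=10, detection_rankings=[8, 4])
print(before, after)
```

Only DET and RPN of the failure mode change in this case; its occurrence ranking is unaffected, because detecting a failure mode does not stop it from being generated.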
4.5.2 Detection of causes
Detection of the failure mode and detection of causes are handled differently, with separate rows for their DET rankings. Detection of a cause can be by subjective inspection, or it can be based on an alert such as a flashing light or buzzer. Since the cause gets detected, there is less chance of the failure mode occurring.
For example, measuring the operator's painting skill through an evaluation sheet at the initial stage and at defined intervals will reduce the occurrence of manual paint defects (failure mode). Similarly, by measuring coolant concentration (cause), it is possible to take action before the surface finish (failure mode) goes beyond the specification limit.
Improving detection of the cause will reduce its DET ranking, thereby reducing its RPN. The OCC and RPN of the failure mode will also reduce. So, whenever we improve detection of a cause, it is important to re-examine the impact on (a) DET ranking of the cause (b) RPN of the cause (c) Occurrence ranking of the failure mode (d) RPN of the failure mode.
4.5.3 Prevention of causes
A failure mode cannot be prevented directly. Its occurrence can be reduced or eliminated by controlling its causes. To prevent a cause, it is important to identify its sub-causes and control them, either through detection or through prevention. When the main cause is prevented, its occurrence reduces, and hence the occurrence of the failure mode reduces.
For example, the failure mode is less peel-off strength, and one of the causes is the operator selecting the wrong roller. When next-level causes of wrong roller selection are analyzed, the team identifies that the operator is not aware of the correct roller, the operator is not able to identify the correct roller, etc. So, controls like operator training or color coding of rollers will reduce the occurrence of the operator selecting the wrong roller. In turn, this will reduce the occurrence of less strength. However, if the operator still selects the wrong roller, there may not be any detection, hence the detection number is considered as 10.
Some prevention controls eliminate the occurrence of the cause as well as the failure mode. For example, a clip inserted in the wrong location is a failure mode caused by the operator's forgetfulness. This is prevented completely through fixture design, which makes it impossible for the operator to make this error. In such cases, the OCC and DET rankings are considered as “1” for this cause.
Whenever we improve prevention of cause, it is important to re-examine the impact on (a) OCC ranking of the cause (b) RPN of the cause (c) OCC ranking of Failure Mode (d) RPN of failure mode.
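The re-examination rules of sections 4.5.1 to 4.5.3 can be collected into a single lookup, which may be convenient as a review checklist. The content of the lists is transcribed from the text; the dictionary keys and the function name are hypothetical labels chosen for this sketch.

```python
# Rankings to re-examine after improving each type of control
# (sections 4.5.1, 4.5.2 and 4.5.3). Keys are hypothetical labels.
REEXAMINE = {
    "failure_mode_detection": [
        "DET ranking of failure mode",
        "RPN of failure mode",
    ],
    "cause_detection": [
        "DET ranking of cause",
        "RPN of cause",
        "OCC ranking of failure mode",
        "RPN of failure mode",
    ],
    "cause_prevention": [
        "OCC ranking of cause",
        "RPN of cause",
        "OCC ranking of failure mode",
        "RPN of failure mode",
    ],
}


def rankings_to_reexamine(control_type):
    """Return the rankings to revisit for the given type of improved control."""
    try:
        return REEXAMINE[control_type]
    except KeyError:
        raise ValueError(f"unknown control type: {control_type!r}") from None


for item in rankings_to_reexamine("cause_prevention"):
    print(item)
```

The lookup makes one difference visible: improving detection of the failure mode leaves its occurrence ranking untouched, whereas both kinds of cause control reduce it.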
4.5.4 Controls on Incoming Sources of Variation (ISV)
Where input-material-related causes are considered based on historical data, controls could be to check the relevant input material parameter before the process, or to modify the fixture so that an input that is not OK does not fit. Where ISVs are identified at the development stage, as shown in figure 4 for the adhesive property of the incoming material, the actions are ones that need to be carried out during development. They could include experimenting with alternate materials or considering the ISV while designing the fixture. The occurrence of an ISV can be reduced only at the stage where it is generated; only detection control is possible at subsequent stages.
4.5.5 Control Methods for Dominant Source of Variation
First-priority controls can be identified based on the dominant source of variation. For example, an assembly operation may be an operator- and incoming-component-dominant process, plastic molding may be machine- and setup-dominant, whereas painting may be an operator- and environment-dominant process. The following table shows some examples of variation classes and control methods.
Table 2. Example of Dominant variation class and examples of controls
4.5.6 Detection Ranking Criteria
While assigning the detection ranking, it is assumed that the failure mode or cause has occurred, and the ability to detect it is estimated. A simplified detection ranking is shown in Table-3.
Table 3. Detection Ranking (Simplified)
Ranking 5: Part assembled in the wrong location is alerted through light or buzzer. The defect may go to the next stage if actions are not taken by the operator.
Ranking 4: Part assembled in the wrong location is sensed by a sensor fitted in a subsequent operation, which does not start its cycle, preventing further processing.
Ranking 3: Part assembled in the wrong location is detected in the same station; the part is automatically marked in red and prevented from proceeding to the next station.
Ranking 2: When a part is placed in the wrong location during assembly, the machine stops working to ensure that the wrong assembly does not happen. It stops the generation of failure mode.
Ranking 1: Part cannot be placed in the wrong location during assembly due to fixture design.
4.6 Prioritization of risk
Multiple methods of risk prioritization should be used, as any single method has certain limitations. Risk depends on Severity (SEV), Occurrence (OCC) and Detection (DET).
For the failure modes with the highest risk, the next step is to consider the causes with higher OD numbers and apply detection and prevention controls to them.
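Ordering the causes of a high-risk failure mode can be sketched as below. The sketch assumes that the "OD number" is the product of the cause's OCC and DET rankings, which the text does not spell out. The `Cause` class and the OCC and DET values are hypothetical, apart from the DET of 10 for wrong roller selection, which follows the earlier example.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    description: str
    occ: int  # occurrence ranking of the cause
    det: int  # detection ranking of the cause

    @property
    def od(self):
        """Assumed definition: OD number = OCC x DET."""
        return self.occ * self.det


def causes_by_od(causes):
    """Return causes sorted from the highest OD number to the lowest."""
    return sorted(causes, key=lambda cause: cause.od, reverse=True)


# Causes of "less peel-off strength"; the rankings are sample values.
causes = [
    Cause("delay in pasting after sticker removal", occ=3, det=6),
    Cause("operator selects wrong roller", occ=4, det=10),
    Cause("operator forgets a pasting step", occ=2, det=5),
]

ranked = causes_by_od(causes)
for cause in ranked:
    print(cause.od, cause.description)
```

Causes at the top of such a list are the first candidates for new detection and prevention controls.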
5. Approaches to arrive at recommended actions
5.1 Team’s knowledge: The team should use its technical knowledge to suggest prevention and detection controls. Brainstorming in the team or consulting references on the internet can also help.
5.2 Use principles of error proofing: Refer to Table-4 and apply its checkpoints to identify solutions for the identified errors. For example, 9 solutions were identified for incorrect roller selection, which was a cause of less adhesive strength.
Table 4. Example of usage of error prevention principles for the generation of solutions.
5.3 Cause Analysis Table (CAT)
Each cause is verified using a CAT, which presents a logical sequence for reaching an appropriate solution. The first and foremost requirement is to have a standard for the cause under investigation. If there is no standard, the action plan is to prepare one. For example, in press forming variation, if the cause is improper oil application and there is no standard for it, the action plan would be to prepare an SOP defining the method of applying oil.
If there is a standard, check whether there is any basis for what it specifies. If there is no basis, the action plan is to establish one using Design of Experiments (DOE), verification through technical calculations, simulations, or any other evaluation method.
If there is a standard supported by a basis, it is important to check whether the standard is being followed. For example, once the oil application method has been validated and defined in the SOP, check whether the SOP is being followed. Wherever there is a gap between standard and actual, conduct a why-why analysis to drill down to next-level causes. Repeat the verification of these causes using the CAT.
Table 5. Example of Cause Analysis Table
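The sequence of questions in the Cause Analysis Table can be read as a small decision procedure. The sketch below is one possible rendering of it; the function name and the wording of the returned actions are assumptions, and the final branch (no gap found) is not described in the text and is added only so that the function always returns a result.

```python
def cat_action(has_standard, has_basis, is_followed):
    """Return the action plan suggested by the three CAT questions,
    asked in the order: standard exists, basis exists, standard followed."""
    if not has_standard:
        return "Prepare a standard (for example, an SOP)"
    if not has_basis:
        return ("Establish the basis using DOE, technical calculations, "
                "simulations, or another evaluation method")
    if not is_followed:
        return ("Conduct why-why analysis on the gap and verify the "
                "next-level causes using the CAT again")
    # Assumed outcome, not stated in the text.
    return "No gap found for this cause"


# Oil application example: no SOP exists yet for applying oil.
print(cat_action(has_standard=False, has_basis=False, is_followed=False))

# A validated SOP exists, but it is not being followed on the shop floor.
print(cat_action(has_standard=True, has_basis=True, is_followed=False))
```

The order of the checks matters: whether a standard is followed is asked only after the standard exists and has a verified basis.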
6. Update FMEA during Corrective Action and Continuous Improvements
Apart from updating failure modes, effects, causes, controls, and rankings during any corrective or improvement action, it is also important to identify the side effects of changes.
Example: Customer received gears (parts) in a mix-up condition.
Add effect: The team adds the effect “Mix-up of parts leading to segregation at the customer.” The team verifies that the mix-up happened during the packing operation.
Add failure mode: The team adds “Gear mix-up” as a failure mode in packing.
Add valid cause: Investigation found that the gears were mixed up due to wrong selection, as they look similar. Add incorrect selection as the cause.
Add new or changed controls: The team identified number punching on the gears as a new control to tell them apart. The SOP for the packing operation is updated with the numbers. The respective check sheets, such as the PM check sheet, set-up sheet, start-up sheet, Poka Yoke sheet, tool history card, etc., are also to be updated.
Add potential causes: While solving this problem, the team also identified possible causes of mix-up at various stations. These are added at the respective stations to avoid repetition of the problem due to some other possible cause.
Identification of new failure modes: The process changes during corrective action. The added punching operation for identification may have failure modes of its own, such as punch missing, punch not visible, wrong punching, punching at the wrong location, scratches during punching, etc. These are further analyzed and controls are identified to prevent them.
Change in rankings: The rankings for occurrence of the failure mode, occurrence of the cause, detection of the cause, and the RPN are revised depending on the type of actions taken.
7. Comparison of Conventional FMEA and Effective Approach
Table 6. Conventional FMEA vs Effective Method
It is rightly said by William A. Foster: “Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction, and skillful execution; it represents the wise choice of many alternatives”. This paper aimed to draw attention to the importance of a proactive approach. FMEA is not a history-capturing exercise but a powerful technique to anticipate problems, analyze potential causes, and institutionalize controls for getting a defect-free product.
While FMEA is a wonderful technique to support the objective of “doing right first time”, the trouble with “doing right first time” is that nobody appreciates how difficult it was. People who do firefighting and solve problems are generally rewarded. If people say they anticipated all the possible reasons and ensured that there was no fire in their area, there is not much appreciation. It is therefore essential to ensure that the right mindset and attitude for doing FMEA are encouraged in organizations.
I sincerely thank Prof. Hitoshi Kume and Prof. Takeshi Nakajo for sharing and teaching error prevention methods. This has been of immense help in implementing FMEA fruitfully.
AIAG Manual (2008), Potential Failure Mode & Effects Analysis, 4th Edition.
Prof. Nakajo (2009), Training handbook on Error Prevention.
About the Authors
Mahesh Hegde is a TQM Counselor.
M K Somanathan, Head – Quality Management System, Ashok Leyland.