Risk Management

FMEA for Safety: 7 Failure Modes Leaders Miss

FMEA for safety protects people only when leaders convert risk rankings into verified controls, escalation rules, and field evidence that changes work.

7 min read

Key takeaways

  1. Diagnose failure modes in operational language, because broad hazard labels hide the exact condition that lets a serious event develop.
  2. Separate severity from probability, keeping credible SIF potential visible even when the site has a clean recent injury history.
  3. Audit detection controls before trusting the score, since monthly inspections rarely catch a ten-minute exposure at the right moment.
  4. Replace repeated retraining actions with redesigned work, stronger barriers, named owners, deadlines, and field evidence that the exposure changed.
  5. Subscribe to Headline Podcast to bring leadership-grade risk conversations into the meetings where FMEA findings become decisions.

FMEA was designed to expose how a system can fail before people pay the price, yet many safety teams reduce it to a spreadsheet whose highest number receives attention first. This article shows seven failure modes senior EHS leaders should catch before FMEA becomes another document that looks analytical while the worksite remains exposed.

Why FMEA in safety fails when it only ranks risk

Failure Mode and Effects Analysis is useful in occupational safety because it forces a team to ask how a task, barrier, asset, or decision can fail, what the effect would be, and which control should stop the sequence. The AIAG-VDA FMEA Handbook, 2019 edition, uses severity, occurrence, and detection as core logic, while ISO 45001:2018 expects organizations to address hazards, risks, opportunities, operational control, and change before harm occurs.

The weak point is not the method. The weak point is the leadership habit of treating a high Risk Priority Number as the final answer. On the Headline Podcast, Andreza Araujo and Dr. Megan Tranter often bring safety leadership back to real conversations, because the score does not rescue a worker, isolate energy, or stop a contractor from improvising.

As co-host Andreza Araujo argues in Safety Culture: From Theory to Practice, culture becomes visible in repeated operational decisions. FMEA therefore has to leave the meeting room and enter permit-to-work design, maintenance planning, supervision routines, and executive escalation.

1. The team scores the hazard, not the failure mode

FMEA becomes weak when the worksheet names a hazard such as "fall from height" instead of naming the failure mode that lets the fall happen. A useful entry would describe the failed condition, such as anchor point unavailable, rescue kit incomplete, harness inspection skipped, or permit approved without verifying edge protection.

This distinction matters because a hazard label is too broad to change the work. When teams confuse hazards with failure modes, they often produce the same generic controls that already appear in every procedure, which is why the method looks complete while nothing in the field changes.

In a maintenance shutdown, the EHS manager should ask the team to write failure modes in operational language. The question is not "what is the hazard?" but "what must fail for the unwanted event to become possible?" That shift connects FMEA with risk matrix blind spots, because both methods fail when categories hide the real mechanism.
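The difference between a hazard label and a failure mode can be made checkable in the worksheet itself. The sketch below is only an illustration: the `WorksheetEntry` structure and the `needs_rewrite` helper are hypothetical names, not part of any FMEA standard, and the example rows reuse the fall-from-height wording from this section.

```python
from dataclasses import dataclass, field


@dataclass
class WorksheetEntry:
    """One worksheet row (hypothetical structure, for illustration only)."""
    hazard: str  # broad label, e.g. "fall from height"
    failure_modes: list[str] = field(default_factory=list)  # specific failed conditions


def needs_rewrite(entry: WorksheetEntry) -> bool:
    """True when the row names only a hazard, or restates it as the failure mode."""
    label = entry.hazard.strip().lower()
    modes = [m.strip().lower() for m in entry.failure_modes if m.strip()]
    return not modes or all(m == label for m in modes)


weak = WorksheetEntry(hazard="fall from height")
strong = WorksheetEntry(
    hazard="fall from height",
    failure_modes=[
        "anchor point unavailable",
        "rescue kit incomplete",
        "harness inspection skipped",
        "permit approved without verifying edge protection",
    ],
)
```

A check like this cannot judge whether a failure mode is well written. It only stops rows that never moved past the hazard label from reaching the scoring step.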

2. Severity gets averaged down by probability thinking

Severity in FMEA should describe the credible consequence if the failure mode reaches the worker, asset, or community. The AIAG-VDA FMEA Handbook uses a 1 to 10 severity scale, and safety teams weaken the method when they let low probability soften a consequence that could still be fatal.

The leadership trap is familiar. A supervisor says the event has never happened here, so the team lowers the concern and moves on. Across 250+ cultural transformation projects, Andreza Araujo observes that the absence of recent harm is often mistaken for control strength, although the real question is whether the barrier would work tomorrow under pressure.

For serious injury and fatality potential, severity should remain anchored in the credible worst consequence. Occurrence and detection can show frequency and discoverability, but they should not make a fatal exposure look tolerable because the plant has been fortunate.

Case

50% accident reduction in 6 months

During Andreza Araujo's PepsiCo South America tenure, the accident ratio fell 50% in six months after leadership treated safety as operational discipline rather than campaign language. The FMEA lesson is direct: a score matters only when leaders convert it into changed decisions.

3. Detection is scored as if audits happen at the right moment

Detection in FMEA measures how likely the organization is to discover the failure mode before harm occurs, yet many teams score detection as if a monthly audit can catch a risk that appears during a ten-minute job step. A control that is visible only after the task is complete does not deserve a strong detection score.

This is where FMEA becomes a leadership test. If the only detection mechanism is an inspection after the work, the team is documenting regret rather than prevention. Detection should include pre-task verification, supervision presence, interlock status, alarm response, stop-work triggers, and evidence that the person closest to the task can interrupt the sequence.

For example, a lifting operation may have a checklist, a rigger certificate, and a permit, but the key detection question is whether someone will notice the wrong sling angle before the load is lifted. That is why FMEA should connect with Bow-Tie barrier questions, since both methods force leaders to test whether controls are preventive, mitigative, or only administrative.
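One way to test a detection score is to label each control by when it can reveal the failure relative to the exposure. The following is a minimal sketch under stated assumptions: the timing labels (`pre_task`, `during_task`, `post_task`, `periodic`) and the `DetectionControl` structure are invented for illustration, and the lifting controls are example entries, not a recommended set.

```python
from dataclasses import dataclass

# Assumed labels: only these timings can interrupt the sequence before harm.
BEFORE_EXPOSURE = {"pre_task", "during_task"}


@dataclass
class DetectionControl:
    name: str
    timing: str  # "pre_task", "during_task", "post_task", or "periodic"


def preventive_detection(controls):
    """Keep only the controls that can catch the failure before harm occurs."""
    return [c for c in controls if c.timing in BEFORE_EXPOSURE]


lifting_controls = [
    DetectionControl("monthly rigging audit", "periodic"),
    DetectionControl("post-job inspection", "post_task"),
    DetectionControl("sling angle check before the lift", "pre_task"),
    DetectionControl("supervisor present with stop-work authority", "during_task"),
]

usable = [c.name for c in preventive_detection(lifting_controls)]
```

If the filtered list comes back empty, the detection rating on that row probably deserves a weaker score than the team first assigned.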

4. The RPN becomes a hiding place for weak controls

The traditional Risk Priority Number multiplies severity, occurrence, and detection, which means different risk profiles can produce the same number while demanding very different decisions. A high-severity, low-occurrence fatal risk should not be treated like a frequent first-aid exposure just because the arithmetic looks comparable.

Three scoring dimensions (severity, occurrence, and detection) shape the classic FMEA logic, according to the AIAG-VDA FMEA Handbook, 2019 edition. The problem is that multiplication can create false precision, especially when teams argue about whether occurrence is a 3 or a 4 instead of testing the control that should prevent death.
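A short worked example makes the arithmetic problem concrete. The sketch assumes all three ratings use a 1 to 10 scale, and the numbers are illustrative only, not taken from any real worksheet.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Classic Risk Priority Number: the product of three 1-10 ratings."""
    for rating in (severity, occurrence, detection):
        if not 1 <= rating <= 10:
            raise ValueError("ratings must be between 1 and 10")
    return severity * occurrence * detection


# A rare but potentially fatal failure mode.
fatal_rare = rpn(severity=10, occurrence=2, detection=3)

# A frequent exposure with first-aid consequences.
first_aid_frequent = rpn(severity=2, occurrence=10, detection=3)
```

Both profiles produce an RPN of 60, so a list sorted by the product alone would rank them as equals even though only one of them can kill.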

Leaders should require a second view for high-severity risks. Any failure mode with credible SIF potential should receive a barrier review, even when the RPN is not at the top of the list. This keeps FMEA aligned with SIF leading indicators, where precursor control health matters more than the comfort of a low injury rate.
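The second view can be written as an explicit selection rule. This is a sketch, not a prescribed method: the severity threshold of 9, the RPN cut-off of 100, and the example failure modes are all assumptions that each site would need to set for itself.

```python
from dataclasses import dataclass

SIF_SEVERITY_THRESHOLD = 9  # assumed cut-off for credible SIF potential


@dataclass
class FailureMode:
    description: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def barrier_review_list(modes, rpn_cutoff=100):
    """Select modes for barrier review: high RPN, or credible SIF severity."""
    return [
        m for m in modes
        if m.rpn >= rpn_cutoff or m.severity >= SIF_SEVERITY_THRESHOLD
    ]


modes = [
    FailureMode("isolation not verified before line break", 10, 2, 3),  # RPN 60
    FailureMode("minor hand cut from unfiled burr", 2, 8, 7),           # RPN 112
    FailureMode("label faded on storage rack", 2, 3, 3),                # RPN 18
]
selected = [m.description for m in barrier_review_list(modes)]
```

The isolation failure is selected despite its RPN of 60, because the severity rule works independently of the product.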

5. Corrective actions repeat training instead of redesigning work

FMEA loses force when every high-priority action becomes retraining, communication, or a revised procedure. Those responses may be necessary, but they rarely change exposure when the failure mode comes from design, access, workload, interface risk, or missing engineering control.

What most safety programs understate is that training often compensates for a system the organization has chosen not to fix. In The Illusion of Compliance, Andreza Araujo points to the gap between documented obedience and operational reality, which is exactly the gap that FMEA should expose.

A stronger action line names the owner, engineering decision, verification evidence, and due date. If the failure mode is "operator reaches into moving equipment to clear jam," the action should examine guarding, lockout design, jam frequency, production pressure, and supervision cues before another training session is scheduled.
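The elements of a stronger action line can be checked before an action is accepted as closed. In this sketch the `Action` fields, the set of administrative action types, and the example owner, date, and evidence text are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed labels for the administrative responses this section warns about.
ADMINISTRATIVE = {"training", "communication", "procedure"}


@dataclass
class Action:
    failure_mode: str
    action_type: str  # e.g. "engineering", "training"
    owner: Optional[str] = None
    due: Optional[date] = None
    verification_evidence: Optional[str] = None


def gaps(action: Action) -> list[str]:
    """List what is missing before the action line can be accepted."""
    found = []
    if not action.owner:
        found.append("no named owner")
    if action.due is None:
        found.append("no due date")
    if not action.verification_evidence:
        found.append("no verification evidence")
    if action.action_type in ADMINISTRATIVE:
        found.append("administrative only: review design options first")
    return found


jam = "operator reaches into moving equipment to clear jam"
retrain = Action(jam, "training")
redesign = Action(
    jam,
    "engineering",
    owner="maintenance engineering lead",  # placeholder role
    due=date(2030, 1, 31),                 # placeholder date
    verification_evidence="field check that the guard prevents reach-in",
)
```

Here `gaps(retrain)` returns four findings and `gaps(redesign)` returns none, which is the contrast a review meeting should see before anyone schedules another training session.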

Each month that FMEA actions remain stuck in retraining language leaves the organization with a growing file of known failure modes and little proof that exposure has actually changed.

6. Interface failures are excluded because no department owns them

Many severe events begin at interfaces, where maintenance, operations, engineering, procurement, contractors, and production planning each control part of the risk. FMEA misses these failures when the team analyzes a task inside one department boundary.

The Headline Podcast view is helpful here because it treats safety as a leadership conversation, not only a technical exercise. A failure mode such as "contractor starts work with outdated isolation information" belongs to procurement, EHS, operations, and the contract owner at the same time, which means a single-department FMEA will probably dilute accountability.

Senior leaders should require at least one interface review for high-risk tasks. The team should ask who hands information to whom, what can be misunderstood, which document becomes authoritative in conflict, and who has power to stop the job. This links directly to contractor interface risk, where ownership gaps create exposure that procedures alone do not close.

7. The FMEA is not connected to management of change

FMEA should be revisited when equipment, staffing, materials, contractors, layout, schedule, or production assumptions change. ISO 45001:2018 clause 8.1.3 requires organizations to control planned changes and review the consequences of unintended changes, which makes a static FMEA a poor fit for dynamic operations.

ISO 45001 was published in 2018, and its management-of-change requirement matters because risk changes faster than many safety documents. A worksheet completed during project launch can become misleading after a new supplier, overtime pattern, or temporary bypass changes the exposure.

The practical rule is simple enough for executives to enforce. Any management-of-change review that affects a critical task should ask whether an existing FMEA must be updated, whether new failure modes appeared, and whether old controls still match the work as performed.
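The rule can be sketched as a lookup from a change to the FMEAs it may invalidate. The `FmeaRecord` structure, the factor names, and the example tasks are assumptions made for illustration. A real register would carry more detail, and the review itself still has to ask whether new failure modes appeared.

```python
from dataclasses import dataclass


@dataclass
class FmeaRecord:
    task: str
    critical: bool
    # Assumptions the worksheet was built on, e.g. {"contractors", "staffing"}.
    depends_on: frozenset


def fmeas_to_review(changed_factors, records):
    """Return critical-task FMEAs whose assumptions overlap the change."""
    changed = set(changed_factors)
    return [r.task for r in records if r.critical and r.depends_on & changed]


records = [
    FmeaRecord("confined space entry", True, frozenset({"contractors", "staffing"})),
    FmeaRecord("tank cleaning", True, frozenset({"materials"})),
    FmeaRecord("office relocation", False, frozenset({"layout"})),
]

# A change request that alters the overtime pattern and the floor layout.
due_for_review = fmeas_to_review({"staffing", "layout"}, records)
```

Only the confined space entry FMEA is returned: tank cleaning does not depend on either changed factor, and office relocation is not a critical task.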

Comparison: weak FMEA vs leadership-grade FMEA

| Dimension | Weak FMEA | Leadership-grade FMEA |
| --- | --- | --- |
| Object of analysis | Names broad hazards and repeats familiar controls. | Names specific failure modes that can be verified in the field. |
| Severity logic | Lets good history soften fatal potential. | Keeps credible SIF consequence visible until controls are proven. |
| Detection | Assumes audits and paperwork will find weak signals. | Tests whether the failure can be caught before exposure reaches the worker. |
| Action quality | Closes with retraining, reminders, and procedure updates. | Requires redesign, barrier strength, owner, deadline, and verification evidence. |
| Governance | Lives in EHS files and receives attention during audits. | Connects to capital decisions, management of change, and executive escalation. |

Conclusion: FMEA should change decisions, not decorate files

FMEA protects people when leaders use it to expose specific failure modes, preserve credible severity, test detection, strengthen barriers, and connect findings to management of change. If the method only produces a ranked list, the organization has analysis without control.

For Headline Podcast, this is the kind of real safety conversation that belongs at the leadership table, because the space where leadership and safety come together should shape better workplaces and better lives. Subscribe at Headline Podcast and bring this article to the next meeting where risk scores are treated as decisions.


Frequently asked questions

What is FMEA in occupational safety?
FMEA in occupational safety is a structured method for identifying how a task, asset, barrier, or process can fail before harm occurs. It asks the team to define the failure mode, effect, severity, occurrence, detection, and action. The value is not the score alone. The value comes when leaders use the analysis to strengthen controls, test detection, and change how work is planned, supervised, and verified.
Why does FMEA fail in safety programs?
FMEA fails when teams score broad hazards, average down fatal potential, overtrust audits, and close actions with training instead of redesign. A worksheet can look disciplined while the workplace remains exposed. On the Headline Podcast, Andreza Araujo and Dr. Megan Tranter often present safety as a leadership conversation, which is why FMEA must connect to decisions, resources, and field evidence.
Should safety leaders use RPN for fatal risks?
RPN can help prioritize, but it should not be the only trigger for action when a failure mode has serious injury or fatality potential. A high-severity event can receive a moderate RPN because occurrence seems low or detection is overrated. Leaders should add a SIF review rule so credible fatal risks receive barrier analysis even when the arithmetic does not place them at the top.
How often should a safety FMEA be reviewed?
A safety FMEA should be reviewed whenever a meaningful change affects the task, equipment, contractor mix, staffing level, material, layout, production pressure, or control strategy. ISO 45001:2018 requires control of planned changes and review of the consequences of unintended changes. A static FMEA becomes risky when operations change but the failure modes, detection assumptions, and action plans remain frozen.
What is the difference between FMEA and Bow-Tie analysis?
FMEA starts from a failure mode and asks what effect it could create, how severe it is, how often it may occur, and how easily it can be detected. Bow-Tie analysis starts from a top event and maps preventive and mitigative barriers around it. In safety, they work well together because FMEA exposes weak failure modes while Bow-Tie tests whether barriers are strong enough.

About the author

Host & Editorial Lead

Andreza Araujo is an international reference in EHS, safety culture and safe behavior, with 25+ years leading cultural transformation programs in multinational companies and impacting employees in more than 30 countries. Recognized as a LinkedIn Top Voice, she contributes to the public conversation on leadership, safety culture and prevention for a global professional audience. Civil engineer and occupational safety engineer from Unicamp, with a master's degree in Environmental Diplomacy from the University of Geneva. Author of 16 books on safety culture, leadership and SIF prevention, and host of the Headline Podcast.

  • Civil Engineer (Unicamp)
  • Occupational Safety Engineer (Unicamp)
  • Master in Environmental Diplomacy (University of Geneva)