Patient Safety & Systems-Based Practice

Failure mode and effects analysis (FMEA)

Clinical Overview and When to Suspect FMEA Is the Right Tool

Failure Mode and Effects Analysis (FMEA) is a prospective, proactive risk-assessment method used to identify how a process could fail before harm occurs, then prioritize fixes by severity, occurrence, and detectability.

— Originated in aerospace/military engineering; adapted to healthcare by the Joint Commission and Institute for Healthcare Improvement (IHI).

— Joint Commission requires accredited hospitals to perform at least one proactive risk assessment (typically an FMEA or HFMEA) every 18 months.

When to suspect FMEA is the correct answer on Step 3:

— A new process, technology, or workflow is being introduced (e.g., bar-code medication administration, new EHR module, robotic surgery program, new chemotherapy protocol).

— A high-risk, low-volume process is being redesigned (massive transfusion protocol, code blue response, neonatal resuscitation).

— Leadership asks: "What could go wrong and how do we prevent it?" The question is forward-looking, not an investigation of an event that already happened.

Key distinction: FMEA is prospective (before harm). Root Cause Analysis (RCA) is retrospective (after a sentinel event). If the stem describes a wrong-site surgery that already occurred → RCA. If the stem describes planning a new surgical checklist rollout → FMEA.

HFMEA (Healthcare FMEA) is the VA National Center for Patient Safety adaptation; it uses a Hazard Score = Severity × Probability and a decision tree (detectability, criticality, existing control measures).

An FMEA is conducted by a multidisciplinary team (5–10 members): frontline staff, physicians, nurses, pharmacists, IT, a quality officer, and ideally a patient/family advisor.

Board pearl: On Step 3, the trigger phrase "before implementation" or "anticipate failure points" almost always points to FMEA, whereas "after a near-miss" or "following a sentinel event" points to RCA. Recognizing this single temporal cue resolves most patient-safety vignettes on this topic.

Presentation Patterns and Key History (Vignette Triggers)

Step 3 vignettes on FMEA rarely involve a patient at bedside — instead they present a systems scenario and ask the trainee, as a leader or committee member, to choose the correct quality tool.

Classic stem setups that should trigger "FMEA":

— "A hospital is planning to implement a new insulin infusion protocol in the ICU. Which quality improvement tool is most appropriate to identify potential errors before rollout?"

— "The pharmacy is transitioning to a new automated dispensing cabinet. Before go-live, the safety committee wants to map every step and rank the risks."

— "A surgery department is launching a same-day discharge pathway for laparoscopic cholecystectomy and wants to anticipate handoff failures."

— "Obstetrics is creating a postpartum hemorrhage massive transfusion protocol and wants to identify high-risk steps proactively."

History elements that signal FMEA over alternatives:

— Words: prospective, proactive, anticipate, prevent, before implementation, redesign, new process, high-risk process.

— A multidisciplinary team has been convened but no event has occurred.

— The team is mapping a process flow and assigning numeric risk scores.

Contrast vignette cues:

— "After a patient received 10× the intended heparin dose…" → RCA.

— "To monitor whether central line infection rates are dropping after intervention…" → run chart / control chart / PDSA.

— "To identify which causes account for most of the medication errors…" → Pareto chart (80/20).

— "To brainstorm all possible contributing categories of a problem…" → fishbone (Ishikawa) diagram.

Step 3 management: When the vignette explicitly states the goal is to rank failure modes by Risk Priority Number (RPN) or to use Severity × Occurrence × Detection, the answer is unambiguously FMEA. No other QI tool generates an RPN. Memorize this triad — it is the single most testable computational element of the topic.

Structural Components and "Team Anatomy" of an FMEA

Although FMEA has no "physical exam," Step 3 expects you to recognize its structural elements — the analog of an exam.

Five canonical steps of an FMEA (memorize in order):

1. Select a high-risk process and define scope.

2. Assemble a multidisciplinary team (frontline workers are essential).

3. Graphically map the process — flowchart every step and sub-step.

4. Identify failure modes and their effects for each step; assign Severity (S), Occurrence (O), Detection (D) scores, typically 1–10.

5. Calculate Risk Priority Number (RPN) = S × O × D; redesign the process, mitigate highest-RPN failure modes, and re-measure.

Scoring anchors (typical 1–10 scales):

— Severity: 1 = no noticeable effect; 10 = catastrophic/death.

— Occurrence: 1 = unlikely (<1 in 1,000,000); 10 = inevitable/frequent.

— Detection: 1 = almost certain to be caught before reaching patient; 10 = no detection mechanism exists.

RPN range: 1 to 1,000. Higher RPN = higher priority. There is no universal "action threshold", but many institutions act on RPN >100 or on any failure mode with Severity ≥ 9 regardless of RPN.

HFMEA Hazard Score = Severity × Probability (4×4 matrix, max 16); simpler than RPN and used by the VA.

Team "hemodynamics" — composition matters:

— Frontline staff (nurses, techs) catch real workflow failures executives miss.

— A process owner (someone with authority to enact change) must be present.

— A facilitator trained in QI methodology keeps scoring consistent.

Key distinction: Detection scoring is counterintuitively inverted: a higher detection score is worse (the failure is harder to detect). Many trainees miss this. A failure mode that is severe, frequent, AND invisible earns the maximum RPN and demands immediate redesign — this is the prototypical board-favorite scenario.
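The RPN arithmetic in this section can be sketched in a few lines of Python. This is an illustrative sketch only: the `FailureMode` class, the `rank_by_rpn` helper, and all of the scores below are hypothetical and invented for the example; only the 1–10 scales and the formula RPN = S × O × D come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of a hypothetical FMEA worksheet (field names are illustrative)."""
    step: str
    description: str
    severity: int    # 1 = no noticeable effect, 10 = catastrophic/death
    occurrence: int  # 1 = unlikely, 10 = inevitable/frequent
    detection: int   # 1 = almost certain to be caught, 10 = no detection mechanism

    def __post_init__(self):
        # Reject scores outside the 1-10 scale used in the text.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        """Risk Priority Number = Severity x Occurrence x Detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Made-up scores for illustration; real scores come from team consensus.
modes = [
    FailureMode("administer", "nurse scans wrong patient's wristband", 8, 3, 4),
    FailureMode("dispense", "look-alike vial pulled from cabinet", 9, 2, 7),
    FailureMode("document", "dose charted late", 2, 6, 3),
]

for m in rank_by_rpn(modes):
    print(m.rpn, m.step, m.description)
```

Note that the detection score enters the product directly, so a failure that is harder to detect (higher D) raises the RPN, which matches the inverted scale described above.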

Diagnostic Workup — Process Mapping and Failure Mode Identification

The "diagnostic phase" of FMEA is process mapping plus failure-mode brainstorming — the equivalent of labs and imaging in a clinical workup.

Step 1: Detailed process flow diagram.

— Break the process into major steps (e.g., for medication administration: prescribe → transcribe → dispense → administer → monitor).

— Break each major step into sub-steps (e.g., "administer" = verify patient, verify drug, verify dose, verify route, verify time, document).

— Aim for granularity such that each sub-step has one clear action and one clear owner.

Step 2: For each sub-step, brainstorm failure modes.

— Failure mode = a specific way the step could go wrong (e.g., "nurse scans wrong patient's wristband").

— Effect = consequence of that failure (e.g., "wrong-patient medication administration → ADE → harm").

— Cause = underlying contributor (e.g., "two patients with same last name in adjacent beds, no second-identifier prompt").

Step 3: Score S, O, D for each failure mode using institutional anchors.

Step 4: Calculate RPN and rank.

Tools that complement the "workup":

— Fishbone diagram to brainstorm causes within categories (People, Process, Equipment, Environment, Materials, Management).

— 5 Whys to drill down to root contributors of each failure mode.

— Direct observation / Gemba walks to validate the mapped process matches reality — the mapped process is almost never the actual process.

Board pearl: A common Step 3 trap: the committee maps the process from policy documents only, never observing real workflow. The correct next step is direct observation of frontline practice before scoring — policy and reality diverge, and FMEA built on fiction generates worthless RPNs. Always validate the map at the Gemba (the actual place of work).
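One way to picture the mapping and brainstorming steps in this section is as a nested structure that expands into worksheet rows. The sketch below is hypothetical: `PROCESS_MAP`, `blank_worksheet`, and `unmapped_steps` are invented names. Only the "administer" sub-steps and the wristband example come from the text, so the other steps are left empty rather than guessed.

```python
# Major steps from the medication-administration example in the text.
# Only "administer" is broken into sub-steps there; the rest stay empty.
PROCESS_MAP = {
    "prescribe": [],
    "transcribe": [],
    "dispense": [],
    "administer": [
        "verify patient", "verify drug", "verify dose",
        "verify route", "verify time", "document",
    ],
    "monitor": [],
}


def blank_worksheet(process_map):
    """Return one empty worksheet row per sub-step (or per step if unmapped)."""
    rows = []
    for step, sub_steps in process_map.items():
        for sub_step in sub_steps or [None]:
            rows.append({
                "step": step,
                "sub_step": sub_step,
                "failure_mode": None,  # a specific way the step could go wrong
                "effect": None,        # consequence of that failure
                "cause": None,         # underlying contributor
            })
    return rows


def unmapped_steps(process_map):
    """List major steps that still need to be broken into sub-steps."""
    return [step for step, subs in process_map.items() if not subs]


worksheet = blank_worksheet(PROCESS_MAP)

# Fill in one row with the example given in the text.
row = next(r for r in worksheet if r["sub_step"] == "verify patient")
row["failure_mode"] = "nurse scans wrong patient's wristband"
row["effect"] = "wrong-patient medication administration -> ADE -> harm"
row["cause"] = "two patients with same last name in adjacent beds"

print(unmapped_steps(PROCESS_MAP))
```

Listing the unmapped steps makes gaps in granularity visible before scoring begins, but it does not replace direct observation of the real workflow.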

Advanced / Confirmatory Analysis — RPN Calculation and Action Prioritization

Once failure modes are scored, RPN drives action, but raw RPN alone can mislead — confirmatory analysis refines priorities.

RPN interpretation pitfalls:

— A failure mode with S=10, O=1, D=1 (RPN=10) may still warrant action because severity is catastrophic.

— A failure mode with S=2, O=10, D=10 (RPN=200) is high-frequency but low-harm — may not deserve top priority.

— Always review Severity ≥ 9 separately, regardless of RPN.

HFMEA Decision Tree (confirmatory step) for each high hazard score:

1. Is this a single-point weakness critical to the process? If yes → keep the hazard in the analysis even if its Hazard Score is low.

2. Is there an existing effective control measure? If yes → may stop.

3. Is the hazard so obvious that no control is needed? If yes → stop.

— If no exit criteria are met → proceed to redesign.

Action hierarchy (strongest → weakest) — high-yield:

1. Forcing functions / constraints (e.g., oral syringes that physically cannot connect to IV ports — eliminates wrong-route errors).

2. Automation and computerization (CPOE with hard stops).

3. Standardization and protocols (order sets, checklists).

4. Reminders, checklists, double checks (weaker; rely on humans).

5. Rules and policies.

6. Education and training (weakest standalone intervention).

Re-measurement: After mitigation, rescore S, O, D; the new RPN must be documented. Without re-measurement, FMEA is incomplete.

Step 3 management: When the stem asks "Which intervention is most likely to reduce the failure rate?", choose the option highest on the action hierarchy — a forcing function beats education every time. Eliminating the possibility of error trumps reminding humans not to err.
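The pitfalls and the re-measurement step in this section reduce to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the defaults (act on RPN above 100, or on any Severity of 9 or more) reflect the common institutional practice mentioned earlier in these notes, not a universal standard.

```python
def rpn(s, o, d):
    """Risk Priority Number = Severity x Occurrence x Detection."""
    return s * o * d


def needs_action(s, o, d, rpn_threshold=100, severity_floor=9):
    """Flag a failure mode if its RPN is high OR its severity is near-catastrophic.

    The severity check guards against "RPN tyranny", where a rare but
    lethal failure mode is ignored because its product is small.
    """
    return s >= severity_floor or rpn(s, o, d) > rpn_threshold


def percent_reduction(pre_rpn, post_rpn):
    """Percent drop in RPN after mitigation and rescoring."""
    if pre_rpn <= 0:
        raise ValueError("pre_rpn must be positive")
    return 100.0 * (pre_rpn - post_rpn) / pre_rpn


# The two examples from the text:
print(rpn(10, 1, 1), needs_action(10, 1, 1))    # low RPN, catastrophic severity
print(rpn(2, 10, 10), needs_action(2, 10, 10))  # high RPN, low harm

# Hypothetical rescoring after a fix lowers occurrence and detection scores.
print(percent_reduction(rpn(2, 10, 10), rpn(2, 5, 5)))
```

Both example failure modes are flagged, but for different reasons, which is why a team would still discuss which one to address first rather than sorting on RPN alone.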

Risk Stratification — Choosing Which Processes Deserve an FMEA

Not every process needs an FMEA — they are resource-intensive (often 20–40 person-hours). Stratification identifies highest-yield targets.

Criteria for selecting a process for FMEA:

— High severity potential — could cause death or permanent harm (e.g., chemotherapy administration, blood transfusion, anticoagulation dosing, opioid PCA, neonatal resuscitation, sedation outside the OR).

— High complexity — many handoffs, multiple disciplines (e.g., OR-to-ICU transfer, ED-to-floor admission).

— New process — being introduced without local experience.

— Recent near-misses in similar processes elsewhere (sentinel event alerts from Joint Commission).

— Low volume + high risk — rare procedures where staff lack repetition (massive transfusion, malignant hyperthermia response).

Joint Commission "high-risk processes" frequently FMEA'd:

— Medication management (especially high-alert drugs: insulin, heparin, opioids, chemo, KCl).

— Patient identification and specimen labeling.

— Surgical site marking and time-outs.

— Communication during handoffs.

— Blood product administration.

— Restraint use.

— Suicide risk in inpatients.

Sentinel Event Alert topics from Joint Commission often mandate local FMEA review of related processes (e.g., after a national alert on wrong-site surgery, hospitals are expected to FMEA their site-marking process).

Board pearl: If the vignette describes a high-alert medication (insulin, heparin, opioids, chemotherapy, concentrated electrolytes) being introduced in a new workflow, FMEA is essentially mandatory before go-live. ISMP (Institute for Safe Medication Practices) explicitly recommends prospective risk analysis for all high-alert drug processes — a frequent distractor-resolving fact.

"First-Line Therapy" — Executing the FMEA Step by Step

Treat the FMEA itself as the intervention. The "regimen" is the structured execution.

Pre-work (week 0):

— Charter the project: define scope, aim, boundaries (start and end points of the process).

— Secure executive sponsorship and protected time for team members.

— Identify the process owner with authority to implement changes.

Meeting 1 — Process mapping:

— Whiteboard the current state with frontline input.

— Validate with direct observation before next meeting.

Meeting 2 — Failure mode identification:

— For each sub-step, ask: "What could go wrong here?"

— Capture failure mode, effect, and cause in a structured worksheet.

Meeting 3 — Scoring:

— Score S, O, D using institutional anchor tables (consistency matters more than absolute values).

— Use consensus scoring, not averaging — discussion surfaces hidden knowledge.

Meeting 4 — Prioritization and redesign:

— Rank by RPN; flag all S ≥ 9.

— For top failure modes, design interventions using the action hierarchy (forcing functions first).

Meeting 5 — Implementation plan:

— Assign owners, deadlines, metrics.

— Define balancing measures (unintended consequences of the fix).

Months 3–6 — Re-measurement:

— Rescore S, O, D after implementation.

— Document post-RPN and percent reduction.

CCS pearl: Although CCS cases are clinical, the systems-based equivalent on Step 3 is recognizing that FMEA is a months-long cycle, not a single meeting. Stems describing a single 90-minute brainstorming session that "completed an FMEA" are flagged as inadequate methodology — the correct critique is lack of re-measurement and follow-through.

Tool Selection — FMEA vs Other QI Methods (Expanded Comparison)

Step 3 frequently tests tool selection rather than FMEA mechanics alone. Master this side-by-side comparison.

FMEA (HFMEA):

— Purpose: prospective risk assessment of a process.

— Output: ranked failure modes by RPN; redesign plan.

— Trigger words: before, anticipate, prevent, new process, proactive.

Root Cause Analysis (RCA / RCA²):

— Purpose: retrospective investigation of a sentinel event or serious harm.

— Output: causal chain, systems contributors, corrective actions.

— Trigger words: after the event, why did this happen, sentinel event.

— Required by Joint Commission within 45 business days of a sentinel event (or of the organization becoming aware of it).

PDSA (Plan-Do-Study-Act) cycle:

— Purpose: iterative testing of a small change.

— Output: data on whether the change improved a metric.

— Trigger words: small test of change, pilot, iterative.

Six Sigma (DMAIC):

— Purpose: reduce variation in an existing process.

— Output: process capability improved toward the Six Sigma target of 3.4 defects per million opportunities.

— Trigger words: variation, defect rate, statistical process control.

Lean:

— Purpose: eliminate waste (the 8 wastes: DOWNTIME mnemonic).

— Output: streamlined value stream.

Run chart / control chart: monitor a metric over time; detect special-cause variation.

Pareto chart: identify the vital few causes (80/20).

Fishbone diagram: organize potential causes by category.

Five Whys: drill to root cause within RCA or FMEA.

Key distinction: FMEA ranks risks before harm; RCA explains harm after it occurs; PDSA tests fixes iteratively; Six Sigma reduces variation; Lean removes waste. If the stem gives a rate trend over time, the answer is a control chart, not FMEA. Tool-matching is the single highest-yield testable skill in this domain.

Special Populations — Small Hospitals, Critical Access, and Resource-Limited Settings

FMEA is resource-intensive; adaptations exist for settings with limited QI infrastructure.

Critical access hospitals (≤25 beds, rural):

— Often lack dedicated QI staff; use abbreviated FMEA focused on 1–2 highest-risk processes per year.

— HFMEA (VA model) is preferred because its 4×4 Hazard Score is simpler than 10×10×10 RPN.

— Collaboration through state hospital associations or rural health networks allows shared FMEAs (e.g., regional sepsis pathway).

Ambulatory and outpatient settings:

— High-risk processes deserving FMEA: anticoagulation management, refill protocols, abnormal-result follow-up (the "results management" failure that drives malpractice claims), referral closure, immunization workflows.

— Closing the loop on abnormal results is the #1 outpatient FMEA target per AHRQ — missed cancer diagnoses frequently trace to this gap.

Long-term care / SNF:

— Common FMEA targets: falls prevention, medication reconciliation on admission/transfer, pressure injury prevention, antipsychotic use in dementia.

Behavioral health units:

— FMEA on ligature risk, contraband, elopement, and 1:1 observation handoffs is Joint Commission–emphasized after multiple Sentinel Event Alerts on inpatient suicide.

The systems-world equivalent of renal or hepatic impairment is a setting with reduced "metabolic reserve" (small staff, no pharmacist on-site, limited IT). Mitigations must rely more on forcing functions and automation because education-based fixes fail when staff turnover is high.

Step 3 management: In a small or resource-limited setting, when asked how to sustain FMEA gains, choose forcing functions and standardized order sets over staff training programs. These fixes stay in place regardless of personnel turnover, which is the dominant failure mode in low-resource environments.

Special Populations — Pediatrics, Obstetrics, and High-Reliability Domains

Certain populations have higher baseline risk and warrant routine FMEA of their core processes.

Pediatrics:

— Weight-based dosing introduces tenfold-error risk. FMEA the medication ordering → dispensing → administration chain.

— Use of kg-only weights (no lbs in EHR) is a classic forcing function that emerged from pediatric FMEAs.

— Pediatric code carts, resuscitation tape (Broselow), and dose round-down rules for chemotherapy are FMEA-derived.

Obstetrics:

— Postpartum hemorrhage and shoulder dystocia are low-volume, high-severity events ideal for FMEA + simulation.

— Magnesium sulfate infusions and oxytocin protocols are high-alert and routinely FMEA'd.

— California Maternal Quality Care Collaborative (CMQCC) bundles emerged from prospective risk analysis.

NICU: Look-alike/sound-alike medications, total parenteral nutrition compounding, and milk mix-ups are top FMEA targets.

Oncology:

— Chemotherapy is the archetypal FMEA process — two-RN verification, independent dose recalculation, and CPOE with body-surface-area hard stops are FMEA outputs.

Perioperative:

— WHO Surgical Safety Checklist and Universal Protocol (site marking, time-out, sign-out) are products of FMEA-style analysis of wrong-site surgery.

Transitions of care (universally high-risk):

— ED-to-floor, ICU-to-floor, hospital-to-SNF, and discharge-to-home handoffs are FMEA priorities because miscommunication during handoffs is implicated in an estimated 80% of serious medical errors (Joint Commission data).

Board pearl: Whenever a vignette involves pediatric chemotherapy, obstetric hemorrhage, or any handoff, FMEA is almost always the correct proactive tool. These domains have disproportionate sentinel event rates and are explicitly named in Joint Commission patient safety goals.

Complications and Adverse Outcomes — Common FMEA Failures

FMEAs themselves can fail. Recognizing pitfalls is testable.

Common FMEA failure modes (meta-FMEA):

— Scope creep: team tries to map an entire department instead of one process. Mitigation: tight charter with explicit start/stop points.

— Mapping the policy, not the practice: team uses written SOPs instead of observed workflow → fixes don't match reality. Mitigation: Gemba walk.

— Scoring inconsistency: different members interpret Severity 7 differently. Mitigation: published anchor tables and consensus scoring.

— RPN tyranny: team acts only on top RPNs and ignores high-severity/low-RPN modes. Mitigation: separate review of all S ≥ 9.

— No re-measurement: team disbands after redesign without verifying improvement. Mitigation: scheduled rescoring at 3 and 6 months.

— Lack of frontline voice: committee composed only of managers misses operational failures. Mitigation: include nurses, techs, and patients.

— Recommendations without owners or deadlines: action items languish. Mitigation: assign owner, due date, metric for each action.

— Weak interventions chosen (education-only) → no durable change. Mitigation: prefer forcing functions.

Unintended consequences (balancing measures):

— Adding alerts → alert fatigue → bypassed alerts. New failure mode.

— Adding double checks → diffusion of responsibility → both checkers assume the other verified.

— Adding steps → workarounds as staff bypass cumbersome workflow.

Cultural complication: FMEA findings that blame individuals violate Just Culture principles. FMEA outputs must address systems, not people.

Key distinction: Adding more alerts or more double checks often increases RPN over time by introducing alert fatigue and shared accountability dilution. A board-correct intervention simplifies and constrains the process rather than layering on human verification steps.

When to Escalate — Governance, Sentinel Events, and Mandatory Reporting

FMEA findings sometimes reveal hazards requiring escalation beyond the local team.

Escalation triggers:

— Identification of a near-miss already occurring in the current process → triggers a parallel RCA; do not wait for the FMEA to complete.

— Discovery of a hazard with S ≥ 9 and no current control → escalate to patient safety committee and executive leadership immediately; consider pausing the process.

— Failure mode involves regulatory compliance gap (e.g., EMTALA, HIPAA, Joint Commission National Patient Safety Goals) → escalate to compliance officer.

— Device-related failure mode → MedWatch (FDA) reporting; Safe Medical Devices Act requires hospitals to report device-related deaths to FDA and manufacturer (serious injury → manufacturer only).

Sentinel event definition (Joint Commission):

— Patient safety event resulting in death, permanent harm, or severe temporary harm requiring intervention to sustain life.

— Specific events (wrong-site surgery, infant abduction, suicide of inpatient, retained foreign object, hemolytic transfusion reaction from ABO incompatibility) are sentinel regardless of harm.

— Hospital must conduct RCA² within 45 business days.

Governance structure:

— FMEA reports to Patient Safety Committee → Medical Executive Committee → Board Quality Committee.

— Boards of trustees have fiduciary responsibility for quality and safety (per CMS Conditions of Participation).

External reporting:

— PSO (Patient Safety Organization) reporting is voluntary and confidential under the Patient Safety and Quality Improvement Act of 2005.

— State-mandated event reporting varies; many states require reporting of "never events."

CCS pearl: If an FMEA uncovers an active high-severity hazard with no control, the correct immediate management is to pause the workflow and escalate to leadership — do not wait for the FMEA cycle to conclude. Patient safety supersedes process completeness.

Key Differentials — Other Prospective and Systems Tools

Several methods overlap with FMEA's prospective intent. Distinguishing them is high-yield.

HFMEA (Healthcare FMEA — VA model):

— Same prospective intent as FMEA but with a Hazard Score (Severity × Probability, 4×4) and an explicit decision tree to filter which hazards warrant action.

— Often considered "easier" than classic FMEA for healthcare teams.

Probabilistic Risk Assessment (PRA):

— Quantitative, engineering-heavy method modeling fault trees and event trees.

— Rare in healthcare; common in nuclear and aviation.

Bowtie analysis:

— Visual tool placing a hazardous event at center with threats (left) and consequences (right), and barriers mitigating each.

— Used in high-reliability industries; growing in healthcare for systemic hazards (e.g., wrong-patient errors).

STAMP (Systems-Theoretic Accident Model and Processes) and STPA:

— Treats safety as a control problem; analyzes how control structures fail.

— More advanced; rarely tested but increasingly cited in modern patient safety literature.

Hierarchical Task Analysis (HTA) and Human Factors Engineering:

— Decomposes tasks to identify cognitive and physical demands; complements FMEA.

Failure Mode, Effects, and Criticality Analysis (FMECA):

— FMEA plus a criticality axis; quantifies criticality separately from RPN.

Key distinction: All of these are prospective, so distinguishing them on Step 3 comes down to named methodology in the stem. If "Risk Priority Number" or "Severity × Occurrence × Detection" appears → FMEA. If "Hazard Score" with a decision tree → HFMEA. If barriers between threats and consequences are drawn visually → bowtie. Match terminology directly.
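The HFMEA Hazard Score and decision tree can be sketched as a short function. Treat this as a hedged illustration rather than the VA's official worksheet: the function names are hypothetical, and the entry threshold of 8 is an assumption based on how the VA tree is commonly described, since these notes do not state a cutoff.

```python
def hazard_score(severity, probability):
    """HFMEA Hazard Score = Severity x Probability, each rated 1-4 (max 16)."""
    for name, value in (("severity", severity), ("probability", probability)):
        if not 1 <= value <= 4:
            raise ValueError(f"{name} must be between 1 and 4, got {value}")
    return severity * probability


def hfmea_decision(severity, probability, single_point_weakness,
                   effective_control_exists, hazard_is_obvious,
                   threshold=8):  # ASSUMPTION: cutoff not given in the notes
    """Return "proceed" if the hazard warrants redesign, otherwise "stop"."""
    score = hazard_score(severity, probability)

    # Low-scoring hazards leave the tree unless they are single-point weaknesses.
    if score < threshold and not single_point_weakness:
        return "stop"
    # An existing effective control measure is an exit criterion.
    if effective_control_exists:
        return "stop"
    # So is a hazard obvious enough that no added control is needed.
    if hazard_is_obvious:
        return "stop"
    # No exit criterion met: proceed to redesign.
    return "proceed"


# Hypothetical hazards for illustration.
print(hfmea_decision(4, 3, False, False, False))  # high score, no control
print(hfmea_decision(4, 1, True, False, False))   # low score, single-point weakness
print(hfmea_decision(4, 4, False, True, False))   # high score, control already exists
```

The second call shows why criticality matters: a hazard scoring only 4 still proceeds to redesign because it is a single-point weakness.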

Key Differentials — Retrospective and Monitoring Tools (Do Not Confuse with FMEA)

Equally important: tools that are not FMEA and the cues that select them.

Root Cause Analysis (RCA / RCA²):

— Retrospective, post-event.

— Output: causal chain, contributing factors, strong action items (RCA² emphasizes "stronger" actions like forcing functions, mirroring FMEA's action hierarchy).

— Trigger: sentinel event, serious safety event, near-miss already occurred.

Morbidity and Mortality (M&M) conference:

— Departmental retrospective case review; educational and accountability focus.

— Less structured than RCA; not a formal QI tool.

Incident reporting / safety event reporting:

— Voluntary frontline reports populate the data pool that triggers RCA or FMEA.

— High reporting rates indicate a strong safety culture, not an unsafe hospital.

Trigger tools (IHI Global Trigger Tool):

— Retrospective chart review using triggers (e.g., naloxone administration, abrupt medication stop) to detect adverse events.

— Surveillance, not analysis.

Control charts (Shewhart charts):

— Monitor a metric over time; distinguish common-cause from special-cause variation.

— Trigger: trend over time, rate per month, is the change sustained?

Run charts: simpler version; look for shifts, trends, runs.

Statistical Process Control (SPC): broader framework including control charts.

Pareto chart: identify the vital few contributors (top 20% of causes producing 80% of problems). Trigger: which causes account for most…

Fishbone (Ishikawa) diagram: organize potential causes into categories. Trigger: brainstorm contributing factors.

Board pearl: The single biggest Step 3 trap is confusing FMEA with RCA. Temporal cue is decisive: before harm = FMEA; after harm = RCA. If a stem mixes both (an event occurred and leadership wants to redesign a related but separate process), both tools apply — but the question usually targets one. Read the stem's final sentence carefully — it specifies which deliverable is requested.
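The Pareto "vital few" idea listed in this section can be shown with a short cumulative-share calculation. This is a sketch under stated assumptions: `vital_few` is a hypothetical helper and the error counts are invented for illustration, not drawn from any real dataset.

```python
def vital_few(counts, cutoff=0.8):
    """Return the largest causes that together reach `cutoff` of all events.

    Causes are taken from most to least frequent until their cumulative
    share of the total meets or exceeds the cutoff (0.8 for the 80/20 rule).
    """
    total = sum(counts.values())
    if total == 0:
        return []
    selected = []
    cumulative = 0
    for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(cause)
        cumulative += n
        if cumulative / total >= cutoff:
            break
    return selected


# Made-up medication-error counts by cause.
error_counts = {
    "wrong dose": 50,
    "wrong patient": 30,
    "wrong route": 10,
    "wrong time": 6,
    "omitted dose": 4,
}

print(vital_few(error_counts))
```

In this invented example two of the five causes account for 80 of the 100 events, so a team would target those first. Real data rarely split so cleanly, and the 80/20 ratio is a rule of thumb rather than a law.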

Long-Term Plan — Sustaining Gains and Integrating FMEA into Hospital Culture

A single FMEA is a project; embedding FMEA into operations is cultural transformation.

Sustaining gains after individual FMEA:

— Standard work documentation — codify the redesigned process in policy and EHR order sets.

— Training at onboarding — new hires learn the redesigned process, not the legacy one.

— Audit and feedback — periodic compliance audits with frontline feedback loops.

— Re-FMEA every 18–36 months or when the process changes substantially (new EHR, new staffing model, new technology).

Hospital-wide FMEA program elements:

— Annual FMEA calendar prioritizing 2–4 high-risk processes per year.

— Trained facilitators (often through IHI Open School, ASQ certification, or VA NCPS courses).

— Linkage to strategic plan and board quality goals — without executive sponsorship, FMEAs die in committee.

— Integration with safety event reporting — frontline reports feed FMEA topic selection.

High-reliability organization (HRO) principles that FMEA supports:

— Preoccupation with failure.

— Reluctance to simplify interpretations.

— Sensitivity to operations.

— Commitment to resilience.

— Deference to expertise (frontline > hierarchy).

Value-based care linkage:

— Hospital-Acquired Condition Reduction Program (HACRP) penalties incentivize FMEA on HACs (CLABSI, CAUTI, SSI, falls, pressure injuries).

— CMS payment penalties for readmissions drive FMEA on discharge transitions.

Step 3 management: When asked how a hospital should prevent recurrence of similar errors across departments, the long-term answer is to embed prospective risk analysis (FMEA) into governance and link it to the strategic plan, not a one-off committee. Sustained safety improvement requires structural commitment, not heroic individual effort.

Follow-Up, Monitoring Parameters, and Re-Measurement

An FMEA is incomplete without re-measurement. Step 3 may test the monitoring phase explicitly.

What to measure post-implementation:

— Process measures — was the redesigned step followed? (e.g., "% of insulin orders using new smart-pump library").

— Outcome measures — did harm decrease? (e.g., "rate of hypoglycemia <40 mg/dL per 1000 patient-days").

— Balancing measures — did the fix cause new problems? (e.g., "time to first insulin dose," "nursing satisfaction," "alert override rate").

Monitoring tools:

— Control charts to detect special-cause variation; sustained improvement requires shifts/trends per Western Electric or Nelson rules.

— Run charts for simpler trend visualization.

— Dashboards reviewed monthly at unit level, quarterly at executive level.

Re-scoring cadence:

— Initial reassessment at 3 months post-implementation.

— Full re-FMEA at 6–12 months or sooner if incidents recur.

— Documentation of post-RPN vs pre-RPN with percent reduction.

Counseling / change management:

— Frontline communication: explain why the process changed (link to patient cases when appropriate, de-identified).

— Address resistance with data, not directives.

— Recognize and celebrate teams whose FMEAs prevented harm.

Education for trainees:

— Residents on Step 3 are expected to know that FMEA is part of ACGME core competencies (Systems-Based Practice and Practice-Based Learning and Improvement).

— Residency programs increasingly require trainee participation in at least one QI/safety project.

Board pearl: A successful FMEA produces a measurable, sustained reduction in process failures, demonstrated on a control chart with documented post-intervention shift. Vignettes that describe a redesign followed by "no further events for 30 days" are inadequate evidence — that's noise, not signal. Demand statistical process control for the win.
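To make "signal versus noise" concrete, the sketch below builds an individuals (XmR) control chart from baseline data and checks post-intervention points for two special-cause signals. Several things here are assumptions: the function names are hypothetical, the monthly rates are invented, the limits use the standard moving-range estimate (mean moving range divided by 1.128), and the shift rule is the Western Electric "eight consecutive points on one side of the center line" rule. Institutions may use other chart types or rule sets.

```python
from statistics import mean


def xmr_limits(baseline):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma = mean(moving_ranges) / 1.128  # d2 constant for ranges of size 2
    return center - 3 * sigma, center, center + 3 * sigma


def special_cause_signals(points, lcl, center, ucl, run_length=8):
    """Flag points outside the limits and sustained shifts to one side."""
    signals = []
    run_side = 0   # +1 above center, -1 below, 0 exactly on the line
    run_count = 0
    for i, x in enumerate(points):
        if x > ucl or x < lcl:
            signals.append((i, "outside limits"))
        side = (x > center) - (x < center)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side = side
            run_count = 1 if side != 0 else 0
        if run_count == run_length:
            signals.append((i, "shift"))
    return signals


# Invented monthly event rates before and after a redesign.
baseline = [10, 12, 9, 11, 10, 12, 8, 10]
post = [7, 8, 6, 7, 8, 7, 6, 7]

lcl, center, ucl = xmr_limits(baseline)
print(special_cause_signals(post, lcl, center, ucl))
```

No single post-intervention point falls outside the limits here, yet the run of eight points below the baseline center line is flagged as a shift. That is the kind of evidence a sustained improvement needs, in contrast to a short event-free interval.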

Ethical, Legal, and Patient Safety Considerations

FMEA sits at the intersection of patient safety, ethics, and law — Step 3 will probe this directly.

Just Culture and blame-free analysis:

— FMEA must focus on system vulnerabilities, not individuals.

— Differentiate human error (console, do not punish), at-risk behavior (coach), and reckless behavior (disciplinary action) — the Just Culture algorithm.

— Punishing individuals for system failures suppresses reporting and destroys the data pipeline FMEA depends on.

Legal protections for QI data:

— Patient Safety and Quality Improvement Act of 2005 (PSQIA): information shared with a federally listed Patient Safety Organization (PSO) is privileged and confidential — not discoverable in malpractice litigation.

— State peer review protections also shield QI deliberations in many jurisdictions, though scope varies.

— FMEA documents themselves are generally protected when conducted under peer review, but do not assume blanket immunity — consult legal counsel.

Informed consent edge case:

— If FMEA reveals a previously unrecognized risk in a routine procedure (e.g., a rare device failure mode), consent forms must be updated to disclose the risk. Failure to do so creates liability.

Disclosure of harm (CANDOR, communication-and-resolution programs):

— When FMEA review uncovers that a past patient was harmed by an unrecognized failure, disclosure to that patient is ethically required even if no lawsuit is pending — transparency is the standard of care.

Transition-of-care liability:

— Failure to communicate pending test results at discharge is a leading malpractice driver; FMEA of discharge processes mitigates this.

Mandatory reporting (FDA MedWatch, state never-event registries) continues to apply regardless of FMEA findings.

Step 3 management: When an FMEA identifies that a current patient is at risk from an unaddressed failure mode, the immediate ethical and legal duty is disclosure and mitigation now — protecting the patient supersedes preserving the QI process. PSQIA confidentiality protects deliberations, not concealment of active harm.

High-Yield Associations and Rapid-Fire Clinical Facts

Board pearl: Memorize the action hierarchy — exam writers love asking "which intervention is most likely to be effective?" and the answer is almost always the option highest on this list (typically a forcing function or CPOE hard stop).

FMEA = prospective; RCA = retrospective. The single most testable fact.

RPN = Severity × Occurrence × Detection, each 1–10, range 1–1000.

Detection score is inverted — higher = harder to detect = worse.

HFMEA uses Hazard Score = Severity × Probability (each scored 1–4 on a 4×4 matrix, range 1–16) plus a decision tree (VA NCPS model).

Action hierarchy (strongest → weakest): forcing function > automation > standardization > checklists/reminders > rules > education.

Joint Commission requires proactive risk assessment every 18 months.

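A sentinel event triggers a comprehensive systematic analysis, typically an RCA² (Root Cause Analysis and Actions), within 45 business days of the event or of the organization becoming aware of it.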

Joint Commission National Patient Safety Goals (NPSGs) include patient ID, communication, medication safety, alarm safety, infection prevention, suicide risk, surgical site, health equity — many derived from system FMEAs.

IHI Triple Aim: better care, better health, lower cost. The Quadruple Aim adds clinician well-being; the Quintuple Aim adds health equity.

PDSA cycle: Plan-Do-Study-Act for iterative testing.

Six Sigma DMAIC: Define-Measure-Analyze-Improve-Control.

Lean's 8 wastes (DOWNTIME): Defects, Overproduction, Waiting, Non-utilized talent, Transportation, Inventory, Motion, Extra-processing.

5 Whys: ask "why?" repeatedly (about five times) to drill from a surface symptom down to its root cause.

Pareto principle: 80% of problems from 20% of causes.

Swiss Cheese Model (Reason): harm occurs when holes in defensive layers align.

High-alert medications (ISMP): insulin, heparin, opioids, chemotherapy, concentrated electrolytes (KCl), neuromuscular blockers.

CUSP (Comprehensive Unit-based Safety Program): AHRQ frontline safety framework.

TeamSTEPPS: AHRQ teamwork training (SBAR, CUS, handoff tools).

SBAR: Situation, Background, Assessment, Recommendation.

PSQIA 2005: federal confidentiality for PSO-shared safety data.

ACGME core competencies: Systems-Based Practice (SBP) and Practice-Based Learning and Improvement (PBLI) directly involve FMEA participation.
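The two scoring formulas in the list above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring tool: the function names `rpn` and `hazard_score` are hypothetical, and the range checks simply encode the 1–10 and 1–4 scales stated above.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: each factor is scored 1-10, so the result is 1-1000."""
    factors = (("severity", severity), ("occurrence", occurrence), ("detection", detection))
    for name, value in factors:
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


def hazard_score(severity: int, probability: int) -> int:
    """HFMEA Hazard Score on the 4x4 matrix: each factor is 1-4, so the result is 1-16."""
    for name, value in (("severity", severity), ("probability", probability)):
        if not 1 <= value <= 4:
            raise ValueError(f"{name} must be between 1 and 4, got {value}")
    return severity * probability


# Detection is inverted: a failure that is harder to catch gets a higher score,
# so the same severity and occurrence yield a larger RPN.
easy_to_detect = rpn(severity=7, occurrence=4, detection=2)  # 7 * 4 * 2 = 56
hard_to_detect = rpn(severity=7, occurrence=4, detection=9)  # 7 * 4 * 9 = 252
print(easy_to_detect, hard_to_detect)
```

The illustrative scores (7, 4, 2, 9) are made up for the example; only the formulas and ranges come from the text.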

Board Question Stem Patterns

Pattern 1 (tool selection; most common):

— "A hospital is implementing a new IV infusion pump library. Which of the following is the most appropriate tool to identify potential errors before rollout?" → FMEA.

— Distractors: RCA (no event yet), PDSA (no iterative test described), control chart (no metric trend), Pareto (no ranking of existing causes).

Pattern 2 (temporal cue test):

— "After a patient received a tenfold heparin overdose, the safety committee convenes…" → RCA, not FMEA.

— "Before launching a new heparin protocol, the safety committee convenes…" → FMEA.

Pattern 3 (intervention strength):

— "Which intervention is most likely to prevent recurrence of wrong-route administration?"

— Correct: oral syringes incompatible with IV ports (forcing function).

— Distractors: in-service education, reminder signs, double-check policy (all weaker).

Pattern 4 (RPN interpretation):

— "A failure mode has S=10, O=2, D=8. RPN = 160. Should it be addressed?" → Yes, because Severity = 10, regardless of RPN.

Pattern 5 (process scope):

— "The FMEA team is mapping the entire emergency department. What is the most appropriate next step?" → Narrow scope to a specific process (e.g., triage-to-room time, sepsis bundle initiation).

Pattern 6 (direct observation):

— "The team mapped the medication process from the policy manual. Next best step?" → Direct observation of frontline workflow (Gemba walk).

Pattern 7 (re-measurement):

— "Six months after implementation, how should success be evaluated?" → Control chart of outcome and process measures with rescored RPN.

Pattern 8 (Just Culture):

— "FMEA reveals a nurse made a calculation error. Appropriate response?" → Address system contributors (calculator availability, double-check process) rather than punishing the individual.

Key distinction: Always read the verb tense and timing in the stem first. "Will implement," "is planning," "anticipates" → FMEA. "Occurred," "received," "resulted in" → RCA. This single discriminator resolves the majority of Step 3 systems-based vignettes on quality tools.
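Patterns 3 and 4 reduce to mechanical rules, which the following Python sketch makes explicit. Several things here are assumptions for illustration: the names `FailureMode`, `needs_action`, and `strongest` are hypothetical, the `RPN_THRESHOLD` of 200 is an arbitrary local cutoff rather than a published standard, and the category assigned to each distractor (for example, treating a double-check policy as a "rule") is a judgment call. The Severity ≥ 9 override follows the rule stated in this chapter's recap.

```python
from dataclasses import dataclass

# Strongest to weakest, following the action hierarchy in this chapter.
ACTION_HIERARCHY = [
    "forcing function",
    "automation",
    "standardization",
    "checklist/reminder",
    "rule",
    "education",
]

SEVERITY_OVERRIDE = 9  # severity at or above this demands action regardless of RPN
RPN_THRESHOLD = 200    # hypothetical local cutoff, not a published standard


@dataclass
class FailureMode:
    name: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def needs_action(mode: FailureMode, threshold: int = RPN_THRESHOLD) -> bool:
    """Act if severity alone is catastrophic, or if the RPN reaches the cutoff."""
    return mode.severity >= SEVERITY_OVERRIDE or mode.rpn >= threshold


def strongest(options: dict[str, str]) -> str:
    """Return the option whose category sits highest in the action hierarchy."""
    return min(options, key=lambda label: ACTION_HIERARCHY.index(options[label]))


# Pattern 4: S=10, O=2, D=8 gives RPN 160, below the cutoff, yet action is required.
example = FailureMode("example failure mode", severity=10, occurrence=2, detection=8)
print(example.rpn, needs_action(example))

# Pattern 3: the forcing function beats every weaker distractor.
choice = strongest({
    "in-service education": "education",
    "reminder signs": "checklist/reminder",
    "double-check policy": "rule",
    "oral syringes incompatible with IV ports": "forcing function",
})
print(choice)
```

Keeping the severity override separate from the RPN cutoff mirrors the exam logic: a low occurrence score can pull the RPN down, but it should never hide a catastrophic failure mode.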

One-Line Recap

Failure Mode and Effects Analysis (FMEA) is the prospective, multidisciplinary method that maps a high-risk process, scores each failure mode by Severity × Occurrence × Detection to generate a Risk Priority Number, and redesigns the process using the strongest available interventions, forcing functions first, to prevent patient harm before it occurs.

Board pearl: If you remember only one thing — FMEA prevents, RCA explains, PDSA tests, and forcing functions beat education every single time.

Prospective vs retrospective: FMEA before harm; RCA after harm — the single most testable distinction on Step 3 systems vignettes.

RPN = S × O × D, each 1–10, range 1–1000; but any failure mode with Severity ≥ 9 demands action regardless of RPN, and Detection is inverted (higher = harder to catch = worse).

Action hierarchy: forcing function > automation > standardization > checklists > rules > education — exam-correct interventions live at the top of this list, not the bottom.

Joint Commission mandates a proactive risk assessment every 18 months; high-alert medications, handoffs, surgical site verification, and high-risk/low-volume processes (massive transfusion, neonatal resuscitation, chemotherapy) are the canonical FMEA targets.

Sustainability requires re-measurement on control charts, frontline ownership, executive sponsorship, and integration with Just Culture — punishing individuals for system failures destroys the reporting pipeline that FMEA depends on.

PSQIA (2005) protects PSO-shared FMEA data from discovery, but does not shield concealment of active harm — disclosure to affected patients remains an ethical and legal duty.

Step 3 trigger words mapping: prospective / before / anticipate / new process → FMEA; sentinel event / after the error / why did this happen → RCA; small test of change / pilot → PDSA; variation / defect rate → Six Sigma; waste / value stream → Lean; trend over time → control chart.
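The trigger-word mapping above can be written as a small lookup table, which may help as a self-quizzing aid. This is only a sketch: the function name `suggest_tools` is hypothetical, and matching by lowercase substring is a deliberately naive assumption that will miss paraphrased cues (for instance, "after a tenfold overdose" contains none of the listed RCA phrases).

```python
# Cue phrases copied from the trigger-word mapping above.
TRIGGERS = {
    "FMEA": ["prospective", "before", "anticipate", "new process"],
    "RCA": ["sentinel event", "after the error", "why did this happen"],
    "PDSA": ["small test of change", "pilot"],
    "Six Sigma": ["variation", "defect rate"],
    "Lean": ["waste", "value stream"],
    "Control chart": ["trend over time"],
}


def suggest_tools(stem: str) -> list[str]:
    """Return every tool whose cue phrases appear in the stem, in table order."""
    text = stem.lower()
    return [tool for tool, cues in TRIGGERS.items() if any(cue in text for cue in cues)]


print(suggest_tools("Before launching a new heparin protocol, the safety committee convenes"))
print(suggest_tools("Following a sentinel event, leadership asks: why did this happen?"))
print(suggest_tools("The unit plans a pilot on one ward"))
```

A stem can match more than one tool, so the function returns a list; on the exam, the temporal cue (before versus after harm) remains the tiebreaker.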