Patient Safety & Systems-Based Practice
Failure mode and effects analysis (FMEA)
— Originated in aerospace/military engineering; adapted to healthcare by the Joint Commission and Institute for Healthcare Improvement (IHI).
— Joint Commission requires accredited hospitals to perform at least one proactive risk assessment (typically an FMEA or HFMEA) every 18 months.
— A new process, technology, or workflow is being introduced (e.g., bar-code medication administration, new EHR module, robotic surgery program, new chemotherapy protocol).
— A high-risk, low-volume process is being redesigned (massive transfusion protocol, code blue response, neonatal resuscitation).
— Leadership asks: "What could go wrong and how do we prevent it?" — the question is forward-looking, not investigating an event that already happened.
Board pearl: On Step 3, the trigger phrase "before implementation" or "anticipate failure points" almost always points to FMEA, whereas "after a near-miss" or "following a sentinel event" points to RCA. Recognizing this single temporal cue resolves most patient-safety vignettes on this topic.

— "A hospital is planning to implement a new insulin infusion protocol in the ICU. Which quality improvement tool is most appropriate to identify potential errors before rollout?"
— "The pharmacy is transitioning to a new automated dispensing cabinet. Before go-live, the safety committee wants to map every step and rank the risks."
— "A surgery department is launching a same-day discharge pathway for laparoscopic cholecystectomy and wants to anticipate handoff failures."
— "Obstetrics is creating a postpartum hemorrhage massive transfusion protocol and wants to identify high-risk steps proactively."
— Words: prospective, proactive, anticipate, prevent, before implementation, redesign, new process, high-risk process.
— A multidisciplinary team has been convened but no event has occurred.
— The team is mapping a process flow and assigning numeric risk scores.
— "After a patient received 10× the intended heparin dose…" → RCA.
— "To monitor whether central line infection rates are dropping after intervention…" → run chart / control chart / PDSA.
— "To identify which causes account for most of the medication errors…" → Pareto chart (80/20).
— "To brainstorm all possible contributing categories of a problem…" → fishbone (Ishikawa) diagram.
Step 3 management: When the vignette explicitly states the goal is to rank failure modes by Risk Priority Number (RPN) or to use Severity × Occurrence × Detection, the answer is unambiguously FMEA. No other QI tool generates an RPN. Memorize this triad — it is the single most testable computational element of the topic.

1. Select a high-risk process and define scope.
2. Assemble a multidisciplinary team (frontline workers are essential).
3. Graphically map the process — flowchart every step and sub-step.
4. Identify failure modes and their effects for each step; assign Severity (S), Occurrence (O), Detection (D) scores, typically 1–10.
5. Calculate Risk Priority Number (RPN) = S × O × D; redesign the process, mitigate highest-RPN failure modes, and re-measure.
— Severity: 1 = no noticeable effect; 10 = catastrophic/death.
— Occurrence: 1 = unlikely (<1 in 1,000,000); 10 = inevitable/frequent.
— Detection: 1 = almost certain to be caught before reaching patient; 10 = no detection mechanism exists.
— Frontline staff (nurses, techs) catch real workflow failures executives miss.
— A process owner (someone with authority to enact change) must be present.
— A facilitator trained in QI methodology keeps scoring consistent.
Key distinction: The Detection scale runs counterintuitively: a higher score is worse, meaning the failure is harder to detect. Many trainees miss this. A failure mode that is severe, frequent, AND invisible earns the maximum RPN (10 × 10 × 10 = 1,000) and demands immediate redesign; this is the prototypical board-favorite scenario.
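The scoring arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not an institutional tool: the `FailureMode` class, the `rank_by_rpn` helper, and every score in the sample list are hypothetical, chosen only to show how S × O × D produces a ranking on the 1–10 scales described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    step: str
    description: str
    severity: int    # 1 = no noticeable effect, 10 = catastrophic/death
    occurrence: int  # 1 = unlikely, 10 = inevitable/frequent
    detection: int   # 1 = almost certain to be caught, 10 = no detection mechanism

    def __post_init__(self) -> None:
        # Reject scores outside the 1-10 scale used in these notes.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be 1-10, got {value}")

    @property
    def rpn(self) -> int:
        """Risk Priority Number = S x O x D."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical scores for illustration only.
modes = [
    FailureMode("administer", "wrong patient's wristband scanned", 9, 3, 4),
    FailureMode("dispense", "look-alike vial selected", 7, 4, 6),
    FailureMode("monitor", "severe, frequent, and invisible failure", 10, 10, 10),
]

for m in rank_by_rpn(modes):
    print(f"{m.rpn:>5}  {m.step}: {m.description}")
```

Note that a high Detection score raises the RPN, which matches the inverted scale: the harder a failure is to catch, the higher it ranks.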

— Break the process into major steps (e.g., for medication administration: prescribe → transcribe → dispense → administer → monitor).
— Break each major step into sub-steps (e.g., "administer" = verify patient, verify drug, verify dose, verify route, verify time, document).
— Aim for granularity such that each sub-step has one clear action and one clear owner.
— Failure mode = a specific way the step could go wrong (e.g., "nurse scans wrong patient's wristband").
— Effect = consequence of that failure (e.g., "wrong-patient medication administration → ADE → harm").
— Cause = underlying contributor (e.g., "two patients with same last name in adjacent beds, no second-identifier prompt").
— Fishbone diagram to brainstorm causes within categories (People, Process, Equipment, Environment, Materials, Management).
— 5 Whys to drill down to root contributors of each failure mode.
— Direct observation / Gemba walks to validate the mapped process matches reality — the mapped process is almost never the actual process.
Board pearl: A common Step 3 trap is a committee that maps the process from policy documents only, never observing real workflow. The correct next step is direct observation of frontline practice before scoring: policy and reality diverge, and an FMEA built on fiction generates worthless RPNs. Always validate the map at the Gemba (the actual place of work).
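One way to picture the mapping and worksheet steps above is as a small data structure. The sketch below is only an assumption about how a team might organize its notes: `WorksheetRow`, `SubStep`, the `observed` flag, and `unvalidated_substeps` are hypothetical names, and the "bedside nurse" owner is invented for illustration. The failure mode, effect, and cause are taken from the wristband example above.

```python
from dataclasses import dataclass, field


@dataclass
class WorksheetRow:
    failure_mode: str  # specific way the step could go wrong
    effect: str        # consequence of that failure
    cause: str         # underlying contributor


@dataclass
class SubStep:
    action: str
    owner: str
    observed: bool = False  # set True once confirmed by direct observation
    rows: list = field(default_factory=list)


# Major step -> sub-steps, using the medication administration example.
process_map = {
    "administer": [
        SubStep("verify patient", owner="bedside nurse", observed=True),
        SubStep("verify drug", owner="bedside nurse"),
    ],
}

process_map["administer"][0].rows.append(
    WorksheetRow(
        failure_mode="nurse scans wrong patient's wristband",
        effect="wrong-patient medication administration",
        cause="two patients with same last name in adjacent beds",
    )
)


def unvalidated_substeps(pmap):
    """List (major step, action) pairs not yet confirmed at the Gemba."""
    return [
        (step, sub.action)
        for step, subs in pmap.items()
        for sub in subs
        if not sub.observed
    ]


print(unvalidated_substeps(process_map))
```

Keeping one action and one owner per sub-step mirrors the granularity rule, and the `observed` flag makes it easy to see which parts of the map still rest on policy documents alone.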

— A failure mode with S=10, O=1, D=1 (RPN=10) may still warrant action because severity is catastrophic.
— A failure mode with S=2, O=10, D=10 (RPN=200) is high-frequency but low-harm — may not deserve top priority.
— Always review Severity ≥ 9 separately, regardless of RPN.
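The severity override described above can be sketched as a two-bucket triage. The function name `triage` and the dictionary layout are hypothetical; the two sample failure modes reuse the scores quoted in the bullets above (S=10, O=1, D=1 and S=2, O=10, D=10).

```python
def rpn(mode):
    """Risk Priority Number for a mode stored as a dict with S, O, D keys."""
    return mode["S"] * mode["O"] * mode["D"]


def triage(modes, severity_floor=9):
    """Split modes into a severity-override list and an RPN-ranked remainder."""
    override = [m for m in modes if m["S"] >= severity_floor]
    remainder = sorted(
        (m for m in modes if m["S"] < severity_floor), key=rpn, reverse=True
    )
    return override, remainder


modes = [
    {"name": "catastrophic but rare and easily caught", "S": 10, "O": 1, "D": 1},
    {"name": "frequent and undetected but low harm", "S": 2, "O": 10, "D": 10},
]

override, remainder = triage(modes)
for m in override:
    print(f"review regardless of RPN: {m['name']} (RPN={rpn(m)})")
for m in remainder:
    print(f"rank by RPN: {m['name']} (RPN={rpn(m)})")
```

Sorting by RPN alone would put the low-harm mode (RPN 200) far ahead of the catastrophic one (RPN 10), which is exactly the distortion the separate severity review is meant to catch.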
1. Is this a single-point weakness critical to the process? If yes → continue to the next question even if the score is low.
2. Is there an existing effective control measure? If yes → may stop.
3. Is the hazard so obvious that no control is needed? If yes → stop.
— If no exit criteria are met → proceed to redesign.
1. Forcing functions / constraints (e.g., oral syringes that physically cannot connect to IV ports — eliminates wrong-route errors).
2. Automation and computerization (CPOE with hard stops).
3. Standardization and protocols (order sets, checklists).
4. Reminders, checklists, double checks (weaker; rely on humans).
5. Rules and policies.
6. Education and training (weakest standalone intervention).
Step 3 management: When the stem asks "Which intervention is most likely to reduce the failure rate?", choose the option highest on the action hierarchy — a forcing function beats education every time. Eliminating the possibility of error trumps reminding humans not to err.
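The "pick the option highest on the hierarchy" rule can be expressed as a simple ordering. In this sketch the `Strength` enum follows the six levels listed above; assigning each answer choice to a level is an illustrative judgment, and `strongest` is a hypothetical helper name.

```python
from enum import IntEnum


class Strength(IntEnum):
    """Action hierarchy; a larger value means a stronger intervention."""
    EDUCATION_AND_TRAINING = 1
    RULES_AND_POLICIES = 2
    REMINDERS_AND_DOUBLE_CHECKS = 3
    STANDARDIZATION_AND_PROTOCOLS = 4
    AUTOMATION_AND_COMPUTERIZATION = 5
    FORCING_FUNCTION = 6


def strongest(options):
    """Return the option text whose intervention class ranks highest."""
    return max(options, key=options.get)


# Answer choices from a wrong-route vignette; the level assigned to each
# choice is an assumption made for illustration.
answer_choices = {
    "in-service education": Strength.EDUCATION_AND_TRAINING,
    "reminder signs": Strength.REMINDERS_AND_DOUBLE_CHECKS,
    "double-check policy": Strength.REMINDERS_AND_DOUBLE_CHECKS,
    "oral syringes incompatible with IV ports": Strength.FORCING_FUNCTION,
}

print(strongest(answer_choices))
```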

— High severity potential — could cause death or permanent harm (e.g., chemotherapy administration, blood transfusion, anticoagulation dosing, opioid PCA, neonatal resuscitation, sedation outside the OR).
— High complexity — many handoffs, multiple disciplines (e.g., OR-to-ICU transfer, ED-to-floor admission).
— New process — being introduced without local experience.
— Recent near-misses in similar processes elsewhere (sentinel event alerts from Joint Commission).
— Low volume + high risk — rare procedures where staff lack repetition (massive transfusion, malignant hyperthermia response).
— Medication management (especially high-alert drugs: insulin, heparin, opioids, chemo, KCl).
— Patient identification and specimen labeling.
— Surgical site marking and time-outs.
— Communication during handoffs.
— Blood product administration.
— Restraint use.
— Suicide risk in inpatients.
Board pearl: If the vignette describes a high-alert medication (insulin, heparin, opioids, chemotherapy, concentrated electrolytes) being introduced in a new workflow, FMEA is essentially mandatory before go-live. ISMP (Institute for Safe Medication Practices) explicitly recommends prospective risk analysis for all high-alert drug processes — a frequent distractor-resolving fact.

— Charter the project: define scope, aim, boundaries (start and end points of the process).
— Secure executive sponsorship and protected time for team members.
— Identify the process owner with authority to implement changes.
— Whiteboard the current state with frontline input.
— Validate with direct observation before next meeting.
— For each sub-step, ask: "What could go wrong here?"
— Capture failure mode, effect, and cause in a structured worksheet.
— Score S, O, D using institutional anchor tables (consistency matters more than absolute values).
— Use consensus scoring, not averaging — discussion surfaces hidden knowledge.
— Rank by RPN; flag all S ≥ 9.
— For top failure modes, design interventions using the action hierarchy (forcing functions first).
— Assign owners, deadlines, metrics.
— Define balancing measures (unintended consequences of the fix).
— Rescore S, O, D after implementation.
— Document post-RPN and percent reduction.
CCS pearl: Although CCS cases are clinical, the systems-based equivalent on Step 3 is recognizing that FMEA is a months-long cycle, not a single meeting. Stems describing a single 90-minute brainstorming session that "completed an FMEA" are flagged as inadequate methodology — the correct critique is lack of re-measurement and follow-through.
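The rescoring step ("document post-RPN and percent reduction") is simple arithmetic, sketched below. The function names and all scores are hypothetical; the example assumes the fix lowers Occurrence and Detection while Severity stays the same, which is a common pattern but not a rule stated in these notes.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number = S x O x D."""
    return severity * occurrence * detection


def percent_reduction(pre_rpn, post_rpn):
    """Percent drop from the pre-intervention RPN to the rescored RPN."""
    if pre_rpn <= 0:
        raise ValueError("pre_rpn must be positive")
    return 100 * (pre_rpn - post_rpn) / pre_rpn


# Hypothetical scores: the fix lowers occurrence and detection, while
# severity is unchanged because the harm, if it happens, is the same.
pre = rpn(8, 5, 6)
post = rpn(8, 2, 3)
print(f"pre={pre}, post={post}, reduction={percent_reduction(pre, post):.0f}%")
```

A negative result would mean the rescored RPN went up, for example when a new alert adds fatigue, which is worth surfacing rather than hiding.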

— FMEA. Purpose: prospective risk assessment of a process.
— Output: ranked failure modes by RPN; redesign plan.
— Trigger words: before, anticipate, prevent, new process, proactive.
— RCA. Purpose: retrospective investigation of a sentinel event or serious harm.
— Output: causal chain, systems contributors, corrective actions.
— Trigger words: after the event, why did this happen, sentinel event.
— Required by Joint Commission within 45 business days of a sentinel event (or of becoming aware of it).
— PDSA. Purpose: iterative testing of a small change.
— Output: data on whether the change improved a metric.
— Trigger words: small test of change, pilot, iterative.
— Six Sigma. Purpose: reduce variation in an existing process.
— Output: process capability improved to ≤3.4 defects per million opportunities.
— Trigger words: variation, defect rate, statistical process control.
— Lean. Purpose: eliminate waste (the 8 wastes: DOWNTIME mnemonic).
— Output: streamlined value stream.
Key distinction: FMEA ranks risks before harm; RCA explains harm after it occurs; PDSA tests fixes iteratively; Six Sigma reduces variation; Lean removes waste. If the stem gives a rate trend over time, the answer is a control chart, not FMEA. Tool-matching is the single highest-yield testable skill in this domain.
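The Six Sigma figure of 3.4 defects per million quoted above is conventionally counted per opportunity for a defect, not per patient or per dose. The sketch below shows that calculation; the `dpmo` helper and the sample counts (7 defects, 2,000 doses, 5 opportunities per dose) are invented for illustration.

```python
SIX_SIGMA_TARGET_DPMO = 3.4


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    return defects / total_opportunities * 1_000_000


# Hypothetical audit: 7 defects across 2,000 doses, with 5 ways each
# dose could be defective (patient, drug, dose, route, time).
rate = dpmo(defects=7, units=2_000, opportunities_per_unit=5)
print(f"{rate:.1f} DPMO; meets Six Sigma target: {rate <= SIX_SIGMA_TARGET_DPMO}")
```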

— Often lack dedicated QI staff; use abbreviated FMEA focused on 1–2 highest-risk processes per year.
— HFMEA (VA model) is preferred because its 4×4 Hazard Score is simpler than 10×10×10 RPN.
— Collaboration through state hospital associations or rural health networks allows shared FMEAs (e.g., regional sepsis pathway).
— High-risk processes deserving FMEA: anticoagulation management, refill protocols, abnormal-result follow-up (the "results management" failure that drives malpractice claims), referral closure, immunization workflows.
— Closing the loop on abnormal results is the #1 outpatient FMEA target per AHRQ — missed cancer diagnoses frequently trace to this gap.
— Common FMEA targets: falls prevention, medication reconciliation on admission/transfer, pressure injury prevention, antipsychotic use in dementia.
— FMEA on ligature risk, contraband, elopement, and 1:1 observation handoffs is Joint Commission–emphasized after multiple Sentinel Event Alerts on inpatient suicide.
Step 3 management: In a small or resource-limited setting, when asked how to sustain FMEA gains, choose forcing functions and standardized order sets over staff training programs. Fixes built into the system persist through personnel turnover, which is the dominant failure mode in low-resource environments.

— Weight-based dosing introduces tenfold-error risk. FMEA the medication ordering → dispensing → administration chain.
— Use of kg-only weights (no lbs in EHR) is a classic forcing function that emerged from pediatric FMEAs.
— Pediatric code carts, resuscitation tape (Broselow), and dose round-down rules for chemotherapy are FMEA-derived.
— Postpartum hemorrhage and shoulder dystocia are low-volume, high-severity events ideal for FMEA + simulation.
— Magnesium sulfate infusions and oxytocin protocols are high-alert and routinely FMEA'd.
— California Maternal Quality Care Collaborative (CMQCC) bundles emerged from prospective risk analysis.
— Chemotherapy is the archetypal FMEA process — two-RN verification, independent dose recalculation, and CPOE with body-surface-area hard stops are FMEA outputs.
— WHO Surgical Safety Checklist (sign-in, time-out, sign-out) and the Joint Commission Universal Protocol (pre-procedure verification, site marking, time-out) are products of FMEA-style analysis of wrong-site surgery.
— ED-to-floor, ICU-to-floor, hospital-to-SNF, and discharge-to-home handoffs are FMEA priorities because miscommunication during handoffs is implicated in an estimated 80% of serious medical errors (Joint Commission data).
Board pearl: Whenever a vignette involves pediatric chemotherapy, obstetric hemorrhage, or any handoff, FMEA is almost always the correct proactive tool. These domains have disproportionate sentinel event rates and are explicitly named in Joint Commission patient safety goals.

— Scope creep: team tries to map an entire department instead of one process. Mitigation: tight charter with explicit start/stop points.
— Mapping the policy, not the practice: team uses written SOPs instead of observed workflow → fixes don't match reality. Mitigation: Gemba walk.
— Scoring inconsistency: different members interpret Severity 7 differently. Mitigation: published anchor tables and consensus scoring.
— RPN tyranny: team acts only on top RPNs and ignores high-severity/low-RPN modes. Mitigation: separate review of all S ≥ 9.
— No re-measurement: team disbands after redesign without verifying improvement. Mitigation: scheduled rescoring at 3 and 6 months.
— Lack of frontline voice: committee composed only of managers misses operational failures. Mitigation: include nurses, techs, and patients.
— Recommendations without owners or deadlines: action items languish. Mitigation: assign owner, due date, metric for each action.
— Weak interventions chosen (education-only) → no durable change. Mitigation: prefer forcing functions.
— Adding alerts → alert fatigue → bypassed alerts. New failure mode.
— Adding double checks → diffusion of responsibility → both checkers assume the other verified.
— Adding steps → workarounds as staff bypass cumbersome workflow.
Key distinction: Adding more alerts or more double checks often increases RPN over time by introducing alert fatigue and shared accountability dilution. A board-correct intervention simplifies and constrains the process rather than layering on human verification steps.

— Identification of a near-miss already occurring in the current process → triggers parallel RCA, do not wait for FMEA to complete.
— Discovery of a hazard with S ≥ 9 and no current control → escalate to patient safety committee and executive leadership immediately; consider pausing the process.
— Failure mode involves regulatory compliance gap (e.g., EMTALA, HIPAA, Joint Commission National Patient Safety Goals) → escalate to compliance officer.
— Device-related failure mode → MedWatch (FDA) reporting; Safe Medical Devices Act requires hospitals to report device-related deaths to FDA and manufacturer (serious injury → manufacturer only).
— Patient safety event resulting in death, permanent harm, or severe temporary harm requiring intervention to sustain life.
— Specific events (wrong-site surgery, infant abduction, suicide of inpatient, retained foreign object, hemolytic transfusion reaction from ABO incompatibility) are sentinel regardless of harm.
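— Hospital must complete a comprehensive systematic analysis (typically an RCA or RCA²) and action plan within 45 business days of the event or of becoming aware of it.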
— FMEA reports to Patient Safety Committee → Medical Executive Committee → Board Quality Committee.
— Boards of trustees have fiduciary responsibility for quality and safety (per CMS Conditions of Participation).
— PSO (Patient Safety Organization) reporting is voluntary and confidential under the Patient Safety and Quality Improvement Act of 2005.
— State-mandated event reporting varies; many states require reporting of "never events."
CCS pearl: If an FMEA uncovers an active high-severity hazard with no control, the correct immediate management is pause the workflow and escalate to leadership — do not wait for the FMEA cycle to conclude. Patient safety supersedes process completeness.

— HFMEA (Healthcare FMEA): Same prospective intent as FMEA but with a Hazard Score (Severity × Probability, 4×4) and an explicit decision tree to filter which hazards warrant action.
— Often considered "easier" than classic FMEA for healthcare teams.
— Probabilistic risk assessment (PRA): Quantitative, engineering-heavy method modeling fault trees and event trees.
— Rare in healthcare; common in nuclear and aviation.
— Bowtie analysis: Visual tool placing a hazardous event at center with threats (left) and consequences (right), and barriers mitigating each.
— Used in high-reliability industries; growing in healthcare for systemic hazards (e.g., wrong-patient errors).
— STAMP/STPA (systems-theoretic analysis): Treats safety as a control problem; analyzes how control structures fail.
— More advanced; rarely tested but increasingly cited in modern patient safety literature.
— Task analysis (hierarchical or cognitive): Decomposes tasks to identify cognitive and physical demands; complements FMEA.
— FMECA (failure mode, effects, and criticality analysis): FMEA plus a criticality axis; quantifies criticality separately from RPN.
Key distinction: All of these are prospective, so distinguishing them on Step 3 comes down to named methodology in the stem. If "Risk Priority Number" or "Severity × Occurrence × Detection" appears → FMEA. If "Hazard Score" with a decision tree → HFMEA. If barriers between threats and consequences are drawn visually → bowtie. Match terminology directly.
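For contrast with the RPN, the HFMEA Hazard Score and its decision-tree filter can be sketched as follows. This is a simplified reading, not the official VA worksheet: the function names are hypothetical, and the cutoff of 8 is an assumption based on the commonly cited HFMEA threshold, which these notes do not state.

```python
def hazard_score(severity, probability):
    """HFMEA Hazard Score = Severity x Probability, each on a 1-4 scale."""
    for name, value in (("severity", severity), ("probability", probability)):
        if not 1 <= value <= 4:
            raise ValueError(f"{name} must be 1-4, got {value}")
    return severity * probability


def needs_action(severity, probability, *, single_point_weakness,
                 effective_control_exists, hazard_is_obvious, threshold=8):
    """Walk a simplified decision tree; True means proceed to redesign."""
    score = hazard_score(severity, probability)
    # Low scores exit unless the step is a single-point weakness.
    if score < threshold and not single_point_weakness:
        return False
    # An existing effective control is an exit criterion.
    if effective_control_exists:
        return False
    # A hazard so obvious that no control is needed is also an exit.
    if hazard_is_obvious:
        return False
    return True


print(hazard_score(4, 2))
print(needs_action(4, 2, single_point_weakness=False,
                   effective_control_exists=False, hazard_is_obvious=False))
```

The score tops out at 16 rather than 1,000, which is one reason teams find the HFMEA scale easier to apply consistently.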

— RCA: Retrospective, post-event.
— Output: causal chain, contributing factors, strong action items (RCA² emphasizes "stronger" actions like forcing functions, mirroring FMEA's action hierarchy).
— Trigger: sentinel event, serious safety event, near-miss already occurred.
— Morbidity and mortality (M&M) conference: Departmental retrospective case review; educational and accountability focus.
— Less structured than RCA; not a formal QI tool.
— Incident (safety event) reporting: Voluntary frontline reports populate the data pool that triggers RCA or FMEA.
— High reporting rates indicate a strong safety culture, not an unsafe hospital.
— Trigger tools (e.g., IHI Global Trigger Tool): Retrospective chart review using triggers (e.g., naloxone administration, abrupt medication stop) to detect adverse events.
— Surveillance, not analysis.
— Run and control charts: Monitor a metric over time; distinguish common-cause from special-cause variation.
— Trigger: trend over time, rate per month, is the change sustained?
Board pearl: The single biggest Step 3 trap is confusing FMEA with RCA. Temporal cue is decisive: before harm = FMEA; after harm = RCA. If a stem mixes both (an event occurred and leadership wants to redesign a related but separate process), both tools apply — but the question usually targets one. Read the stem's final sentence carefully — it specifies which deliverable is requested.

— Standard work documentation — codify the redesigned process in policy and EHR order sets.
— Training at onboarding — new hires learn the redesigned process, not the legacy one.
— Audit and feedback — periodic compliance audits with frontline feedback loops.
— Re-FMEA every 18–36 months or when the process changes substantially (new EHR, new staffing model, new technology).
— Annual FMEA calendar prioritizing 2–4 high-risk processes per year.
— Trained facilitators (often through IHI Open School, ASQ certification, or VA NCPS courses).
— Linkage to strategic plan and board quality goals — without executive sponsorship, FMEAs die in committee.
— Integration with safety event reporting — frontline reports feed FMEA topic selection.
— Preoccupation with failure.
— Reluctance to simplify interpretations.
— Sensitivity to operations.
— Commitment to resilience.
— Deference to expertise (frontline > hierarchy).
— Hospital-Acquired Condition Reduction Program (HACRP) penalties incentivize FMEA on HACs (CLABSI, CAUTI, SSI, falls, pressure injuries).
— CMS payment penalties for readmissions drive FMEA on discharge transitions.
Step 3 management: When asked how a hospital should prevent recurrence of similar errors across departments, the long-term answer is embed prospective risk analysis (FMEA) into governance and link it to the strategic plan, not a one-off committee. Sustained safety improvement requires structural commitment, not heroic individual effort.

— Process measures — was the redesigned step followed? (e.g., "% of insulin orders using new smart-pump library").
— Outcome measures — did harm decrease? (e.g., "rate of hypoglycemia <40 mg/dL per 1000 patient-days").
— Balancing measures — did the fix cause new problems? (e.g., "time to first insulin dose," "nursing satisfaction," "alert override rate").
— Control charts to detect special-cause variation; sustained improvement requires shifts/trends per Western Electric or Nelson rules.
— Run charts for simpler trend visualization.
— Dashboards reviewed monthly at unit level, quarterly at executive level.
— Initial reassessment at 3 months post-implementation.
— Full re-FMEA at 6–12 months or sooner if incidents recur.
— Documentation of post-RPN vs pre-RPN with percent reduction.
— Frontline communication: explain why the process changed (link to patient cases when appropriate, de-identified).
— Address resistance with data, not directives.
— Recognize and celebrate teams whose FMEAs prevented harm.
— Step 3 examinees are expected to know that FMEA falls under the ACGME core competencies of Systems-Based Practice and Practice-Based Learning and Improvement.
— Residency programs increasingly require trainee participation in at least one QI/safety project.
Board pearl: A successful FMEA produces a measurable, sustained reduction in process failures, demonstrated on a control chart with documented post-intervention shift. Vignettes that describe a redesign followed by "no further events for 30 days" are inadequate evidence — that's noise, not signal. Demand statistical process control for the win.
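To make "signal, not noise" concrete, the sketch below computes limits for an individuals (XmR) chart and applies a run-of-eight shift rule. Several things here are assumptions: the monthly rates are invented, an XmR chart is used as a simplification (a rate per patient-days is often plotted on a u-chart instead), and the run length of 8 follows one common convention, while other rule sets use 9.

```python
from statistics import mean


def xmr_limits(baseline):
    """Return (lower, center, upper) limits for an individuals (XmR) chart."""
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)  # 2.66 is the standard XmR constant
    return center - spread, center, center + spread


def has_shift(points, center, run_length=8):
    """True if run_length consecutive points fall on one side of center."""
    run, side = 0, 0
    for x in points:
        current = (x > center) - (x < center)  # +1 above, -1 below, 0 on line
        if current != 0 and current == side:
            run += 1
        else:
            side = current
            run = 1 if current != 0 else 0
        if run >= run_length:
            return True
    return False


# Hypothetical monthly hypoglycemia rates per 1000 patient-days.
baseline = [5.1, 4.8, 5.4, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0, 5.1, 4.6, 5.3]
post = [4.2, 4.0, 4.4, 3.9, 4.1, 4.3, 4.0, 3.8]

lower, center, upper = xmr_limits(baseline)
print(f"limits: {lower:.2f} to {upper:.2f}, center {center:.2f}")
print("sustained shift after redesign:", has_shift(post, center))
print("shift after only three points:", has_shift(post[:3], center))
```

Three good months do not satisfy the run rule, which mirrors the point that a short event-free interval is not evidence of improvement.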

— FMEA must focus on system vulnerabilities, not individuals.
— Differentiate human error (console, do not punish), at-risk behavior (coach), and reckless behavior (disciplinary action) — the Just Culture algorithm.
— Punishing individuals for system failures suppresses reporting and destroys the data pipeline FMEA depends on.
— Patient Safety and Quality Improvement Act of 2005 (PSQIA): information shared with a federally listed Patient Safety Organization (PSO) is privileged and confidential — not discoverable in malpractice litigation.
— State peer review protections also shield QI deliberations in many jurisdictions, though scope varies.
— FMEA documents themselves are generally protected when conducted under peer review, but do not assume blanket immunity — consult legal counsel.
— If FMEA reveals a previously unrecognized risk in a routine procedure (e.g., a rare device failure mode), consent forms must be updated to disclose the risk. Failure to do so creates liability.
— When FMEA review uncovers that a past patient was harmed by an unrecognized failure, disclosure to that patient is ethically required even if no lawsuit is pending — transparency is the standard of care.
— Failure to communicate pending test results at discharge is a leading malpractice driver; FMEA of discharge processes mitigates this.
Step 3 management: When an FMEA identifies that a current patient is at risk from an unaddressed failure mode, the immediate ethical and legal duty is disclosure and mitigation now — protecting the patient supersedes preserving the QI process. PSQIA confidentiality protects deliberations, not concealment of active harm.

Board pearl: Memorize the action hierarchy. Exam writers love asking "which intervention is most likely to be effective?" and the answer is almost always the option highest on the hierarchy (typically a forcing function or CPOE hard stop).

— "A hospital is implementing a new IV infusion pump library. Which of the following is the most appropriate tool to identify potential errors before rollout?" → FMEA.
— Distractors: RCA (no event yet), PDSA (no iterative test described), control chart (no metric trend), Pareto (no ranking of existing causes).
— "After a patient received a tenfold heparin overdose, the safety committee convenes…" → RCA, not FMEA.
— "Before launching a new heparin protocol, the safety committee convenes…" → FMEA.
— "Which intervention is most likely to prevent recurrence of wrong-route administration?"
— Correct: oral syringes incompatible with IV ports (forcing function).
— Distractors: in-service education, reminder signs, double-check policy (all weaker).
— "A failure mode has S=10, O=2, D=8. RPN = 160. Should it be addressed?" → Yes, because Severity = 10, regardless of RPN.
— "The FMEA team is mapping the entire emergency department. What is the most appropriate next step?" → Narrow scope to a specific process (e.g., triage-to-room time, sepsis bundle initiation).
— "The team mapped the medication process from the policy manual. Next best step?" → Direct observation of frontline workflow (Gemba walk).
— "Six months after implementation, how should success be evaluated?" → Control chart of outcome and process measures with rescored RPN.
— "FMEA reveals a nurse made a calculation error. Appropriate response?" → Address system contributors (calculator availability, double-check process), not punish individual.
Key distinction: Always read the verb tense and timing in the stem first. "Will implement," "is planning," "anticipates" → FMEA. "Occurred," "received," "resulted in" → RCA. This single discriminator resolves the majority of Step 3 systems-based vignettes on quality tools.

Failure Mode and Effects Analysis (FMEA) is the prospective, multidisciplinary, team-based method that maps a high-risk process, scores each failure mode by Severity × Occurrence × Detection to generate a Risk Priority Number, and redesigns the process using the strongest available interventions — forcing functions first — to prevent patient harm before it occurs.
Board pearl: If you remember only one thing — FMEA prevents, RCA explains, PDSA tests, and forcing functions beat education every single time.

