Biostatistics & Population Health
Decision trees in clinical decision-making
— Decision nodes (squares): clinician-controlled choices (treat vs. observe vs. test)
— Chance nodes (circles): probabilistic events (disease present/absent, treatment success/failure)
— Terminal nodes (triangles): outcomes assigned utilities (QALYs, costs, mortality)
— Branches carry probabilities that must sum to 1.0 at each chance node
— Discrete, short-horizon problem (testing strategy, single treatment choice)
— Outcomes are time-independent or occur within a bounded period
— Probabilities and utilities can be estimated from literature
— Contrast with Markov models, used when patients transition repeatedly between health states over long horizons (e.g., chronic kidney disease progression)
— Cost-effectiveness analyses (screening colonoscopy vs. FIT)
— Threshold analysis (test-treat thresholds based on disease probability)
— Shared decision-making aids (anticoagulation in atrial fibrillation)
— Quality improvement and resource-allocation discussions

— Stem gives pretest probability and asks whether to test, treat empirically, or do neither
— Below testing threshold → no test, no treat
— Between thresholds → test, then act on result
— Above treatment threshold → treat without testing (e.g., classic angina in a 70-year-old smoker — treat, don't stress-test first)
— Two strategies compared with costs and QALYs
— Calculate ICER = (Cost_A − Cost_B) / (QALY_A − QALY_B)
— Threshold typically cited as $50,000–$150,000/QALY in the US
— Patient with borderline indication (e.g., CHA₂DS₂-VASc 1, prostate cancer screening at age 70)
— Tree clarifies tradeoffs between bleeding vs. stroke, overdiagnosis vs. mortality benefit
— Baseline disease probability (prevalence in this population)
— Test characteristics (sensitivity, specificity, LR+/LR−)
— Treatment efficacy and harm rates
— Patient values/preferences (a 75-year-old who prioritizes quality over longevity)
— Perfect health = 1.0, death = 0.0
— Stroke with disability ≈ 0.3–0.5; major bleed ≈ 0.6–0.7
— These weights drive the "right" branch
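
The ICER formula above can be sketched in a few lines. This is a minimal illustration, not a validated tool: the function name `icer`, the label strings, and the tuple return shape are assumptions made for this example, and the inputs are the two-strategy figures from the practice item in these notes.

```python
from typing import Optional, Tuple


def icer(cost_new: float, qaly_new: float,
         cost_ref: float, qaly_ref: float) -> Tuple[str, Optional[float]]:
    """Compare a new strategy against a reference strategy.

    Returns (label, ratio). The ratio is dollars per QALY gained and is
    None whenever a ratio would be meaningless (dominance or equal QALYs).
    """
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_qaly == 0:
        return ("equal effect", None)       # compare costs directly
    if d_cost <= 0 and d_qaly > 0:
        return ("dominant", None)           # no costlier, more effective
    if d_cost >= 0 and d_qaly < 0:
        return ("dominated", None)          # no cheaper, less effective
    return ("trade-off", d_cost / d_qaly)   # a genuine cost/benefit trade-off


# Strategy B ($150K, 6 QALY) vs. Strategy A ($100K, 5 QALY)
label, ratio = icer(150_000, 6, 100_000, 5)
print(label, ratio)  # trade-off 50000.0
```

The ratio is only interpreted against a willingness-to-pay threshold in the trade-off case; dominant and dominated strategies are settled before any ratio is computed.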

— LR+ = sensitivity / (1 − specificity)
— LR− = (1 − sensitivity) / specificity
— Posttest odds = pretest odds × LR
— LR >10 or <0.1 → large, often conclusive shift
— LR 5–10 or 0.1–0.2 → moderate shift
— LR 2–5 or 0.2–0.5 → small shift
— LR ≈ 1 → useless test/finding
— S3 gallop for heart failure: LR+ ≈ 11
— Calf tenderness alone for DVT: LR+ ≈ 1.1 (minimally useful — explains why Wells score + D-dimer dominate the DVT decision tree)
— Murphy sign for cholecystitis: LR+ ≈ 2.8
— Start at decision node "test or not"
— Chance node splits into disease+/disease−
— Each splits into test+/test− using sensitivity/specificity
— Terminal utilities assigned by true/false positive/negative status
— Testing threshold = probability below which testing causes more harm than benefit
— Treatment threshold = probability above which empiric treatment beats testing-then-treating
— Both depend on test characteristics AND on the harm-to-benefit ratio of treatment
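
The likelihood-ratio arithmetic above can be checked with a short sketch. The function names are assumptions for this example; the inputs (pretest probability 40%, sensitivity 80%, specificity 75%) are the stress-test figures used in the practice item in these notes.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def posttest_probability(pretest_prob: float, lr: float) -> float:
    """Convert probability to odds, multiply by the LR, convert back."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * lr
    return posttest_odds / (1 + posttest_odds)


lr_pos, lr_neg = likelihood_ratios(0.80, 0.75)
after_positive = posttest_probability(0.40, lr_pos)
after_negative = posttest_probability(0.40, lr_neg)
print(round(lr_pos, 2), round(lr_neg, 2))                  # 3.2 0.27
print(round(after_positive, 2), round(after_negative, 2))  # 0.68 0.15
```

A positive result moves a 40% pretest probability to about 68%, and a negative result to about 15%. Either result can plausibly change management, which is what the between-thresholds zone means.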

— Step 1: Frame the clinical question with mutually exclusive, collectively exhaustive options
— Step 2: Identify the time horizon (acute event vs. lifetime)
— Step 3: Draw decision nodes (squares) for clinician choices
— Step 4: Add chance nodes (circles) for probabilistic events; branches must sum to 1.0
— Step 5: Assign terminal utilities (QALYs, life-years, costs, or composite)
— Step 6: Fold back — calculate expected value at each chance node, choose max at each decision node
— Probabilities: systematic reviews, meta-analyses, registry data, local epidemiology
— Utilities: standard gamble, time trade-off, or validated catalogs (e.g., CDC HRQOL, EQ-5D)
— Costs: Medicare reimbursement schedules, Red Book pricing, micro-costing studies
— Pretest probability ≈ 80% with classic symptoms
— Empiric treatment branch: high cure rate, modest C. diff/resistance risk
— Test-first branch: 1–2 day delay, possible undertreatment
— Folding back typically favors empiric treatment — exactly why guidelines recommend it
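
Steps 4 to 6 (chance nodes, terminal utilities, folding back) can be sketched as a small recursive function. This is an illustrative sketch only: the dictionary layout, the helper names, and every utility value below are hypothetical. Only the 80% pretest probability comes from the example above.

```python
def terminal(utility):
    return {"type": "terminal", "utility": utility}


def chance(*branches):
    """branches: (probability, child_node) pairs that must sum to 1.0."""
    return {"type": "chance", "branches": list(branches)}


def fold_back(node) -> float:
    """Expected value at chance nodes, maximum at decision nodes."""
    kind = node["type"]
    if kind == "terminal":
        return node["utility"]
    if kind == "chance":
        total = sum(p for p, _ in node["branches"])
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"branch probabilities sum to {total}, not 1.0")
        return sum(p * fold_back(child) for p, child in node["branches"])
    if kind == "decision":
        return max(fold_back(child) for child in node["options"].values())
    raise ValueError(f"unknown node type: {kind!r}")


def best_option(decision_node):
    """Return (name of preferred option, expected value of every option)."""
    values = {name: fold_back(child)
              for name, child in decision_node["options"].items()}
    return max(values, key=values.get), values


# Hypothetical utilities: disease present with probability 0.80.
tree = {
    "type": "decision",
    "options": {
        # treated promptly if diseased; unnecessary antibiotic if not
        "empiric treatment": chance((0.80, terminal(0.98)), (0.20, terminal(0.95))),
        # treatment delayed if diseased; antibiotic avoided if not
        "test first": chance((0.80, terminal(0.96)), (0.20, terminal(0.99))),
    },
}

choice, values = best_option(tree)
print(choice, {k: round(v, 3) for k, v in values.items()})
```

With these made-up utilities the empiric branch folds back to 0.974 against 0.966 for testing first. Different utilities could reverse the result, which is why the inputs need to be sourced rather than guessed.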

— One-way: vary a single parameter across its plausible range, observe whether the preferred strategy changes
— Two-way: vary two parameters simultaneously, often visualized on a 2D plot with a "threshold line"
— Probabilistic (Monte Carlo): assign distributions to all parameters, run thousands of simulations, report % of iterations favoring each strategy
— Tornado diagram: ranks parameters by impact on the result — widest bars = most influential variables
— Example: if anticoagulation is preferred when annual stroke risk >1.7%, that 1.7% is the threshold and it should drive your CHA₂DS₂-VASc cutoff
— Face validity (does it match clinical intuition?)
— Internal validity (math correct, probabilities sum to 1?)
— External validity (predictions match observed real-world outcomes?)
— Cross-validation against published trials or registries
— Ignoring discounting of future costs and benefits (standard 3%/year in US analyses)
— Double-counting (e.g., counting a stroke both as event and as utility decrement without aligning time frames)
— Using disease-specific rather than all-cause mortality when comparing screening strategies — biases toward the screening arm
— Conflating efficacy (RCT) with effectiveness (real world)
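
A one-way sensitivity analysis can be sketched as a sweep over a single parameter that reports where the preferred strategy flips. The model below is a deliberately crude one-year toy: the relative risk reduction, bleed risk, and utility losses are hypothetical placeholders, not literature estimates, and the function names are assumptions for this example.

```python
def expected_utilities(stroke_risk, rrr=0.64, stroke_loss=0.5,
                       bleed_risk=0.02, bleed_loss=0.3):
    """One-year expected utility of each strategy (all defaults hypothetical)."""
    no_rx = 1.0 - stroke_risk * stroke_loss
    rx = 1.0 - stroke_risk * (1 - rrr) * stroke_loss - bleed_risk * bleed_loss
    return {"anticoagulate": rx, "no anticoagulation": no_rx}


def one_way(values):
    """Vary stroke risk alone; record the preferred strategy at each value."""
    results = []
    for v in values:
        ev = expected_utilities(v)
        results.append((v, max(ev, key=ev.get)))
    return results


def flip_point(results):
    """Return the pair of adjacent parameter values where the choice changes."""
    for (v0, s0), (v1, s1) in zip(results, results[1:]):
        if s0 != s1:
            return v0, v1
    return None  # recommendation is robust across the whole range


grid = [i / 1000 for i in range(0, 51)]  # annual stroke risk 0% to 5%
print(flip_point(one_way(grid)))         # (0.018, 0.019)
```

With these placeholder inputs the preferred strategy changes between 1.8% and 1.9% annual stroke risk. A tornado diagram repeats the same sweep for each parameter in turn and ranks them by how far the result moves.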

— Zone 1 (below testing threshold): disease so unlikely that testing harms exceed benefits → do not test, do not treat
— Zone 2 (between thresholds): test, then treat based on result
— Zone 3 (above treatment threshold): disease so likely that empiric treatment dominates → treat without testing
— Ptt = [(1 − Sp) × H] / [(1 − Sp) × H + Sn × B]
— where H = harm of treating disease-free patients, B = benefit of treating diseased patients
— Prx = [Sp × H] / [Sp × H + (1 − Sn) × B] (simplified form; the full form also accounts for the risk of the test itself)
— Healthy 25-year-old with atypical chest pain → below testing threshold for stress test
— 55-year-old diabetic with exertional pressure → between thresholds → stress test
— 70-year-old with classic crescendo angina → above treatment threshold → cath/treat, skip stress test
— Safer treatment → lowers treatment threshold (easier to justify empiric Rx)
— More dangerous test → raises testing threshold
— Higher test sensitivity → lowers testing threshold (won't miss disease)
— Higher test specificity → lowers testing threshold (fewer false positives treated) and raises treatment threshold (testing first spares more disease-free patients from treatment)

— Decision node: warfarin vs. DOAC vs. no anticoagulation vs. left atrial appendage occlusion
— Chance nodes: annual stroke risk (driven by CHA₂DS₂-VASc), major bleed risk (HAS-BLED), intracranial hemorrhage rate
— Utilities: stroke with disability ≈ 0.3, GI bleed ≈ 0.7, well on anticoagulant ≈ 0.99
— Folding back: DOACs typically dominate warfarin (similar efficacy, ~50% less ICH, no INR monitoring) — except in mechanical valves and moderate-severe mitral stenosis
— ASCVD 10-year risk <5% → below threshold, no statin
— 5–7.5% → shared decision-making zone (use risk enhancers)
— ≥7.5% → above threshold, statin recommended
— Pooled Cohort Equations feed the chance node
— Tree branches on cardiovascular vs. renal vs. cost priorities
— SGLT2 inhibitor preferred when HFrEF, CKD, or established ASCVD present (high-utility branch)
— GLP-1 RA preferred for ASCVD without HF and for weight benefit
— Metformin remains first-line absent these comorbidities
— NNT = 1 / absolute risk reduction
— NNT 20 with a benign drug is favorable; NNT 100 with a toxic drug rarely is
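
The NNT relationship is simple enough to verify in code. This is a minimal sketch: the function name and the event rates are hypothetical, and rounding up to a whole patient is a common convention assumed here rather than something stated in these notes.

```python
import math


def nnt(control_event_rate: float, treated_event_rate: float) -> int:
    """Number needed to treat = 1 / absolute risk reduction, rounded up."""
    arr = control_event_rate - treated_event_rate
    if arr <= 0:
        raise ValueError("no absolute risk reduction: NNT is undefined")
    # round() first so floating-point noise cannot push ceil() up by one
    return math.ceil(round(1 / arr, 9))


print(nnt(0.10, 0.05))  # 20
print(nnt(0.02, 0.01))  # 100
```

Both examples halve the relative risk, yet the NNT differs fivefold because the baseline risk differs. That is why absolute rather than relative risk reduction feeds the tree.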

— Decision node: surgery vs. medical management vs. minimally invasive alternative
— Chance nodes: perioperative mortality, complication rates, long-term benefit, recurrence
— Terminal utilities must include perioperative QALY decrement (typically 1–3 months of reduced health utility)
— Carotid revascularization for asymptomatic stenosis: benefit only if perioperative stroke/death rate <3% at the operating center
— Above that, medical therapy (statin, antiplatelet, BP control) dominates
— Threshold is center- and surgeon-specific, embedding quality data into the tree
— SYNTAX score (anatomic complexity of coronary disease) is a key input to the tree
— High SYNTAX (>33) + diabetes → CABG branch dominates (FREEDOM trial)
— Low SYNTAX → PCI competitive
— Colonoscopy: highest sensitivity, highest procedure risk (perforation 1/1000), 10-year interval
— FIT: lower sensitivity per round but annual repetition compensates
— Cost-effectiveness analyses generally show the recommended screening strategies within acceptable ICER bands
— Clinical Frailty Scale ≥5 dramatically raises perioperative chance-node probabilities
— Often flips an otherwise-favorable surgery branch into the medical-management branch
— Step 3 increasingly tests functional status over chronologic age

— Shorter remaining life expectancy compresses the time horizon — long-latency benefits (e.g., cancer screening) lose value
— Time to benefit (lag time) must be shorter than life expectancy for screening or preventive therapy to make sense
— Examples: colon cancer screening time to benefit ≈ 7–10 years; mammography ≈ 10 years; statins for primary prevention ≈ 2–5 years
— A 78-year-old with 6-year life expectancy gains little from new screening colonoscopy but may still benefit from statin therapy
— Death from other causes becomes a major terminal node
— Disease-specific interventions yield diminishing marginal benefit
— Aggregate comorbidity indices (Charlson, ePrognosis) recalibrate baseline probabilities
— DOAC dosing thresholds (apixaban 2.5 mg BID if 2 of: age ≥80, weight ≤60 kg, serum creatinine ≥1.5 mg/dL)
— Avoid dabigatran if CrCl <30; avoid rivaroxaban/edoxaban if CrCl <15
— Contrast nephropathy risk reshapes the imaging branch (favor MRI or non-contrast CT)
— Avoid acetaminophen >2 g/day in cirrhosis
— Statin selection: pravastatin and rosuvastatin preferred (less hepatic metabolism)
— DOACs contraindicated in Child-Pugh C
— Each added medication raises adverse drug event probability by roughly 10%
— Beers and STOPP/START criteria operationalize this in elderly populations
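
The comparison between time to benefit and life expectancy described above can be sketched as a simple range check. The function name and the three labels are assumptions for this example; the year ranges are the approximate figures quoted in these notes, and a real decision would also weigh patient preferences and competing risks.

```python
# Approximate time-to-benefit ranges in years, as quoted in these notes
TIME_TO_BENEFIT = {
    "colon cancer screening": (7, 10),
    "mammography": (10, 10),
    "statin (primary prevention)": (2, 5),
}


def benefit_outlook(life_expectancy_years: float, intervention: str) -> str:
    """Compare remaining life expectancy with the time-to-benefit range."""
    low, high = TIME_TO_BENEFIT[intervention]
    if life_expectancy_years >= high:
        return "likely to benefit"
    if life_expectancy_years > low:
        return "uncertain: individualize"
    return "unlikely to benefit"


# The 78-year-old with a 6-year life expectancy from the example above
for name in TIME_TO_BENEFIT:
    print(name, "->", benefit_outlook(6, name))
```

With a 6-year life expectancy the check returns "unlikely to benefit" for both screening tests and "likely to benefit" for the statin, matching the example in the bullets.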

— Each chance node may split into maternal AND fetal outcomes
— Utilities for fetal outcomes are ethically and methodologically fraught (often life-years gained, not QALYs)
— Example: anticoagulation in pregnant patient with mechanical valve — warfarin (better maternal valve outcomes, fetal embryopathy 5–10% in first trimester) vs. LMWH (safer for fetus, higher maternal thrombosis if poorly monitored)
— ACE inhibitors, ARBs, warfarin, isotretinoin, valproate, methotrexate carry high-probability fetal-harm branches
— Always substitute before pregnancy when planned, or immediately on diagnosis when unplanned
— Prior unprovoked VTE → antepartum + postpartum LMWH
— Prior provoked VTE → postpartum only
— Tree pivots on recurrence probability vs. bleeding/cost
— Longer time horizon dramatically amplifies QALY gains from preventive interventions (vaccines, lead screening)
— Radiation exposure utility decrement is higher (lifetime cancer risk) — favors ultrasound and MRI branches
— Weight-based dosing introduces error nodes — EHR decision support reduces this branch's probability
— Decision trees built on registry data may underrepresent racial/ethnic minorities, distorting probability estimates
— Race-based equations (eGFR pre-2021, ASCVD Pooled Cohort) are being recalibrated to avoid embedded bias
— Step 3 increasingly tests recognition that structural determinants modify both baseline risk and treatment access

— Omitted branches: missing a relevant outcome (e.g., neglecting contrast-induced nephropathy in a CT-vs-MRI tree)
— Non-exhaustive nodes: probabilities not summing to 1.0
— Overlapping branches: double-counting events that span multiple terminal nodes
— Inappropriate independence assumptions: treating correlated events (stroke and MI, both atherosclerotic) as independent distorts joint probabilities, typically understating how often the events occur together
— Using single trial point estimates without confidence intervals
— Extrapolating efficacy from a trial population to a very different real-world population
— Stale data — guidelines and effect sizes change (e.g., aspirin for primary prevention reversed direction in 2019)
— Patients and clinicians weigh outcomes differently — clinicians often underestimate quality-of-life impact of disability
— Standard gamble vs. time trade-off yield systematically different utilities for the same state
— Anchoring on initial probability estimates
— Availability heuristic inflates recently encountered diagnoses
— Base-rate neglect — ignoring prevalence when interpreting a positive test
— Omission bias: tolerating harms from inaction more readily than equivalent harms from action
— Overtesting cascade: false positives → confirmatory tests → procedural complications
— Therapeutic momentum: empiric treatment that becomes hard to stop
— Financial toxicity: cost as an unmeasured harm utility

— Long time horizon (chronic disease over years/decades)
— Recurring events (multiple strokes, repeated hospitalizations)
— Health states that patients move between (CKD stages, NYHA class)
— Example: hepatitis C treatment with DAAs requires modeling fibrosis progression over 20+ years
— Resource constraints matter (OR availability, ICU beds, organ transplant queues)
— Patient interactions affect outcomes (infectious disease transmission)
— Individual-level heterogeneity is critical
— Outcomes can't be reduced to a single utility measure (equity, ethics, patient experience)
— Stakeholder values diverge — guideline panels, formulary committees
— EHR-integrated tools (e.g., MDCalc, UpToDate Pathways) provide validated, updated trees
— Reduces calculation errors and incorporates current evidence
— The "right" branch by expected utility conflicts with patient values, family wishes, or institutional policy
— Goals of care unclear at end of life
— Resource scarcity forces explicit rationing decisions
— Patient explicitly rejects the model-preferred option after informed discussion → honor autonomy
— Rapidly changing clinical status invalidates input probabilities → reassess
— New evidence (recent practice-changing trial) supersedes the tree's data

— Cyclic health states with transition probabilities per cycle
— Used for chronic disease (HIV, CKD, dementia)
— Assumes the Markov property: the future depends only on the current state, not on history (an assumption often violated in practice and worked around with "tunnel states")
— Individual-level Markov modeling
— Tracks patient history, allowing memory
— Computationally intensive but more realistic
— Outcomes in natural units (life-years, cases prevented)
— ICER expressed as $/life-year
— Outcomes in QALYs
— Most common in modern HTA (NICE, Institute for Clinical and Economic Review, CADTH)
— Both costs and outcomes monetized
— Controversial because it requires assigning dollar value to life/health
— Short-term affordability for a payer
— Complements but doesn't replace CEA
— Quantifies expected gain from collecting additional data (e.g., funding a new trial)
— Expected value of perfect information (EVPI) sets an upper bound
— Borrowed from finance — values flexibility to defer or modify decisions as information accrues
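
The cyclic-state idea behind a Markov model can be sketched as a cohort simulation. This is an illustrative sketch under loud assumptions: the three states, every transition probability, and the utilities are hypothetical. QALYs are credited at the start of each cycle with no half-cycle correction, and the 3% discount rate is the US convention mentioned earlier in these notes.

```python
def run_markov(transitions, utilities, start, cycles, discount_rate=0.03):
    """Cohort simulation: returns (discounted QALYs, final state distribution)."""
    states = list(transitions)
    for s in states:
        row_total = sum(transitions[s].values())
        if abs(row_total - 1.0) > 1e-9:
            raise ValueError(f"transitions out of {s!r} sum to {row_total}")
    dist = dict(start)
    total_qalys = 0.0
    for cycle in range(cycles):
        # utility-weighted share of the cohort alive in each state this cycle
        cycle_qalys = sum(dist[s] * utilities[s] for s in states)
        total_qalys += cycle_qalys / (1 + discount_rate) ** cycle
        # move the cohort: the next distribution depends only on the current one
        dist = {t: sum(dist[s] * transitions[s].get(t, 0.0) for s in states)
                for t in states}
    return total_qalys, dist


# Hypothetical one-year cycles
transitions = {
    "well": {"well": 0.90, "sick": 0.08, "dead": 0.02},
    "sick": {"sick": 0.85, "dead": 0.15},
    "dead": {"dead": 1.0},  # absorbing state
}
utilities = {"well": 1.0, "sick": 0.6, "dead": 0.0}
start = {"well": 1.0, "sick": 0.0, "dead": 0.0}

qalys, final = run_markov(transitions, utilities, start, cycles=20)
print(round(qalys, 2), {s: round(p, 3) for s, p in final.items()})
```

Because the next distribution is computed from the current one alone, the sketch has no memory of how long anyone has been sick. Tunnel states or individual-level microsimulation are the usual ways to add that history.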

— Synthesize decision-analytic and trial evidence into actionable recommendations
— GRADE methodology grades both quality of evidence and strength of recommendation
— Step 3 expects familiarity with USPSTF (A/B/C/D/I), ACC/AHA (Class I/IIa/IIb/III), and IDSA recommendation schemes
— Convert tree outputs into patient-facing visuals (Option Grids, pictographs)
— Particularly important for preference-sensitive decisions: PSA screening, lumpectomy vs. mastectomy, AF anticoagulation
— Improve knowledge and decisional satisfaction; effect on outcomes mixed
— Centor criteria, Wells score, PERC, CURB-65, HEART score
— Compressed decision trees designed for rapid bedside use
— Each carries a hidden threshold approach beneath it
— Principlism (autonomy, beneficence, non-maleficence, justice)
— When the expected-utility-maximizing branch conflicts with autonomy, autonomy wins (in adults with capacity)
— Patient story and values that resist quantification
— Particularly relevant in end-of-life, chronic pain, mental health decisions
— PDSA cycles, root cause analysis, FMEA (failure modes and effects analysis)
— FMEA is sometimes called the "decision tree of patient safety" — proactively maps how a process can fail
— Highly individualized decisions with unique patient values
— Insufficient or poor-quality data for inputs
— Emergencies where time precludes formal analysis (use protocols instead)

— Disease probabilities evolve (post-MI, post-stroke — risks recalibrate)
— New comorbidities shift utility weights
— Patient preferences mature, especially around end-of-life
— Plan annual or semiannual review of long-term therapies
— High-intensity statin = above treatment threshold for all with established disease
— Add ezetimibe if LDL-C ≥70 mg/dL on maximally tolerated statin; consider PCSK9 inhibitor if LDL-C still ≥70 mg/dL in very-high-risk patients
— Decision tree increasingly favors aggressive LDL lowering (FOURIER, ODYSSEY trials)
— Provoked, transient risk factor → 3 months
— Unprovoked → indefinite if bleeding risk acceptable
— Cancer-associated → indefinite while cancer active, DOAC preferred (except GI/GU malignancy)
— HERDOO2 rule identifies women with unprovoked VTE whose recurrence risk is low enough (score ≤1) to stop anticoagulation rather than continue indefinitely
— Aspirin + P2Y12 inhibitor (DAPT typically 12 months)
— High-intensity statin
— Beta-blocker (especially if reduced EF or recent MI)
— ACE inhibitor/ARB if EF <40%, HTN, DM, or CKD
— Aldosterone antagonist if EF <40% with HF or DM
— A1c goals individualized (general 7%, relaxed to 8% in frail/elderly/limited life expectancy)
— SGLT2 inhibitor or GLP-1 RA regardless of A1c in ASCVD/CKD/HF

— Post-MI: cardiology in 2–6 weeks, primary care in 7–14 days
— New anticoagulation: 1–2 weeks for adherence and bleeding check, then 3-month intervals
— New antihypertensive: 4 weeks for response, then 3–6 months once controlled
— Diabetes: A1c q3 months if uncontrolled, q6 months if stable
— Chronic kidney disease: depends on stage and albuminuria — KDIGO heat-map drives frequency
— Each lab/imaging follow-up is a decision: continue, escalate, de-escalate
— Example: warfarin INR — out of range → dose adjustment branch; in range → continue
— Teach-back method confirms understanding (a quality measure)
— Address health literacy (5th–8th grade reading level for written materials)
— Disclose probabilities in natural frequencies ("3 out of 100 patients") rather than percentages alone — improves comprehension
— Visual aids (pictographs, icon arrays) outperform verbal disclosure
— Cardiac rehab post-MI: Class I recommendation, ~25% mortality reduction, dramatically underutilized
— Pulmonary rehab in COPD with GOLD B+ disease
— Stroke rehab — inpatient vs. SNF vs. outpatient determined by functional status
— Medication reconciliation at every transition
— Post-discharge phone call within 48 hours reduces readmission
— Follow-up appointment within 7 days for high-risk diagnoses (HF, COPD exacerbation)

— Disclosure must include alternatives, including no treatment
— Numerical risks should be communicated in absolute terms with confidence intervals, not just relative risks
— Patient with capacity may refuse the expected-utility-maximizing option — this is autonomy, not non-adherence
— Four components: understanding, appreciation, reasoning, expressing a choice
— Capacity is decision-specific — a patient may have capacity for some choices but not others
— Disagreement with clinician does not imply incapacity
— Cost-effectiveness thresholds raise equity concerns — strategies cost-effective in average populations may not serve marginalized groups
— Step 3 tests recognition that denying care based purely on cost without transparent process is ethically problematic
— Decision-tree authors with industry funding may inflate efficacy or omit harms
— Disclosure required in any guideline or analysis
— Suspected child/elder abuse, certain infectious diseases, gunshot wounds, impaired drivers (state-dependent), Tarasoff-style duty to warn
— These bypass standard confidentiality — a hard-coded branch
— Every transition of care (admission, transfer, discharge) is a high-risk node
— Studies show ~50% of patients have at least one medication discrepancy at discharge without formal reconciliation
— Pharmacist-led reconciliation reduces adverse drug events by ~30%
— This is a concrete Step 3 patient safety priority rooted in transition-of-care risk
— Ethical duty to disclose harm-causing errors transparently
— Apology laws in most states protect compassionate disclosure from use as evidence of liability


— "A 65-year-old man with chest pain has 40% pretest probability of CAD. Stress test sensitivity 80%, specificity 75%. Next step?"
— Recognize the between-thresholds zone → test, don't treat empirically, don't dismiss
— "Disease prevalence 10%, test sensitivity 90%, specificity 80%. PPV of a positive test?"
— Build 2×2 table with 1,000 patients; PPV ≈ 33% — illustrates base-rate neglect trap
— Two treatments, each with probability/outcome pairs
— Calculate Σ(p × outcome) for each; pick higher EV — beware "highest possible outcome" distractor
— Strategy A: $100K, 5 QALY; Strategy B: $150K, 6 QALY
— ICER = $50K/QALY → cost-effective at typical US threshold
— Strategy that is both cheaper AND more effective → dominant, always preferred
— Strategy dominated by a mixture → extended dominance, also eliminated
— "Recommendation unchanged across plausible range of all inputs" → robust, adopt
— "Recommendation flips with small change in stroke risk" → fragile, individualize
— Patient with capacity refuses expected-value-best option after full disclosure
— Correct answer: honor refusal, document, offer alternatives
— Discharge scenario with missing follow-up or unreconciled medications
— Correct answer: medication reconciliation, scheduled follow-up, clear return precautions
— Elderly patient with limited life expectancy considering screening
— Correct answer: weigh time to benefit vs. life expectancy; often defer
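
The 2×2 approach to the PPV item above can be sketched with natural frequencies. The function names are assumptions for this example; the inputs (prevalence 10%, sensitivity 90%, specificity 80%, cohort of 1,000) come from the item itself.

```python
def two_by_two(prevalence, sensitivity, specificity, n=1000):
    """Fill a 2x2 table for a notional cohort of n patients."""
    diseased = prevalence * n
    healthy = n - diseased
    tp = sensitivity * diseased   # diseased, test positive
    fn = diseased - tp            # diseased, test negative
    tn = specificity * healthy    # healthy, test negative
    fp = healthy - tn             # healthy, test positive
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def ppv(table):
    """Of all positive tests, the fraction that are true positives."""
    return table["TP"] / (table["TP"] + table["FP"])


def npv(table):
    """Of all negative tests, the fraction that are true negatives."""
    return table["TN"] / (table["TN"] + table["FN"])


table = two_by_two(prevalence=0.10, sensitivity=0.90, specificity=0.80)
print({k: round(v) for k, v in table.items()})   # {'TP': 90, 'FP': 180, 'FN': 10, 'TN': 720}
print(round(ppv(table), 3), round(npv(table), 3))  # 0.333 0.986
```

Out of 270 positive results only 90 are true positives, so the PPV is about 33% despite a 90% sensitive test. Rerunning with a higher prevalence shows how strongly the base rate drives the answer.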

High-yield recap bullets:

