Biostatistics & Population Health
Markov models for chronic disease decisions
— Disease has a chronic, progressive, or relapsing-remitting course (diabetes, HIV, CKD, HF, cancer surveillance, hepatitis C, atrial fibrillation anticoagulation).
— Outcomes depend on time spent in a state, not just a single event.
— A simple decision tree would become unwieldy because of repeated events or long horizons.
Board pearl: Suspect a Markov framework whenever a question describes a lifetime horizon, recurring events, and QALYs — decision trees alone cannot efficiently represent looping back into prior states.

— "A cost-effectiveness analysis used a Markov model with states well, post-MI, post-stroke, dead…" → expect questions on state definitions or transitions.
— "Annual probability of progression from CKD stage 3 to stage 4 is 0.08…" → expect a transition probability calculation or interpretation.
— "The ICER of drug B vs drug A is $80,000/QALY…" → expect willingness-to-pay threshold interpretation.
— "Results were most sensitive to the utility of the post-stroke state…" → expect sensitivity analysis reasoning.
— Time horizon (lifetime vs 10-year vs 5-year) — lifetime is standard for chronic disease.
— Cycle length (often 1 month for acute conditions, 1 year for chronic) — should be short enough that ≥2 transitions per cycle are unlikely.
— Perspective (healthcare sector vs societal) — societal includes productivity losses and patient time; healthcare sector does not.
— Discount rate — almost always 3% in US analyses per the Second Panel on Cost-Effectiveness in Health and Medicine.
— Cycle length too long relative to event rate.
— No absorbing state (death) in a lifetime analysis.
— Transition probabilities >1, or probabilities out of a state (including the probability of remaining in it) that don't sum to 1.
Key distinction: A rate (events per person-time, can exceed 1) differs from a probability (bounded 0–1 over a fixed interval). Step 3 may give a rate and require conversion: p = 1 − e^(−rt).
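The conversion can be checked numerically. The sketch below is a minimal illustration that assumes a constant event rate over the interval; the function names and the 5-year probability of 0.40 are hypothetical, chosen only to show why dividing a multi-year probability by the number of years is not equivalent.

```python
import math

def rate_to_prob(rate: float, t: float = 1.0) -> float:
    """Probability of at least one event over time t at a constant rate."""
    return 1.0 - math.exp(-rate * t)

def prob_to_rate(prob: float, t: float = 1.0) -> float:
    """Constant rate implied by a probability observed over time t."""
    return -math.log(1.0 - prob) / t

def rescale_prob(prob: float, t_from: float, t_to: float) -> float:
    """Convert a probability over t_from into one over t_to via the rate."""
    return rate_to_prob(prob_to_rate(prob, t_from), t_to)

# Rate of 0.15 events per person-year converted to a 1-year probability.
annual_p = rate_to_prob(0.15)

# Hypothetical 5-year probability of 0.40 converted to a 1-year cycle.
one_year_from_five = rescale_prob(0.40, t_from=5, t_to=1)

print(round(annual_p, 3), round(one_year_from_five, 3))
```

Here `annual_p` comes out near 0.139 rather than 0.15, and the 1-year probability implied by the 5-year figure is about 0.097, not 0.40 / 5 = 0.08.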

— Health states: Mutually exclusive, collectively exhaustive (every patient is in exactly one state each cycle). Example for atrial fibrillation: well on anticoagulation, post-stroke, post-major bleed, dead.
— Absorbing state: Once entered, never left. Death is always absorbing. Some models also treat "post-event" states as absorbing for simplicity.
— Transition probabilities: Each cycle, probability of moving from state i to state j. Rows of the transition matrix must sum to 1.0.
— Cycle length: Discrete time step. Must be short enough that at most one transition per cycle is a reasonable approximation (a modeling convention, distinct from the memoryless Markov assumption).
— Rewards: Each state has an associated cost (dollars/cycle) and utility (QALY weight/cycle, 0–1).
— Violation of the Markov (memoryless) assumption: Risk of recurrent MI depends on time since first MI → need tunnel states (sequential temporary states) or a semi-Markov model.
Board pearl: If a question says "risk depends on duration of disease" or "history of prior events," a standard Markov model is inadequate — you need tunnel states or microsimulation to carry memory.
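The components above can be put together in a few lines. The following is a minimal cohort-trace sketch using the atrial fibrillation states named above; every transition probability in the matrix is a hypothetical placeholder, and the helper names (`check_matrix`, `step`, `trace`) are illustrative only.

```python
STATES = ["well_on_anticoagulation", "post_stroke", "post_major_bleed", "dead"]

# Hypothetical annual transition probabilities (row = from, column = to).
P = [
    [0.90, 0.04, 0.03, 0.03],
    [0.00, 0.90, 0.02, 0.08],
    [0.00, 0.03, 0.90, 0.07],
    [0.00, 0.00, 0.00, 1.00],  # death is absorbing: it only transitions to itself
]

def check_matrix(matrix, tol=1e-9):
    """Every entry must lie in [0, 1] and every row must sum to 1."""
    for row in matrix:
        if any(p < 0.0 or p > 1.0 for p in row):
            raise ValueError("transition probability outside [0, 1]")
        if abs(sum(row) - 1.0) > tol:
            raise ValueError("row does not sum to 1")

def step(dist, matrix):
    """Advance the cohort distribution by one cycle."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def trace(start, matrix, cycles):
    """Return the cohort distribution at cycle 0, 1, ..., cycles."""
    check_matrix(matrix)
    out = [start]
    for _ in range(cycles):
        out.append(step(out[-1], matrix))
    return out

# Everyone starts in the first state; run 20 one-year cycles.
cohort = trace([1.0, 0.0, 0.0, 0.0], P, cycles=20)
for cycle in (0, 1, 20):
    print(cycle, [round(x, 3) for x in cohort[cycle]])
```

Because each row depends only on the current state, this sketch is memoryless by construction; carrying time since an event would require extra tunnel states in `STATES` and `P`.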

— RCTs and meta-analyses for treatment effects (e.g., relative risk of stroke on warfarin vs DOAC).
— Registry/cohort data for natural history (e.g., progression rate in untreated hep C).
— Life tables (CDC/SSA) for background mortality by age and sex, layered onto disease-specific mortality.
— Probability over time t: p = 1 − e^(−r·t), where r is the instantaneous rate.
— Annual rate from annual probability: r = −ln(1 − p).
— Critical when cycle length differs from the published data (e.g., 5-year probability converted to 1-year cycle).
— Anchored: 1.0 = perfect health, 0 = dead, negative values possible (states worse than death).
— Measured via standard gamble, time trade-off, or EQ-5D (most common in US CEAs).
— Sources: Tufts CEA Registry (catalog of published utility weights); HUI3 (a utility instrument).
— Example values: well = 0.85; post-stroke disabled = 0.40; post-MI stable = 0.80.
— Use direct medical costs (healthcare perspective) or add indirect/productivity costs (societal perspective).
— Inflation-adjust to a common year using the medical care CPI.
— Discount future costs and QALYs at 3%/year (US standard).
— Probabilities in each row sum to 1.
— Utilities bounded ≤1.
— Background mortality never set to 0.
Step 3 management: When asked which input most needs validation, prioritize transition probabilities derived from observational data and utility weights, because both carry the largest uncertainty and drive ICER variability.
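The 3%/year discounting listed above is simple to apply once per-cycle rewards are known. This is a minimal sketch assuming annual cycles and the convention that the first year (year 0) is undiscounted, which is one common choice rather than the only one; the ten-year stream at utility 0.85 reuses the example "well" value, and the function names are hypothetical.

```python
import math

def discount_factor(rate: float, year: int) -> float:
    """Present-value weight for a reward accrued in the given year."""
    return 1.0 / (1.0 + rate) ** year

def discounted_total(values, rate: float = 0.03) -> float:
    """values[t] is the reward accrued in year t; year 0 is undiscounted."""
    return sum(v * discount_factor(rate, t) for t, v in enumerate(values))

qalys = [0.85] * 10                 # ten years in a state with utility 0.85
undiscounted = sum(qalys)           # 8.5 QALYs
discounted = discounted_total(qalys)

# Years until a reward is worth half its face value at 3% per year.
halving_time = math.log(2) / math.log(1.03)

print(round(undiscounted, 2), round(discounted, 2), round(halving_time, 1))
```

With these inputs the discounted total is roughly 7.47 QALYs against 8.5 undiscounted, and the halving time is a little over 23 years.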

— Use: identifies value of information — where future research has highest payoff.
— Output: cost-effectiveness acceptability curve (CEAC) — probability the intervention is cost-effective at each willingness-to-pay threshold.
— Output: scatter on cost-effectiveness plane (Δcost vs ΔQALY).
— Face validity: Experts agree the structure is reasonable.
— Internal validity: Code does what it claims (debug).
— Cross-validity: Results agree with other independent models.
— Predictive validity: Model predictions match observed long-term outcomes (gold standard but rare).
— Calibration: Adjust unobservable inputs so model reproduces observed epidemiology.
Board pearl: A tornado diagram answers "which parameter matters most?"; a CEAC answers "how confident are we the intervention is cost-effective at a given threshold?" — Step 3 distractors often swap these.
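A CEAC is just a proportion computed over probabilistic sensitivity analysis draws. The sketch below assumes the draws are already available as (Δcost, ΔQALY) pairs; here they are simulated from hypothetical normal distributions purely for illustration, whereas a real analysis would obtain them by re-running the model with sampled inputs. The comparison uses incremental net monetary benefit (λ·ΔQALY − Δcost > 0), a swapped-in formulation that agrees with the ICER rule when the intervention costs more and gains more, and also handles draws that land in other quadrants.

```python
import random

def ceac(draws, thresholds):
    """Share of draws in which the intervention is cost-effective at each threshold."""
    curve = {}
    for wtp in thresholds:
        wins = sum(1 for d_cost, d_qaly in draws if wtp * d_qaly - d_cost > 0)
        curve[wtp] = wins / len(draws)
    return curve

# Hypothetical PSA output: 5,000 simulated (delta cost, delta QALY) pairs.
rng = random.Random(0)
draws = [(rng.gauss(20_000, 8_000), rng.gauss(0.25, 0.10)) for _ in range(5_000)]

curve = ceac(draws, [0, 50_000, 100_000, 150_000])
for wtp, prob in curve.items():
    print(wtp, round(prob, 2))
```

Plotting `curve` against the threshold gives the acceptability curve; the same `draws` plotted as points give the scatter on the cost-effectiveness plane.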

— ICER = (Cost_B − Cost_A) / (QALY_B − QALY_A)
— Units: dollars per QALY gained.
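— Adopt B if ICER < λ, where λ is the willingness-to-pay threshold (this rule applies when B is both costlier and more effective).
— US conventional thresholds: $50,000/QALY (historic), $100,000–$150,000/QALY (contemporary, per the Institute for Clinical and Economic Review and WHO-associated GDP-per-capita benchmarks).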
— NE (more cost, more QALYs): Trade-off — compare ICER to λ.
— SE (less cost, more QALYs): Dominant — always adopt.
— NW (more cost, fewer QALYs): Dominated — never adopt.
— SW (less cost, fewer QALYs): Trade-off in reverse — adopt only if savings exceed λ per QALY lost.
— Comparing ICERs to average cost-effectiveness instead of incremental.
— Ignoring dominated strategies in the ranking.
— Forgetting discounting changes the ICER.
Step 3 management: When a question lists multiple strategies with costs and QALYs, sort by increasing cost, eliminate dominated and extendedly dominated options, then compute ICERs sequentially against the next-cheapest non-dominated strategy.
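The multi-strategy procedure can be sketched as follows. This is an illustrative implementation under the stated steps (sort by cost, drop dominated options, drop extendedly dominated options, then compute sequential ICERs); the function name and the five strategies with their costs and QALYs are hypothetical.

```python
def efficient_frontier(strategies):
    """strategies: dict name -> (cost, qalys). Returns [(name, icer_vs_previous)]."""
    # Sort by increasing cost; on equal cost, the more effective option comes first.
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))

    # Strong dominance: drop anything no more effective than a cheaper option.
    frontier = []
    for name, (cost, qaly) in ordered:
        if frontier and qaly <= frontier[-1][2]:
            continue
        frontier.append((name, cost, qaly))

    # Extended dominance: ICERs must rise along the frontier; drop violators.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            prev, cur, nxt = frontier[i - 1], frontier[i], frontier[i + 1]
            icer_cur = (cur[1] - prev[1]) / (cur[2] - prev[2])
            icer_nxt = (nxt[1] - cur[1]) / (nxt[2] - cur[2])
            if icer_cur > icer_nxt:
                del frontier[i]
                changed = True
                break

    # Sequential ICERs against the next-cheapest remaining strategy.
    result = [(frontier[0][0], None)]
    for prev, cur in zip(frontier, frontier[1:]):
        result.append((cur[0], (cur[1] - prev[1]) / (cur[2] - prev[2])))
    return result

# Hypothetical strategies: (lifetime cost in USD, lifetime QALYs).
options = {
    "A": (10_000, 5.00),
    "B": (20_000, 5.10),
    "C": (25_000, 5.40),
    "D": (40_000, 5.30),
    "E": (60_000, 5.60),
}
ranking = efficient_frontier(options)
print(ranking)
```

In this example D is dominated (costlier and less effective than C), B is extendedly dominated (its ICER against A exceeds C's ICER against B), and the remaining ICERs are about $37,500/QALY for C versus A and $175,000/QALY for E versus C.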

— Tracks proportions of a hypothetical cohort across states.
— Fast, transparent, easily reproduced.
— Limitation: cannot carry individual memory (e.g., cumulative drug exposure).
— Simulates individuals one at a time through the state diagram.
— Each person can carry attributes/history (age, prior events, biomarker values).
— Needed when heterogeneity or memory matters (e.g., HCV with fibrosis stages depending on duration of infection).
— Subdivide a state into sequential temporary states (e.g., "year 1 post-MI," "year 2 post-MI") so transition probabilities can vary by time-in-state.
— Time-to-event based, not cycle-based.
— Use when events are highly time-varying or queues/resources matter (e.g., transplant waiting lists).
— Because transitions are assumed to occur at cycle boundaries but events accrue continuously, add a half-cycle adjustment to costs and QALYs (or use life-table style integration) — improves accuracy especially with longer cycles.
Key distinction: Cohort Markov = proportions, memoryless; microsimulation = individuals, memory possible. Choose microsimulation whenever patient history must influence future transitions.
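The half-cycle adjustment mentioned above amounts to averaging state membership at the start and end of each cycle (a trapezoidal sum). The sketch below assumes a two-state alive/dead cohort with a hypothetical 10% mortality per cycle and a utility of 1.0 while alive; the names are illustrative.

```python
def total_reward(trace, reward, half_cycle=True):
    """trace[t][s]: share of cohort in state s at the start of cycle t.
    reward[s]: reward per cycle spent in state s."""
    per_cycle = [sum(p * r for p, r in zip(dist, reward)) for dist in trace]
    if not half_cycle:
        # Uncorrected: credit a full cycle to everyone present at its start.
        return sum(per_cycle[:-1])
    # Corrected: average membership at the start and end of each cycle.
    return sum((a + b) / 2.0 for a, b in zip(per_cycle, per_cycle[1:]))

# Hypothetical cohort: 10% die each cycle, followed for three cycles.
alive = [1.0]
for _ in range(3):
    alive.append(alive[-1] * 0.9)
trace = [[a, 1.0 - a] for a in alive]
utilities = [1.0, 0.0]  # alive, dead

uncorrected = total_reward(trace, utilities, half_cycle=False)
corrected = total_reward(trace, utilities, half_cycle=True)
print(round(uncorrected, 4), round(corrected, 4))
```

The uncorrected total (about 2.71) overstates time alive because deaths are treated as occurring at the end of each cycle; the corrected total (about 2.57) assumes they occur, on average, halfway through.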

— Population, Intervention, Comparator, Outcomes, Timing, Setting/perspective.
— Mutually exclusive, collectively exhaustive, clinically meaningful, and consistent with available data.
— Avoid state explosion: combine clinically similar states when transition rates are similar.
— Short enough that ≤1 transition/cycle is realistic; common: 1 year for chronic, 1 month for cancer/HIV, 1 day for ICU.
— Use best available evidence; explicitly cite source for each probability.
— Layer age-/sex-specific background mortality every cycle.
— Cost (USD) and utility (QALY weight) per state per cycle; one-time event costs/disutilities for transitions (e.g., cost of acute MI hospitalization).
— Iterate the matrix across the time horizon, applying discounting.
— Apply half-cycle correction.
— Transition probabilities derived from one time interval applied to a different cycle length without rate conversion.
— Double-counting acute event costs (in both state and transition).
— Failing to discount.
— No background mortality.
CCS pearl: Treat model construction like a CCS case — order labs (gather inputs), choose interventions (define strategies), monitor (sensitivity analysis), and document (CHEERS reporting) before "discharging" the analysis to publication.

— Background all-cause mortality from SSA life tables must be applied each cycle.
— In elderly cohorts, competing death often shrinks the QALY benefit of preventive interventions (a 75-year-old gains fewer QALYs from a statin than a 55-year-old because they may die from something else first).
— This is why lung cancer screening with LDCT is cost-effective in ages 50–80 but ICERs worsen sharply beyond.
— Risks of MI, stroke, fracture, dementia all rise with age — use age-stratified probabilities, not a single value.
— Multiple comorbid states should not be combined naively in a way that pushes utilities below biologically plausible floors. Choose the combination rule (minimum, multiplicative, or additive) per published methodology; for utilities between 0 and 1, the additive rule penalizes most and the minimum rule least.
— In CKD models, GFR-defined stages (1–5, 5D for dialysis) are natural states; transitions reflect annual eGFR decline (~3–5 mL/min/year average untreated).
— Hepatic models (HCV, NAFLD) use METAVIR fibrosis stages F0–F4 → decompensated cirrhosis → HCC → liver transplant → death.
— Realistic models incorporate adherence-adjusted effectiveness (efficacy × adherence proportion), particularly important in elderly with pill burden.
Board pearl: In elderly cohorts, a preventive intervention's ICER worsens with age primarily because of shorter remaining life expectancy and competing mortality, not declining drug efficacy — a common Step 3 distractor swap.
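The competing-mortality effect can be illustrated with a toy calculation. In the sketch below the background mortality function is a hypothetical Gompertz-style curve, not a real life table, and the disease mortality (2% per year) and relative risk (0.7) are placeholders; the relative risk is deliberately the same at every age so that any difference in benefit comes from remaining life expectancy alone.

```python
import math

def background_mortality(age: int) -> float:
    """Hypothetical annual probability of death from other causes (not a life table)."""
    return min(1.0, 0.0005 * math.exp(0.085 * (age - 30)))

def life_years(start_age: int, disease_mortality: float, max_age: int = 100) -> float:
    """Expected life-years from start_age, with annual cycles and no half-cycle correction."""
    alive, total = 1.0, 0.0
    for age in range(start_age, max_age):
        total += alive  # credit one year to those alive at the start of the cycle
        alive *= (1.0 - background_mortality(age)) * (1.0 - disease_mortality)
    return total

def gain(start_age: int, base: float = 0.02, relative_risk: float = 0.7) -> float:
    """Life-years gained when treatment scales disease mortality by relative_risk."""
    return life_years(start_age, base * relative_risk) - life_years(start_age, base)

gain_55 = gain(55)
gain_75 = gain(75)
print(round(gain_55, 2), round(gain_75, 2))
```

With identical efficacy, the 55-year-old cohort gains more life-years than the 75-year-old cohort, so the same intervention cost is spread over a smaller benefit at older ages.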

— Long time horizons (often 80+ years) amplify the effect of the discount rate — a 3% rate halves value every ~23 years, making downstream benefits appear small.
— Pediatric utilities are harder to elicit; standard adult instruments (EQ-5D-3L) are inappropriate for young children — use EQ-5D-Y, HUI2/3, or PedsQL utility mappings.
— Often use short-horizon decision trees rather than Markov, because the decision window is months.
— When Markov is used (e.g., HIV antiretrovirals in pregnancy), states must include maternal health, fetal/infant transmission, and infant outcomes jointly.
— Standard CEA maximizes aggregate QALYs — can inadvertently disadvantage groups with lower baseline life expectancy (older adults, marginalized populations).
— DCEA partitions health outcomes by subgroup (race, income, geography) and uses equity weights to reflect societal preference for reducing disparities.
— Increasingly required by ICER and some payers.
Key distinction: Standard cost-effectiveness analysis is efficiency-focused (maximize total QALYs); distributional CEA is equity-focused (also weight who gains those QALYs). Step 3 ethics questions may pivot on this.

— Risk often depends on duration in a state (e.g., longer dialysis time → higher mortality). Fix: tunnel states or microsimulation.
— Adding attributes (age × sex × biomarker × prior events) multiplies states geometrically. Microsimulation handles this but at computational cost.
— Transition probabilities extrapolated beyond trial duration (10-year RCT → lifetime model) carry massive uncertainty. Address with scenario analyses and probabilistic uncertainty.
— Interventions with upfront costs and distant benefits (childhood vaccination, prevention) appear less cost-effective at higher discount rates. A common policy critique.
— Industry-sponsored models tend to favor sponsor product. Mitigation: independent replication (e.g., ICER's evidence reports), open code, CHEERS reporting.
— Healthcare-sector perspective omits caregiver time, productivity, education — may understate value of pediatric or mental health interventions.
— Standard Markov models are static (one person's outcomes don't affect another's). Wrong for infectious disease vaccination, where herd immunity matters — use dynamic transmission models (SIR/SEIR compartmental).
— A strict $100K/QALY rule can deny modestly beneficial therapies for rare or severe disease; rigid application is ethically contested.
Step 3 management: When a stem describes an infectious disease vaccination program, a static Markov model is insufficient — recognize the need for a dynamic transmission model capturing herd effects.
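Why a static model misses herd effects can be seen in a very small transmission sketch. The code below is a discrete-time SIR approximation with daily steps; the transmission and recovery parameters (giving a basic reproduction number of 2), the perfectly protective vaccine, and the function name are all assumptions made for illustration.

```python
def unvaccinated_attack_rate(coverage, beta=0.4, gamma=0.2, days=730, seed=1e-4):
    """Cumulative infection risk for an unvaccinated person (requires coverage < 1).

    Hypothetical parameters: beta = transmission rate/day, gamma = recovery rate/day.
    Vaccinated people are assumed fully protected.
    """
    s = 1.0 - coverage - seed   # susceptible share of the population
    i = seed                    # infectious share
    cumulative = 0.0
    for _ in range(days):
        new_infections = beta * s * i   # force of infection (beta * i) times susceptibles
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        cumulative += new_infections
    return cumulative / (1.0 - coverage)

risk_none = unvaccinated_attack_rate(0.0)
risk_some = unvaccinated_attack_rate(0.3)
risk_high = unvaccinated_attack_rate(0.6)
print(round(risk_none, 3), round(risk_some, 3), round(risk_high, 4))
```

A static Markov model would assign an unvaccinated person the same infection probability regardless of coverage; here that risk falls as coverage rises and nearly vanishes once the effective reproduction number drops below 1.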

— Individual heterogeneity dominates (e.g., genetic risk scores, prior event count).
— State definitions would require >20–30 states.
— Memory or time-in-state effects are pervasive.
— Resource constraints or queues matter (transplant lists, ICU bed availability).
— Event timing is critical and continuous.
— Modeling infectious disease where one person's infection status affects another's risk (vaccination policy, antimicrobial resistance, HIV PrEP at population scale).
— Behavior, social networks, and spatial interactions matter (obesity policy, smoking cessation, drug use).
— Chronic non-communicable disease, individual decisions, sufficiently small state space, available data align with cycle length.
— Health economists, decision analysts, biostatisticians for advanced builds.
— Patient and clinician input for state definitions and utility plausibility.
— Submit per CHEERS 2022 and consider posting code on public repositories (GitHub, OSF) for transparency.
— ICER (the Institute for Clinical and Economic Review), NICE, and CADTH provide independent re-analysis.
Board pearl: A useful mnemonic: "Memory → microsim; herd → dynamic; queue → DES; networks → agent-based; chronic stable → cohort Markov." Step 3 distractors mix these.

— Branches represent choices and chance events with one-time probabilities.
— Best for short-horizon, single-decision problems (acute appendicitis, single screening test).
— Limit: clumsy when events recur — exponential branching.
— State-transition over time; cohort proportions.
— Best for chronic disease with recurring events.
— Individuals tracked one at a time; can carry memory.
— Best when heterogeneity or history drives outcomes.
— Continuous time, event-driven, resource constraints possible.
— Best for operational problems with queues or scarce resources.
— Common in oncology; uses observed survival curves (PFS, OS) directly rather than transition probabilities.
— Critique: doesn't enforce internal consistency between transitions; can mis-extrapolate.
— Models infection spread through a population; force of infection depends on prevalence.
— Required for vaccine cost-effectiveness.
— Bottom-up; agents with behaviors and interactions.
— Best for social/behavioral dynamics.
Key distinction: A partitioned survival model uses observed Kaplan–Meier curves (popular in oncology submissions to payers) while a Markov model uses transition probabilities between explicit states. The former extrapolates curves; the latter enforces a mechanistic structure.
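The partitioned survival logic reduces to reading state membership off two curves. The sketch below assumes exponential PFS and OS curves with hypothetical rates, standing in for curves that a real analysis would fit to and extrapolate from observed data; the function name is illustrative.

```python
import math

def psm_occupancy(t, pfs_rate, os_rate):
    """State shares at time t from progression-free (PFS) and overall survival (OS)."""
    os_t = math.exp(-os_rate * t)
    pfs_t = min(math.exp(-pfs_rate * t), os_t)  # PFS can never exceed OS
    return {
        "progression_free": pfs_t,
        "progressed": os_t - pfs_t,   # alive but no longer progression-free
        "dead": 1.0 - os_t,
    }

# Hypothetical rates per year, evaluated at two years.
shares = psm_occupancy(t=2.0, pfs_rate=0.5, os_rate=0.2)
print({k: round(v, 3) for k, v in shares.items()})
```

No transition probabilities appear anywhere: the "progressed" share is simply the gap between the curves, which is why extrapolated curves that cross or diverge implausibly go unchecked unless a guard like the `min` above is added.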

— CEA: outcome in natural units (life-years gained, cases averted).
— CUA: outcome in QALYs (most common Markov output); allows comparison across diseases.
— Step 3 often uses "cost-effectiveness" loosely to mean CUA.
— Both costs and outcomes monetized (willingness-to-pay in dollars).
— Allows cross-sector comparison (health vs education) but ethically fraught when monetizing life.
— Estimates affordability over a short horizon (3–5 years) from a payer perspective.
— Complements but does not replace CEA; a cost-effective drug can still be unaffordable.
— Single-event metric; doesn't capture lifetime or multi-event outcomes.
— QALY: years × utility (0–1); maximize.
— DALY: years of life lost + years lived with disability; minimize. Used by WHO and Global Burden of Disease.
— Multi-criteria approaches incorporating clinical benefit, toxicity, cost; not strictly Markov but often informed by Markov outputs.
Step 3 management: When a payer asks "Can we afford this drug for our population next year?" the right tool is a budget impact analysis, not a Markov CEA — different question, different time horizon.

— A Markov model's ICER informs policy and guideline development, not individual prescribing — but guidelines (USPSTF, ACC/AHA, ADA) often incorporate cost-effectiveness implicitly.
— Examples: USPSTF lung cancer screening (LDCT) age expansion to 50–80 was informed by CISNET Markov/microsimulation modeling; statin primary prevention thresholds reflect lifetime ASCVD models.
— Translate state utilities into patient-relevant outcomes ("on average, this drug adds 0.3 years of healthy life over 20 years").
— Acknowledge uncertainty from PSA when counseling.
— Some US payers (e.g., ICER-informed plans, Medicaid in select states, the VA) use cost-effectiveness in formulary decisions.
— CMS is legally restricted from using QALYs in Medicare coverage (per the ACA; the Inflation Reduction Act separately bars reliance on QALYs in Medicare drug price negotiation). This is an important Step 3 health-systems fact.
— Models should be re-run when new RCTs, new prices, or new comparators emerge — particularly after generic entry, which often dramatically improves ICERs.
— When using model-informed guidance for an individual (e.g., choosing DOAC over warfarin), document shared decision-making, patient preferences, and bleeding/stroke risk scores.
Board pearl: US Medicare is statutorily restricted from using QALY-based cost-effectiveness thresholds for coverage decisions — a high-yield health-policy fact contrasting with NICE in the UK.

— New RCT data altering treatment effects.
— Price changes (generic entry, biosimilar launch).
— Updated background mortality (life tables are refreshed periodically).
— New comparators entering the market.
— Compare modeled long-term outcomes to registry data (e.g., did the model's predicted 10-year MACE rate match what we now observe?).
— Adjust unobservable inputs to match observed trends (calibration).
— Some agencies (NICE, CADTH) commission living systematic reviews and update Markov models iteratively as evidence emerges — particularly during COVID-19 and in oncology.
— Publish per CHEERS 2022 checklist; share code where possible.
— Disclose funding source and conflicts of interest.
— Predictive validity against new observational data.
— Concordance with independent models on the same question (cross-validity).
— Sensitivity of conclusions to structural assumptions (e.g., did adding an extra state change the decision?).
— Emphasize that ICERs are point estimates with uncertainty; PSA-based CEAC is more informative than a single ICER.
— Highlight value of information analysis — where additional research would most reduce decision uncertainty.
Step 3 management: When a generic version of a drug launches, the ICER usually drops sharply — Markov-based guidelines should be re-evaluated promptly because the cost-effectiveness landscape can flip.

— QALYs implicitly value life-years lived with disability less than years in perfect health, raising concerns under the Americans with Disabilities Act and disability-rights advocacy.
— US law (Section 1182(e) of the SSA, ACA provisions) prohibits Medicare from using QALYs to deny coverage; the Inflation Reduction Act drug-pricing negotiations explicitly cannot rely on QALYs for "negotiated maximum fair price" determinations.
— Standard CEA can systematically disadvantage older adults, disabled persons, and racial/ethnic minorities with shorter baseline life expectancy. Distributional CEA and equity weights attempt to correct this.
— Industry-sponsored Markov models frequently produce more favorable ICERs than independent ones. Mandatory disclosure and independent re-analysis (ICER, CADTH) are key safeguards.
— When a clinician uses guideline-based recommendations informed by Markov modeling, the patient is entitled to know the uncertainty and trade-offs (e.g., "this screening test prevents 1 death per 320 screened over 10 years but causes 17 false positives").
— Population-level cost-effectiveness conclusions can be misapplied at the individual level, particularly during transitions (hospital discharge, primary-to-specialty handoff). A drug that is "not cost-effective on average" may be highly appropriate for a high-risk individual — clinicians must avoid rote application.
— Per CHEERS 2022, full disclosure of methods, funding, and code is the field's transparency standard.
Board pearl: Federal law prohibits Medicare from using QALY-based thresholds to deny coverage — a uniquely American constraint that distinguishes US HTA from NICE (UK) and CADTH (Canada).

— HCV DAAs: Highly cost-effective despite high upfront cost due to avoided cirrhosis/HCC/transplant.
— PCSK9 inhibitors: ICERs initially >$300K/QALY → price renegotiation dropped to <$100K/QALY.
— Lung cancer LDCT screening: Cost-effective in ages 50–80 with ≥20 pack-year history.
— SGLT2 inhibitors in HF/CKD: Cost-effective across multiple Markov analyses.
Key distinction: Cost-effective ≠ cost-saving. Cost-effective = additional cost worth additional benefit. Cost-saving = cheaper AND better (SE quadrant) — rare; vaccines and smoking cessation often qualify.
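The distinction can be made concrete by classifying an incremental result by its position on the cost-effectiveness plane. This is a minimal sketch; the function name and labels are hypothetical, the default threshold of $100,000/QALY is one of the conventional values, and the two trade-off quadrants are resolved with net monetary benefit (λ·ΔQALY − Δcost), which is equivalent to comparing the ICER with λ.

```python
def classify(delta_cost: float, delta_qaly: float, wtp: float = 100_000) -> str:
    """Label a new strategy relative to its comparator."""
    if delta_cost == 0 and delta_qaly == 0:
        return "equivalent"
    if delta_cost <= 0 and delta_qaly >= 0:
        return "dominant"             # no costlier and no worse: adopt
    if delta_cost >= 0 and delta_qaly <= 0:
        return "dominated"            # no cheaper and no better: reject
    # Trade-off quadrants: positive net monetary benefit means worth adopting.
    nmb = wtp * delta_qaly - delta_cost
    return "cost-effective" if nmb > 0 else "not cost-effective"

print(classify(-5_000, 0.2))     # cheaper and better
print(classify(20_000, 0.5))     # ICER of $40,000/QALY
print(classify(80_000, 0.5))     # ICER of $160,000/QALY
print(classify(-30_000, -0.1))   # saves $300,000 per QALY lost
```

Only the first case is cost-saving; the second is cost-effective while still costing more, which is the far more common situation.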

— Stem provides costs and QALYs for two strategies. Compute (ΔCost/ΔQALY) and compare to $100K/QALY threshold.
— Trap: Forgetting to use incremental values; using one strategy's total cost-effectiveness ratio.
— Three or four strategies; identify which are dominated/extendedly dominated before ranking.
— Trap: Computing ICERs on dominated strategies.
— "Annual incidence rate is 0.15 per person-year; what is the 1-year probability?"
— Answer: 1 − e^(−0.15) ≈ 0.139.
— Trap: Reporting 0.15 as the probability directly.
— Tornado diagram shows utility of post-stroke state as widest bar → most influential parameter, warranting better measurement.
— Vignette describes vaccination policy → dynamic transmission, not static Markov.
— Vignette describes ICU bed allocation → DES.
— Vignette describes recurrent MI in chronic CAD → Markov cohort.
— Stem includes productivity losses and caregiver time → societal perspective.
— Stem includes only direct medical costs → healthcare-sector perspective.
— Stem describes Medicare formulary debate → recall statutory QALY prohibition.
— Stem describes disability advocacy critique → recognize QALY equity concern and DCEA as a response.
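Step 3 management: When asked which input deserves further research investment, the answer is usually whichever parameter has the widest tornado bar. Value-of-information analysis formalizes this: expected value of perfect information (EVPI) for the decision as a whole, and expected value of partial perfect information (EVPPI) for an individual parameter.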
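A tornado diagram is built from repeated one-way sensitivity analyses. The sketch below uses a deliberately simplified toy ICER function and hypothetical base-case values and ranges (none taken from a real analysis) to show how the bars are computed and ordered; the utilities 0.85 and 0.40 reuse the example values for the well and post-stroke states.

```python
def toy_icer(p):
    """Toy model: drug cost offset by averted stroke costs, QALYs from averted strokes."""
    delta_cost = p["drug_cost"] - p["strokes_averted"] * p["stroke_cost"]
    delta_qaly = p["strokes_averted"] * (p["u_well"] - p["u_post_stroke"]) * p["years"]
    return delta_cost / delta_qaly

def tornado(model, base, ranges):
    """Vary one parameter at a time; return (name, low, high) sorted by bar width."""
    bars = []
    for name, (low, high) in ranges.items():
        results = [model({**base, name: value}) for value in (low, high)]
        bars.append((name, min(results), max(results)))
    return sorted(bars, key=lambda bar: bar[2] - bar[1], reverse=True)

# Hypothetical base case and plausible ranges.
base = {
    "drug_cost": 30_000, "strokes_averted": 0.05, "stroke_cost": 50_000,
    "u_well": 0.85, "u_post_stroke": 0.40, "years": 10,
}
ranges = {
    "u_post_stroke": (0.20, 0.60),
    "drug_cost": (25_000, 35_000),
    "stroke_cost": (30_000, 70_000),
}

bars = tornado(toy_icer, base, ranges)
for name, low, high in bars:
    print(f"{name:15s} {low:10.0f} {high:10.0f}")
```

With these inputs the post-stroke utility produces the widest bar, so it would be the first candidate for better measurement.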

Markov models simulate cohorts (or individuals) moving through discrete health states over time to estimate lifetime costs and QALYs for chronic-disease decisions, with ICERs interpreted against a willingness-to-pay threshold (commonly $100,000–$150,000/QALY in the US), refined by sensitivity analyses, and bounded by structural assumptions and equity considerations.
Board pearl: If you remember nothing else — rates ≠ probabilities, dominated strategies are eliminated before computing ICERs, and US Medicare cannot use QALYs — these three facts answer the majority of Step 3 questions on Markov models.

