Biostatistics & Population Health
Population attributable risk and fraction
— Formula: PAR = Incidence(total population) − Incidence(unexposed)
— Units: cases per person-time (e.g., cases per 100,000 person-years)
— Answers: "How many cases per year in the whole population are due to this exposure?"
— PAF = PAR / Incidence(total population)
— Alternative formula using prevalence of exposure (Pe) and relative risk (RR): PAF = Pe(RR−1) / [1 + Pe(RR−1)]
— Answers: "What fraction of disease in the population would disappear if we eliminated the exposure?"
— Public health prioritization questions ("Which intervention would prevent the most cases?")
— Tobacco, hypertension, obesity, alcohol, vaccine-preventable disease vignettes
— Quality improvement / value-based care stems asking which risk factor to target first in a panel
— Comparing individual-level risk (relative risk, attributable risk) with population-level impact

— A public health officer or medical director asks which modifiable risk factor to target in a defined population to maximally reduce disease incidence
— A table provides prevalence of exposure and relative risk for several risk factors; you must compute or rank PAFs
— A cohort study reports incidence in exposed and unexposed groups; you are asked the proportion of cases in the population attributable to the exposure
— A vignette contrasts an individual patient's smoking cessation benefit (use ARR/NNT) versus the community's benefit (use PAF)
— Incidence in exposed (Ie) and unexposed (Iu) → enables attributable risk (AR = Ie − Iu) and attributable risk percent in the exposed (ARP = (Ie−Iu)/Ie)
— Incidence in the total population (It) and unexposed (Iu) → enables PAR = It − Iu
— Prevalence of exposure (Pe) plus RR → enables PAF via Levin's formula
— Attributable risk (AR) = Ie − Iu → excess risk in exposed individuals
— Attributable risk percent (ARP) = (Ie − Iu)/Ie → fraction of disease in the exposed due to exposure
— Population attributable risk (PAR) = It − Iu → excess risk in the whole population
— Population attributable fraction (PAF) = PAR/It → fraction of disease in the whole population due to exposure
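Levin's formula follows from the definitions listed above. A short derivation, assuming only that total-population incidence is the exposure-weighted average of the two group incidences (symbols as in the text: I_t, I_e, I_u, P_e, RR):

```latex
\begin{align*}
I_t &= P_e I_e + (1 - P_e) I_u = I_u\,[\,1 + P_e(RR - 1)\,]
      \quad (\text{since } I_e = RR \cdot I_u) \\
\mathrm{PAR} &= I_t - I_u = I_u\,P_e(RR - 1) \\
\mathrm{PAF} &= \frac{\mathrm{PAR}}{I_t} = \frac{P_e(RR - 1)}{1 + P_e(RR - 1)}
\end{align*}
```

The unexposed incidence I_u cancels in the last step, which is why PAF can be computed from Pe and RR alone while PAR still needs an absolute incidence.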

• Since PAR/PAF is an epidemiologic construct, the "physical exam" equivalent is inspecting the 2×2 table and the population structure before computing anything.
• Step-by-step inspection of a cohort 2×2 table:
| | Disease + | Disease − | Total |
| Exposed | a | b | a+b |
| Unexposed | c | d | c+d |
— Incidence in exposed: Ie = a/(a+b)
— Incidence in unexposed: Iu = c/(c+d)
— Total incidence: It = (a+c)/(a+b+c+d)
— Relative risk: RR = Ie/Iu
— Prevalence of exposure: Pe = (a+b)/(a+b+c+d)
• Hemodynamic analogy: Just as you check blood pressure, HR, and perfusion before treating shock, in epidemiology you check:
— Is the design a cohort (RR available) or case-control (only OR available, use OR as RR approximation only if disease is rare, <10%)?
— Is the exposure prevalent enough to drive a meaningful PAF?
— Is the RR statistically significant (CI excludes 1)?
• Key distinction: In case-control studies, you cannot calculate true incidence, so PAR in absolute terms is not directly obtainable; PAF can be estimated using OR ≈ RR (rare disease assumption) plus prevalence of exposure among controls as a proxy for population exposure prevalence.
• Board pearl: A high RR with Pe near zero yields a tiny PAF. Example: a workplace carcinogen with RR = 20 but Pe = 0.001 produces PAF ≈ 1.9%, whereas hypertension with RR = 2 and Pe = 0.45 produces PAF ≈ 31%. Always multiply prevalence by (RR−1) mentally before ranking.
• Step 3 management: Before answering, restate the question: "Are they asking about risk in the exposed or risk in the whole population?" This single check prevents the most common wrong-answer trap (choosing ARP when PAF is asked, or vice versa).
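The 2×2 inspection steps can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the function name `cohort_measures` and the cell counts are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def cohort_measures(a, b, c, d):
    """Exposure-impact measures from a cohort 2x2 table.

    a, b = exposed with / without disease
    c, d = unexposed with / without disease
    """
    n = a + b + c + d
    ie = a / (a + b)      # incidence in exposed
    iu = c / (c + d)      # incidence in unexposed
    it = (a + c) / n      # incidence in total population
    return {
        "Ie": ie, "Iu": iu, "It": it,
        "RR": ie / iu,
        "Pe": (a + b) / n,
        "AR": ie - iu,            # excess risk in the exposed
        "ARP": (ie - iu) / ie,    # fraction of exposed cases due to exposure
        "PAR": it - iu,           # excess risk in the whole population
        "PAF": (it - iu) / it,    # fraction of all cases due to exposure
    }

# Hypothetical counts (not from any study): Ie = 0.30, Iu = 0.10, Pe = 0.25
m = cohort_measures(a=30, b=70, c=30, d=270)

# Cross-check the direct PAF against Levin's formula
levin = m["Pe"] * (m["RR"] - 1) / (1 + m["Pe"] * (m["RR"] - 1))
print({k: round(v, 4) for k, v in m.items()})
print("Levin PAF:", round(levin, 4))
```

With these counts, ARP (about 67%) is twice the PAF (about 33%), which is the exposed-versus-population distinction the last bullet warns about.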

— Attributable risk (AR/risk difference): AR = Ie − Iu
— Attributable risk percent (ARP): ARP = (Ie − Iu)/Ie = (RR − 1)/RR
— Population attributable risk (PAR): PAR = It − Iu
— Population attributable fraction (PAF), direct: PAF = (It − Iu)/It
— PAF, Levin's formula: PAF = Pe(RR − 1) / [1 + Pe(RR − 1)]
— PAF, alternative: PAF = Pd × (RR − 1)/RR, where Pd = proportion of cases exposed
— Smoking and lung cancer: Ie = 180/100,000, Iu = 10/100,000, It (population with 25% smokers) = 52.5/100,000
— PAR = 52.5 − 10 = 42.5 per 100,000/year
— PAF = 42.5/52.5 = 81% of lung cancer cases attributable to smoking
— Hypertension and stroke: Pe = 0.40, RR = 2.5
— PAF = 0.40(1.5) / [1 + 0.40(1.5)] = 0.60/1.60 = 37.5%
— Of 1000 MI cases, 700 are smokers; RR for MI from smoking = 3
— PAF = (700/1000) × (3−1)/3 = 0.70 × 0.667 = 46.7%
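The three routes to PAF (direct, Levin, and the proportion-of-cases-exposed form) should agree when fed consistent inputs. The sketch below checks this on the worked examples above; the function names are illustrative, and the smoking example's Pd is derived here from Pe, Ie, and It rather than taken from the text.

```python
def paf_direct(it, iu):
    """PAF from total and unexposed incidence."""
    return (it - iu) / it

def paf_levin(pe, rr):
    """PAF from exposure prevalence and relative risk (Levin)."""
    excess = pe * (rr - 1)
    return excess / (1 + excess)

def paf_from_cases(pd, rr):
    """PAF from the proportion of cases exposed (Pd) and relative risk."""
    return pd * (rr - 1) / rr

# Smoking and lung cancer (rates per 100,000 per year)
ie, iu, pe = 180.0, 10.0, 0.25
it = pe * ie + (1 - pe) * iu              # 52.5
rr = ie / iu                              # 18
pd = pe * ie / it                         # share of all cases who are smokers
smoking_direct = paf_direct(it, iu)
smoking_levin = paf_levin(pe, rr)
smoking_cases = paf_from_cases(pd, rr)

# Hypertension and stroke: Pe = 0.40, RR = 2.5
htn = paf_levin(0.40, 2.5)

# MI: 700 of 1000 cases are smokers, RR = 3
mi = paf_from_cases(700 / 1000, 3)

print(f"smoking: {smoking_direct:.3f} {smoking_levin:.3f} {smoking_cases:.3f}")
print(f"hypertension-stroke: {htn:.3f}")
print(f"smoking-MI: {mi:.3f}")
```

All three smoking values come out near 0.81, matching the 81% above. Disagreement between routes on real data usually signals inconsistent inputs, such as a crude Pe paired with an adjusted RR.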

— Because a single case can be attributable to several causes (multicausality), the sum of PAFs across risk factors often exceeds 1.0
— Example: smoking PAF for cardiovascular death = 20%, hypertension = 35%, dyslipidemia = 30%, diabetes = 15% → sum = 100%, but joint elimination would not eliminate 100% of deaths (assuming independent exposures, 1 − 0.80 × 0.65 × 0.70 × 0.85 ≈ 69%)
— Use combined (joint) PAF formula: PAF_joint = 1 − Π(1 − PAFi) when exposures are independent
— Use adjusted RR (from multivariable regression — Cox, logistic, Poisson) in Levin's formula instead of crude RR
— Failing to adjust → overestimates PAF if confounders inflate the crude RR
— Miettinen's formula: PAF = Pd(RR_adj − 1)/RR_adj uses proportion of cases exposed and the adjusted RR, and is preferred when confounding is present
— Reported as point estimate with 95% CI (e.g., PAF = 35%, 95% CI 28–42%)
— Wide CI → imprecise estimate, often due to small exposed group or borderline RR
— If the CI for the underlying RR crosses 1, the PAF CI will cross 0 → exposure may not contribute meaningfully
— PAF assumes complete elimination of a harmful exposure
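— Preventable fraction (PF) = (1 − Pe)(1 − RR) / [1 − Pe(1 − RR)], where Pe is the current prevalence of the protective factor, is used for protective exposures (RR < 1), e.g., vaccination, statins; it estimates the fraction of current disease that would be prevented if everyone received the protective factor (the fraction already prevented at current coverage is Pe(1 − RR))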
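The joint-PAF formula for independent exposures is easy to check numerically. A minimal sketch using the four cardiovascular PAFs from the example above; `joint_paf` is an illustrative name, and independence of the exposures is an assumption that real risk factors often violate.

```python
from math import prod

def joint_paf(pafs):
    """Joint PAF assuming independent exposures: 1 - prod(1 - PAF_i)."""
    return 1 - prod(1 - p for p in pafs)

pafs = {
    "smoking": 0.20,
    "hypertension": 0.35,
    "dyslipidemia": 0.30,
    "diabetes": 0.15,
}

naive_sum = sum(pafs.values())
joint = joint_paf(pafs.values())
print(f"naive sum = {naive_sum:.2f}, joint = {joint:.3f}")
# prints: naive sum = 1.00, joint = 0.691
```

The individual PAFs sum to 100%, yet the joint estimate is about 69%, because a death attributable to two exposures is counted once rather than twice.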

— Step 1: List candidate modifiable exposures with their local prevalence and RR for the outcome
— Step 2: Compute PAF for each (Levin's formula)
— Step 3: Multiply PAF × baseline disease incidence × population size → expected absolute cases preventable
— Step 4: Weight by intervention efficacy (rarely 100%), feasibility, and cost
— Step 5: Choose the intervention with the largest preventable case burden per dollar
— Hypertension: Pe ~45%, RR for stroke ~2.5 → PAF ~40%
— Atrial fibrillation: Pe ~2%, RR for stroke ~5 → PAF ~7%
— Even though AF confers a stronger individual risk, hypertension control prevents more strokes population-wide
— A preventive measure that benefits the population substantially often offers little benefit to each participating individual (e.g., universal sodium reduction)
— Conversely, high-risk strategies (treat only people with BP >160) miss the large reservoir of moderate-risk individuals who collectively generate most events
— Population strategy maximizes PAF reduction; high-risk strategy maximizes individual ARR
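Steps 1 through 3 of the prioritization workflow can be sketched as follows. Pe and RR are taken from the hypertension and atrial fibrillation example above; the stroke incidence and population size are hypothetical placeholders, and Steps 4 and 5 (efficacy, feasibility, cost) are deliberately left out.

```python
def paf_levin(pe, rr):
    """PAF from exposure prevalence and relative risk (Levin)."""
    excess = pe * (rr - 1)
    return excess / (1 + excess)

# Step 1: candidate exposures as (Pe, RR for stroke), from the text
exposures = {
    "hypertension": (0.45, 2.5),
    "atrial fibrillation": (0.02, 5.0),
}

# Hypothetical inputs for Step 3 (not from the text)
stroke_incidence = 200 / 100_000   # strokes per person-year
population = 1_000_000

# Step 2: compute PAF for each exposure and rank from largest to smallest
ranked = sorted(
    ((name, paf_levin(pe, rr)) for name, (pe, rr) in exposures.items()),
    key=lambda item: item[1],
    reverse=True,
)

# Step 3: translate PAF into expected attributable cases per year
for name, paf in ranked:
    cases = paf * stroke_incidence * population
    print(f"{name}: PAF = {paf:.1%}, attributable strokes/year = {cases:.0f}")
```

Hypertension ranks first despite its lower RR, consistent with the roughly 40% versus 7% comparison above.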

— Dyslipidemia: PAF ~49% → statin therapy is the single highest-yield pharmacologic lever
— Smoking: PAF ~36% → varenicline, bupropion, nicotine replacement
— Hypertension: PAF ~18% → ACEi/ARB, thiazide, CCB per JNC 8/ACC/AHA
— Diabetes: PAF ~10% → metformin, GLP-1 RA, SGLT2 inhibitor
— Abdominal obesity: PAF ~20% → GLP-1 RA, lifestyle, bariatric referral
— Combined modifiable PAF: ~90% of first MI
— Hypertension: ~48% → antihypertensives are the #1 population-level stroke preventive
— Physical inactivity: ~36%
— Dyslipidemia: ~27%
— Diet, smoking, cardiac causes, alcohol, stress, diabetes round out >90%

— Tobacco taxation and smoke-free laws: cut smoking prevalence by 10–20% → large drop in lung cancer, COPD, CVD PAFs
— Universal childhood vaccination: drives PAF for measles, polio, HPV-related cancers, Hib meningitis toward zero
— Folic acid fortification: reduced neural tube defect incidence by ~30–50%
— Trans-fat bans: measurable decline in CHD events
— Lead abatement: dramatic reduction in pediatric neurodevelopmental disease PAF
— Seatbelt and airbag mandates: large PAF reduction in motor-vehicle mortality
— Screening reduces disease burden when the screenable risk factor or precursor has high prevalence and effective intervention exists
— Examples: colonoscopy (CRC PAF reducible by ~50–60% with population screening), mammography, cervical cytology/HPV co-testing, AAA ultrasound in male smokers 65–75
— HPV vaccine: PF for cervical cancer approaches 90% with high uptake
— Pneumococcal vaccine: substantial reduction in invasive pneumococcal disease PAF in elderly
— Influenza vaccine: modest PF (~40–60%) but huge absolute case prevention given enormous Pe of exposure
— A CABG benefits the individual via large ARR for the multivessel-disease patient
— A statewide hypertension control program benefits the population via large PAF reduction across millions of moderate-risk individuals

— As populations age, prevalence of exposures (HTN, DM, AF, polypharmacy) rises → PAF for downstream outcomes like stroke, dementia, falls increases
— Competing mortality from other causes can paradoxically reduce PAF for any single exposure (people die of something else before the studied outcome occurs) — this is competing risk bias
— Hypertension for stroke and heart failure (Pe >60% over age 65)
— Atrial fibrillation for stroke (Pe ~10% over age 80, RR ~5 → PAF ~30% of strokes in the very old)
— Polypharmacy for falls, delirium, hospitalization
— Frailty and sarcopenia for postoperative complications
— Untreated hearing loss for incident dementia (Lancet Commission PAF ~8%)
— CKD inflates RR for cardiovascular events and bleeding; Pe of CKD (~15% US adults) gives meaningful PAF for cardiovascular mortality
— Hepatic impairment increases medication-related adverse event PAF — relevant for QI programs that audit hepatotoxic drug prescriptions in cirrhosis panels
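— Less education (early life), hearing loss, hypertension, obesity, smoking, depression, physical inactivity, social isolation, diabetes, excessive alcohol, traumatic brain injury, air pollution (the 12 factors in the 2020 Lancet Commission report); untreated visual impairment was added in the 2024 update, alongside high LDL cholesterol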
— Step 3 may ask: "Which modifiable factor has the largest PAF for dementia?" → hearing loss (most recent estimates) or hypertension depending on midlife window

— Hypertensive disorders of pregnancy — large PAF for maternal mortality and preterm birth; Pe ~10% of pregnancies
— Smoking during pregnancy — high PAF for low birth weight and SIDS
— Inadequate prenatal care — high PAF for preventable maternal-fetal complications, particularly in under-resourced populations
— Untreated maternal syphilis — near-100% PAF for congenital syphilis cases
— Folate deficiency — PAF for NTDs cut dramatically by fortification
— Unsafe sleep environments — large PAF for SIDS; "ABC" counseling (Alone, Back, Crib) is the highest-yield intervention
— Unvaccinated status — drives PAF for measles, pertussis outbreaks in communities with declining coverage
— Secondhand smoke exposure — substantial PAF for otitis media, asthma exacerbations
— Lead exposure — PAF for cognitive deficits, behavioral disorders
— Childhood obesity — rising PAF for adult cardiometabolic disease
— PAFs for many outcomes (preterm birth, maternal mortality, hypertensive complications) are disproportionately driven by structural racism, food insecurity, housing instability — these social exposures often have higher PAF than biologic risk factors
— Step 3 increasingly tests recognition of social determinants of health as modifiable, measurable exposures

— Mistaking association for causation: PAF assumes the exposure–outcome relationship is causal. If confounding, reverse causation, or selection bias inflates the RR, PAF is overestimated
— Ignoring the rare-disease assumption when using OR from case-control studies as RR — overestimates PAF for common outcomes
— Summing PAFs across multiple exposures and concluding total preventable burden — sums often exceed 100% because of multicausality; use joint PAF formula
— Assuming 100% intervention efficacy: real interventions reduce exposure imperfectly; the achievable preventable fraction is PAF × intervention coverage × intervention efficacy
— Generalizing PAF across populations: PAF depends on Pe, which varies geographically and temporally — a US PAF may not apply to a low-income country, or to 2025 versus 1990
— Wide confidence intervals when exposed group is small
— Residual confounding from unmeasured variables (socioeconomic status, genetics)
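— Misclassification of exposure: nondifferential misclassification biases RR toward the null → underestimates PAF; differential misclassification (e.g., recall bias in case-control studies, where cases over-report exposure) typically biases RR away from the null → overestimates PAF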
— Effect modification: PAF may differ across strata (e.g., smoking PAF for MI is higher in young adults than elderly)
— Over-aggressive targeting of a high-PAF risk factor can divert resources from outcomes with lower PAF but high severity (rare cancers, suicide)
— Risk-factor-focused metrics may stigmatize patients (e.g., obesity, addiction) — ethical concern in QI programs
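The rare-disease pitfall can be made concrete by plugging an OR into Levin's formula in place of the RR. In the sketch below, the incidences and exposure prevalence are hypothetical, chosen so that the true RR is 2 in both scenarios.

```python
def risk_ratio(ie, iu):
    return ie / iu

def odds_ratio(ie, iu):
    return (ie / (1 - ie)) / (iu / (1 - iu))

def paf_levin(pe, rr):
    excess = pe * (rr - 1)
    return excess / (1 + excess)

pe = 0.30  # hypothetical exposure prevalence
# Hypothetical (Ie, Iu) pairs, both with true RR = 2
scenarios = {"rare outcome": (0.02, 0.01), "common outcome": (0.40, 0.20)}

results = {}
for label, (ie, iu) in scenarios.items():
    rr = risk_ratio(ie, iu)
    or_ = odds_ratio(ie, iu)
    results[label] = (rr, or_, paf_levin(pe, rr), paf_levin(pe, or_))
    print(
        f"{label}: RR = {rr:.2f}, OR = {or_:.2f}, "
        f"PAF(RR) = {paf_levin(pe, rr):.1%}, PAF(OR) = {paf_levin(pe, or_):.1%}"
    )
```

For the rare outcome the OR stays close to 2 and the two PAFs nearly coincide. For the common outcome the OR rises to about 2.67 and the OR-based PAF overshoots by roughly ten percentage points.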

— High PAF (>20%) for a serious outcome with effective, scalable intervention → strong case for public health program
— Rising PAF over time (e.g., obesity-related cancers) → signal for new policy
— High PAF concentrated in vulnerable subgroup → equity-focused intervention
— Low PAF but high severity (e.g., rare lethal exposures) → targeted regulation rather than population campaign
— Quantify PAF with adjusted RR and best-available Pe
— Estimate intervention impact fraction (IIF) = PAF × coverage × efficacy
— Compare cost-effectiveness across candidate interventions (cost per QALY, cost per case prevented)
— Engage stakeholders, equity review, and implementation science
— Just as a hypotensive patient escalates from PO → IV → vasopressors → ICU consult, a public health priority escalates from clinical counseling → office-based protocol → community program → policy/legislation
— Tobacco trajectory: physician counseling (low reach) → pharmacotherapy (moderate) → quitlines (broad) → taxation and smoke-free laws (population-level, largest PAF impact)
— Health economists for cost-effectiveness analysis
— Implementation scientists for uptake and fidelity
— Community partners for cultural tailoring
— Legislators for policy levers (taxation, mandates, fortification)
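The intervention impact fraction defined above is a simple product, but writing it out shows how quickly a headline PAF shrinks. A minimal sketch: the PAF is the hypertension and stroke value from Levin's formula (Pe = 0.40, RR = 2.5), while the coverage and efficacy figures are hypothetical.

```python
def impact_fraction(paf, coverage, efficacy):
    """Simplified impact fraction as defined in the text: PAF x coverage x efficacy."""
    return paf * coverage * efficacy

paf = 0.375       # hypertension-stroke PAF (Pe = 0.40, RR = 2.5)
coverage = 0.60   # hypothetical: share of exposed people the program reaches
efficacy = 0.50   # hypothetical: share of excess risk removed in those reached

iif = impact_fraction(paf, coverage, efficacy)
print(f"PAF = {paf:.1%}, realized impact = {iif:.2%}")
```

Under these assumed inputs, a 37.5% PAF translates into about 11% of strokes actually prevented, which is why coverage and efficacy belong in any comparison of candidate programs.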

— Compares risk in exposed vs unexposed
— Measures strength of association, not population burden
— Used in cohort studies, RCTs, and other prospective designs
— Used in case-control studies
— Approximates RR when disease is rare (<10%)
— Cannot directly calculate PAR; PAF can be estimated using OR with caveats
— Time-to-event analog of RR, from Cox proportional hazards
— Used for survival analyses (cancer, transplant)
— Same population impact translation applies
— Ie − Iu (when exposure is harmful) or Iu − Ie (when intervention is protective)
— Drives the NNT = 1/ARR
— Individual-level counseling metric
— Limited to exposed subgroup
— Useful for occupational or niche-exposure questions
— Whole-population perspective
— Used for policy, allocation, and quality-improvement decisions
— For protective exposures (RR < 1) — vaccines, statins, screening
— Estimates population-level benefit if uptake were universal
— Strength of association → RR/OR/HR
— Individual benefit → ARR/NNT
— Burden in exposed → AR/ARP
— Burden in population → PAR/PAF
— Benefit of protective exposure → PF

— Incidence = new cases per person-time → drives PAR calculations
— Prevalence = existing cases at a point in time → not directly used in PAR; prevalence of exposure (Pe) is, however, central to PAF
— These are test performance measures, not exposure-impact measures
— PPV depends on disease prevalence; commonly confused with PAR because both involve prevalence, but they answer different questions (test accuracy vs. population disease burden)
— Diagnostic test domain; unrelated to PAR/PAF
— Individual-level intervention effect metrics
— NNT = 1/ARR; useful for patient counseling, not for population-level prioritization
— Effect modification (interaction) — RR differs across strata; report stratum-specific PAFs
— Confounding — distorts crude RR; use adjusted RR in PAF calculations
— Confusing the two leads to wrong PAF interpretation
— Bias in screening/observational studies that can inflate apparent RR → overestimate PAF
— Always check for these before accepting a published PAF
— Rose's prevention paradox: large population gains from small individual shifts (PAF lens)
— High-risk strategy: clinical management of individuals at greatest absolute risk (NNT lens)

— Continued surveillance of exposure prevalence (BRFSS, NHANES, registries)
— Monitoring of disease incidence for trend reversal
— Maintenance funding — many successful programs (e.g., tobacco quitlines, vaccine outreach) lose efficacy when budget cuts reduce coverage
— Equity audits to ensure benefits reach high-PAF subgroups, not only low-risk ones
— Tobacco control (US, 1965–present): adult smoking prevalence fell from 42% to <12%; lung cancer mortality declining; PAF for many CVD/cancer outcomes shrinking
— HPV vaccination: declining HPV prevalence and pre-cancer rates in vaccinated cohorts
— Childhood lead exposure: mean blood lead in children fell >90% after leaded gasoline phaseout
— Folate fortification: sustained reduction in neural tube defects since 1998
— Obesity prevalence continues rising → PAFs for diabetes, NAFLD, colon cancer, endometrial cancer increasing
— Opioid use disorder → rising PAF for overdose mortality
— Vaccine hesitancy → rising PAF for measles, pertussis
— Climate-related exposures (heat, air pollution, vector-borne disease) → expanding PAFs
— Just as a post-MI patient needs aspirin, statin, beta-blocker, ACEi, and cardiac rehab indefinitely, populations need ongoing investment in high-PAF interventions — these are not "one and done"

— Exposure prevalence (Pe) over time — primary process measure
— Disease incidence — primary outcome measure
— Disease mortality — long-term outcome
— Subgroup disparities — equity measure
— Cost per case prevented / cost per QALY — efficiency measure
— Intervention coverage and adherence — implementation fidelity
— Translate PAF into personally relevant terms: "Smoking causes about 85% of lung cancer cases in the US. For you specifically, quitting now reduces your lung cancer risk by about half within 10–15 years."
— Use absolute risk and NNT/NNH for individual decisions; use PAF for explaining societal stakes
— Acknowledge uncertainty: PAFs are estimates with confidence intervals
— Smoking: pack-years documented, cessation status at every visit (5 A's: Ask, Advise, Assess, Assist, Arrange)
— Hypertension: home BP monitoring, follow-up every 1 month until controlled, then every 3–6 months
— Diabetes: A1c every 3–6 months
— Obesity: BMI, waist circumference, lifestyle counseling
— Alcohol: AUDIT-C annually
— Vaccination status: reviewed at every preventive visit
— Quality measures (HEDIS, MIPS, CMS Star Ratings) often track high-PAF exposures: BP control, statin use, tobacco screening, A1c control, vaccination rates
— These metrics are deliberately chosen because moving them improves population health

— Stigmatization: Targeting high-PAF behaviors (obesity, substance use, sexual practices) risks blaming individuals for socially patterned exposures. Frame interventions around structural drivers, not personal failure
— Equity vs. efficiency: Maximizing total PAF reduction may neglect smaller, marginalized groups whose disease burden is concentrated. Public health ethics requires balancing utility and justice
— Autonomy: Mandatory interventions (vaccine mandates, sugar taxes, motorcycle helmet laws) reduce PAF but may conflict with individual liberty. Step 3 frequently tests recognition of this tension
— Informed consent in screening: Programs justified by population PAF data must still disclose individual-level benefits, harms (false positives, overdiagnosis), and uncertainty
— Mandatory reporting of communicable diseases (TB, syphilis, measles, HIV in most states) enables PAF surveillance
— De-identification of public health data must comply with HIPAA; aggregate PAF reporting is permitted
— Workplace and product safety regulation (OSHA, FDA, EPA) is grounded in PAR estimates — e.g., asbestos bans, lead removal, food safety standards
— Transition-of-care risk (hospital discharge): medication reconciliation, follow-up scheduling, and identification of high-PAF readmission drivers (heart failure, COPD, sepsis) are central CMS quality measures
— A discharged HF patient should have a follow-up appointment within 7–14 days, weight-monitoring plan, and reinforced medication adherence — addressing the high-PAF readmission exposures
— Concrete Step 3 example: A patient is discharged with newly diagnosed AF and a CHA₂DS₂-VASc of 4 but is not started on anticoagulation. AF carries a high stroke PAF in the elderly; failure to anticoagulate at discharge is both a patient safety event and a missed population-level opportunity — the correct action is to start anticoagulation before discharge and ensure outpatient follow-up

— Lung cancer → smoking (~85%)
— COPD → smoking (~80%)
— Cervical cancer → HPV (~99%)
— Hepatocellular carcinoma → hepatitis B/C, alcohol, NAFLD (combined >80%)
— Stroke → hypertension (~48%)
— MI → dyslipidemia + smoking + HTN + DM (combined ~90%)
— Type 2 diabetes → obesity + physical inactivity (>70%)
— HIV → unprotected sex + IV drug use (population-dependent)
— Mesothelioma → asbestos (~80%)
— Bladder cancer → smoking (~50%)
— Dementia (modifiable share) → 12 Lancet factors (~40%)
— AR = Ie − Iu
— ARP = (RR − 1)/RR
— PAR = It − Iu
— PAF = (It − Iu)/It = Pe(RR−1)/[1 + Pe(RR−1)]
— PF = (1−Pe)(1−RR)/[1 − Pe(1−RR)] (protective exposure; Pe = current prevalence of the protective factor)
— RR = 2, Pe = 0.5 → PAF = 33%
— RR = 3, Pe = 0.3 → PAF = 38%
— RR = 10, Pe = 0.01 → PAF = 8%
— RR = 1.5, Pe = 0.6 → PAF = 23%
— RR = 5, Pe = 0.2 → PAF = 44%
— A common risk factor with modest RR may have larger PAF than a rare risk factor with huge RR
— PAF assumes causality
— ARP applies to exposed only, PAF applies to population
— Summed PAFs across exposures can exceed 100% (multicausality)
— Levin's formula uses prevalence of exposure and RR
— In case-control studies, use OR as RR approximation only if disease is rare
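The RR/Pe quick-reference values above can be regenerated from Levin's formula, which is a useful self-check when memorizing them. A short sketch; the `pairs` list simply restates the five rows from the text.

```python
def paf_levin(pe, rr):
    """PAF from exposure prevalence and relative risk (Levin)."""
    excess = pe * (rr - 1)
    return excess / (1 + excess)

# (RR, Pe) pairs from the quick-reference list
pairs = [(2, 0.5), (3, 0.3), (10, 0.01), (1.5, 0.6), (5, 0.2)]
pafs = [paf_levin(pe, rr) for rr, pe in pairs]

for (rr, pe), paf in zip(pairs, pafs):
    print(f"RR = {rr}, Pe = {pe} -> PAF = {paf:.1%}")
```

The RR = 10, Pe = 0.01 row gives the smallest PAF in the list despite the largest RR, which is the pattern the first high-yield fact describes.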

— Table with several exposures, each with Pe and RR
— Compute PAF for each; pick the largest PAF (assuming similar intervention efficacy across options)
— Trap: choosing the exposure with the highest RR rather than the highest PAF
— Direct PAF computation
— Trap: confusing with ARP (only the exposed subgroup)
— Wants ARP = (RR−1)/RR
— Trap: applying PAF, which is lower because it includes nonsmokers
— Wants PAR (absolute units) × population size
— Trap: reporting PAF (a fraction) instead of an absolute count
— Combine PAF with intervention efficacy and coverage to estimate realized impact
— Trap: assuming intervention efficacy is 100%
— Compare preventable fraction × Pe × intervention efficacy × cost
— Trap: picking the more sensitive test without considering disease prevalence and PAF
— Possible flaws: residual confounding, reverse causation, recall bias, generalizability, rare-disease assumption violation
— Trap: accepting the headline PAF without methodological scrutiny
— Use preventable fraction formula
— Trap: applying PAF formula meant for harmful exposure (RR > 1)
— Wants individual-level measure (ARR, NNT)
— Trap: quoting PAF to an individual (relevant only at population scale)

Population attributable risk (PAR) and population attributable fraction (PAF) quantify how much disease in an entire population is attributable to a specific causal exposure, and they are driven jointly by the strength of association (RR) and the prevalence of the exposure (Pe) — making common, modestly harmful exposures often the highest-yield public health targets.
— PAR = It − Iu (absolute population excess)
— PAF = (It − Iu)/It = Pe(RR−1)/[1 + Pe(RR−1)] (Levin's formula)
— ARP = (RR−1)/RR applies only to the exposed subgroup
— Preventable fraction (PF) is the protective-exposure analog
— A high-prevalence, modest-RR exposure (hypertension, dyslipidemia) typically has a larger PAF than a rare, high-RR exposure
— PAF assumes causality and complete elimination of the exposure; real-world impact is PAF × coverage × efficacy
— Sums of PAFs across multiple exposures can exceed 100% due to multicausality — use joint PAF when needed
— Adjust for confounding using multivariable-adjusted RR before computing PAF
— In case-control studies, OR approximates RR only when disease is rare
— Population/policy decisions → PAF (which exposure to attack first)
— Individual patient decisions → ARR and NNT (whether to treat this person)
— Quality measures (HEDIS, MIPS) deliberately target the highest-PAF clinical exposures: tobacco, BP, lipids, diabetes, vaccination, cancer screening

