How we compute everything
Methodology
Every source, every formula, every limitation — disclosed so you can verify what we publish. Last reviewed by Evan Brooks, Data Editor.
Data Sources
HealthByCounty publishes only statistics from primary government sources. We do not aggregate from third-party data brokers. Three datasets drive every page on the site:
- County Health Rankings & Roadmaps (CHR). A program of the Robert Wood Johnson Foundation and the University of Wisconsin Population Health Institute. Provides life expectancy, the share of adults reporting poor or fair health, the uninsured rate, and provider density. Published annually each spring. countyhealthrankings.org. Most recent release: 2024 data.
- CMS NPPES — the federal National Plan and Provider Enumeration System. The licensed-provider registry that lets us compute primary-care, mental-health, dental, and nurse-practitioner provider density per 100,000 county residents. npiregistry.cms.hhs.gov. Pulled and resolved to county FIPS quarterly.
- EPA AirNow / Air Quality System (AQS). Daily PM2.5, PM10, and ozone concentrations rolled up to an Air Quality Index by county. airnow.gov. Annual rollup.
For demographic context (population, median household income used in some narratives), we also reference the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates. ACS is not the source of any health statistic on the site; it provides denominator context only.
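As a minimal sketch of the per-100,000 density arithmetic described above: the function name and the example counts are hypothetical, and the population denominator is assumed to come from the ACS estimate for the county.

```python
from typing import Optional


def density_per_100k(provider_count: int, population: Optional[int]) -> Optional[float]:
    """Providers per 100,000 residents.

    Returns None when the population denominator is missing or not positive,
    so the caller can display "N/A" instead of a misleading number.
    """
    if population is None or population <= 0:
        return None
    # Multiply before dividing to keep the arithmetic exact for whole counts.
    return provider_count * 100_000 / population


# Hypothetical county: 45 primary-care providers, 60,000 residents.
example_density = density_per_100k(45, 60_000)
```

In this illustration, 45 providers across 60,000 residents works out to 75 providers per 100,000.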
Why County Health Rankings, Not BRFSS or PLACES Alone
The CDC publishes multiple county-relevant datasets. We chose County Health Rankings (CHR) as our primary backbone for three concrete reasons:
- National county-level composite outcomes. CHR is the only source that publishes life expectancy and ranked health outcomes at the county level for every county in the United States, in a single annual release. BRFSS is state-level and behavioral; PLACES is tract-level but disease-prevalence only.
- Stable methodology. CHR has used the same conceptual framework (health outcomes vs. health factors) since 2010, which makes year-over-year reading meaningful. The PLACES methodology, by contrast, has changed meaningfully over its last two release cycles.
- Public commercial use under attribution. CHR is a Robert Wood Johnson Foundation program with licensing that explicitly permits commercial reuse with attribution. We credit them on every county page.
We are evaluating CDC PLACES (chronic-disease prevalence measures at the tract level) for future integration; that work is tracked in our roadmap and will be cited separately when it ships.
How the Composite Health Score Is Computed
Each county receives a single 0–100 composite score we call the Health Score. This is not a CDC product; we build it from the underlying CHR + NPPES variables using a published, fixed formula. The score is a relative ranking across the 3,144 US counties, not an absolute clinical measure.
The formula:
- Standardize each input as a z-score across all counties with non-null values for that variable.
- Invert metrics where lower is better (uninsured rate, poor/fair health rate) by negating the z-score, so that higher always means "healthier".
- Weighted average: Life Expectancy (0.40), Poor/Fair Health Rate inverted (0.20), Uninsured Rate inverted (0.20), Primary Care Provider Density (0.10), Mental Health Provider Density (0.10).
- Min-max rescale the weighted z-composite across all counties to a 0–100 range.
- Round to one decimal for display; full precision retained in source JSON.
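The five steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production pipeline: the variable names, the three example counties, and their values are invented for the demo; the population standard deviation is assumed for the z-score; and every county is assumed to have all five inputs.

```python
from statistics import mean, pstdev

# Weights as listed in the formula; the key names are hypothetical.
WEIGHTS = {
    "life_expectancy": 0.40,
    "poor_fair_health_rate": 0.20,
    "uninsured_rate": 0.20,
    "primary_care_density": 0.10,
    "mental_health_density": 0.10,
}
LOWER_IS_BETTER = {"poor_fair_health_rate", "uninsured_rate"}


def z_scores(values):
    """Standardize a list of values (assumes they are not all identical)."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD: an assumption of this sketch
    return [(v - mu) / sigma for v in values]


def health_scores(counties):
    """Map county id -> 0-100 score from a dict of county id -> inputs."""
    ids = list(counties)
    composite = {cid: 0.0 for cid in ids}
    for var, weight in WEIGHTS.items():
        # Negate z-scores where lower raw values are healthier.
        sign = -1.0 if var in LOWER_IS_BETTER else 1.0
        zs = z_scores([counties[cid][var] for cid in ids])
        for cid, z in zip(ids, zs):
            composite[cid] += weight * sign * z
    lo, hi = min(composite.values()), max(composite.values())
    # Min-max rescale to 0-100, rounded to one decimal for display.
    return {cid: round(100 * (c - lo) / (hi - lo), 1) for cid, c in composite.items()}


# Invented example data, not real county statistics.
example = {
    "A": {"life_expectancy": 81.0, "poor_fair_health_rate": 12.0,
          "uninsured_rate": 6.0, "primary_care_density": 90.0,
          "mental_health_density": 300.0},
    "B": {"life_expectancy": 78.0, "poor_fair_health_rate": 17.0,
          "uninsured_rate": 10.0, "primary_care_density": 60.0,
          "mental_health_density": 150.0},
    "C": {"life_expectancy": 74.0, "poor_fair_health_rate": 24.0,
          "uninsured_rate": 16.0, "primary_care_density": 35.0,
          "mental_health_density": 80.0},
}
scores = health_scores(example)
```

Because of the min-max step, the best county in any input set lands on 100 and the worst on 0, which is why the score only makes sense as a relative ranking.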
Score interpretation
- 75–100: top quartile nationally — consistently strong life expectancy and access metrics.
- 55–74: roughly middle 50% of counties.
- 0–54: the lowest-scoring counties, typically with a mix of shorter life expectancy, higher uninsured rates, or thin provider density.
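A small sketch of how a displayed score could be mapped to the three bands above. The function name is hypothetical, and treating 75 and 55 as lower cut-offs for fractional scores (so that 74.9 falls in the middle band) is an assumption, since the bands are stated in whole numbers.

```python
def score_band(score: float) -> str:
    """Return the band label for a 0-100 Health Score."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 75:
        return "75-100"
    if score >= 55:
        return "55-74"
    return "0-54"


# Hypothetical displayed scores.
bands = [score_band(s) for s in (82.3, 74.9, 12.0)]
```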
We do not use the composite score to recommend places to live, work, or seek care. The score is a relative-ranking tool to help you compare counties; the underlying variables are always shown on the page so you can decide which dimensions matter to your situation.
Geographic Coverage
HealthByCounty covers all 3,144 counties and county equivalents in the United States — including Louisiana parishes, Alaska boroughs and census areas, and the independent cities of Virginia. Coverage spans all 50 states and the District of Columbia.
When a source dataset doesn't include a specific county (typical for the smallest census areas), we show the value as "N/A" and exclude that variable from the composite for that county. Counties missing too many inputs are not assigned a composite score; they appear with a "Data pending" label on their page.
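One way to handle a missing input is sketched below. It rests on two assumptions that this page does not specify: the remaining weights are renormalized so they still sum to 1, and a county needs at least three of the five inputs to receive a score. The names and the threshold are hypothetical.

```python
from typing import Dict, Optional

WEIGHTS = {
    "life_expectancy": 0.40,
    "poor_fair_health_rate": 0.20,
    "uninsured_rate": 0.20,
    "primary_care_density": 0.10,
    "mental_health_density": 0.10,
}
MIN_INPUTS = 3  # hypothetical cut-off for "missing too many inputs"


def weighted_composite(z: Dict[str, Optional[float]]) -> Optional[float]:
    """Weighted mean of already-inverted z-scores, skipping None values.

    Returns None (shown as "Data pending") when too few inputs are present.
    """
    available = {k: v for k, v in z.items() if v is not None}
    if len(available) < MIN_INPUTS:
        return None
    total_weight = sum(WEIGHTS[k] for k in available)
    # Renormalize so the weights of the available inputs sum to 1.
    return sum(WEIGHTS[k] * v for k, v in available.items()) / total_weight


# Hypothetical county with two provider-density inputs missing.
partial = weighted_composite({
    "life_expectancy": 2.0,
    "poor_fair_health_rate": 0.0,
    "uninsured_rate": -1.0,
    "primary_care_density": None,
    "mental_health_density": None,
})
```

In this example the three available weights sum to 0.80, so the composite is (0.40 × 2.0 + 0.20 × 0.0 + 0.20 × −1.0) / 0.80 = 0.75.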
Known Limitations
The site's accuracy is constrained by its sources. We disclose those constraints explicitly:
- CHR uses rolling estimates. Most CHR variables are 3- to 5-year rolling averages, which trade recency for statistical reliability. Sharp single-year changes in a county may take two releases to fully appear.
- Wide confidence intervals in small counties. Counties below ~10,000 residents have meaningful uncertainty around survey-based metrics. We don't suppress small-county values, but we also don't claim they're as precise as large-county figures.
- NPPES provider density lags. Provider NPI updates lag actual relocations and retirements by 30 to 90 days. Counties with high provider churn may show a slightly stale density.
- County is a coarse unit. Health outcomes vary inside counties — a wealthy ZIP can sit next to a low-life-expectancy one in the same county. Sub-county granularity is on our roadmap (CDC USALEEP at the census-tract level is a top integration candidate).
- The site is for comparison, not diagnosis. Health scores are relative rankings, not clinical assessments. They cannot substitute for medical or public-health guidance.
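To illustrate the small-county uncertainty noted in the list above, here is a textbook margin-of-error calculation for a surveyed proportion. It is a simplification for intuition only: CHR's published intervals come from its own methods, and the respondent counts below are hypothetical.

```python
import math


def wald_margin(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for proportion p from n respondents."""
    if n <= 0 or not 0 <= p <= 1:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return z * math.sqrt(p * (1 - p) / n)


# Same 18% poor/fair health rate, very different sample sizes (hypothetical).
small_county = wald_margin(0.18, 50)
large_county = wald_margin(0.18, 5000)
```

With these inputs the margin is roughly 10.6 percentage points for the 50-respondent county and roughly 1.1 points for the 5,000-respondent county, a tenfold difference for the same reported rate.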
Update Cadence
County Health Rankings releases its annual update each spring; we ingest the new release within 30 days. CMS NPPES is refreshed quarterly. EPA AirNow rollups are recomputed annually.
When we change the composite-score formula or replace a data source, we update this page and its "Last reviewed" date, and document the change in a changelog entry below. Statistic changes that affect specific counties are noted on the county page itself.
AI-Generated Narratives
The per-county narrative summary on each page is generated with Claude (Anthropic) from the same statistics shown on the page. The prompt is constrained to forbid:
- Medical advice, treatment recommendations, or diagnostic claims.
- Causation language about specific health outcomes that the data does not support.
- Invented statistics, sources, or named individuals.
- The formulaic comparative phrases ("falls below the state average of X") that AdSense flagged as scaled content on our earlier version.
The Data Editor reviews the prompt and spot-checks output before publication. When source data is refreshed, narratives are regenerated against the new data. The full prompt is documented in our editorial-standards repository. See editorial standards for the AI-usage policy in full.
Who Reviews This Methodology
This methodology is reviewed and signed off by Evan Brooks, Data Editor. The data-editor role is responsible for: verifying source-data ingestion against publisher releases, maintaining the composite-score formula and any weight changes, reviewing AI-prompt output for compliance with this methodology, and publishing corrections within ten business days of substantiated reports.
Source Citations
- CHR 2024 Methodology (PDF) — official methodology document published by RWJF + UW Population Health Institute.
- CMS NPPES Data Dissemination API — federal licensed-provider registry documentation.
- EPA AirNow API documentation — air-quality monitoring data documentation.
- U.S. Census ACS 5-Year Estimates — used for denominator/demographic context only.
This methodology page was last reviewed by Evan Brooks, Data Editor. Substantive changes are logged below as the site evolves.