# Replication Package — Youth Mental Health Access Gap V1
**Working paper**: `mh_gap_youth_v1_article.md`
**Methodology supplement**: `mh_gap_youth_v1_methodology_supplement.md`
**Framework tool**: `atlas.need_vs_access_framework_v1` v1.1.0
**Dataset**: `mh_gap_youth_state_v1.csv` (35 rows × 12 fields, sha256 in manifest.json)
**License**: CC-BY-4.0
**Recommended citation**: Trellison Institute, *The Youth Mental Health Access Gap is Structurally More Severe Than the Adult Gap and Wider Across States*, working paper v1.0 (May 2026). DOI: [pending]
---
## What this package contains
| File | Purpose |
|---|---|
| `mh_gap_youth_v1_article.md` | The working paper (~3,800 words, 7 sections + supplementary) |
| `mh_gap_youth_v1_methodology_supplement.md` | Standalone methodology + framework adaptation reference |
| `mh_gap_youth_v1_codebook.md` | Per-field data dictionary for the CSV |
| `mh_gap_youth_v1_dartboard_narratives.md` | 9 case-study state profiles |
| `mh_gap_youth_v1_press_release.md` + `mh_gap_youth_v1_press_qa.md` | Press materials |
| `mh_gap_youth_v1_executive_brief.md` | 2-page brief |
| `mh_gap_youth_v1_sensitivity_analysis.md` | σ + taxonomy-set + pop-threshold sweeps |
| `mh_gap_youth_v1_narration_script.md` | TTS-ready data-story narration |
| `mh_gap_youth_v1_bibliography.md` | Sources + related work + policy literature |
| `mh_gap_youth_state_v1.csv` | Full per-state dataset |
| `manifest.json` | sha256, field list, license, source provenance |
| `atlas.need_vs_access_framework_v1.v1.1.code_content` | Pipeline code (DB-native tool export) |
| `README.md` | This file |
## How to reproduce the dataset from scratch
### 1. Pull source data
```python
# CDC YRBSS 2023 Mental Health Indicators
# https://data.cdc.gov/resource/nu3s-3dwd.json
# Paginated Socrata API; 5,990 rows nationally.
# Filter: year=2023, demographics_type=Total, question contains "feel so sad or hopeless"
# CMS NPPES National Provider Identifier registry
# Youth-serving taxonomies (8 codes — see methodology supplement §2)
# https://npiregistry.cms.hhs.gov/api or bulk file
# ACS state-level under-18 population
# https://api.census.gov/data/2023/acs/acs1?get=NAME,B09001_001E&for=state:*&key=<KEY>
# ACS state-level uninsured under-19 percent
# https://api.census.gov/data/2023/acs/acs1/subject?get=NAME,S2701_C05_002E&for=state:*&key=<KEY>
```
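The request URLs above can be assembled as in the following minimal sketch. It assumes standard Socrata `$limit`/`$offset` paging and the Census query format shown in the comments. The helper names (`socrata_page_url`, `census_url`, `pages_needed`) and the page size of 1,000 are illustrative and not part of the published pipeline. The YRBSS filters listed above still have to be applied to the returned rows.

```python
import math
from urllib.parse import urlencode

YRBSS_ENDPOINT = "https://data.cdc.gov/resource/nu3s-3dwd.json"
ACS_BASE = "https://api.census.gov/data/2023/acs/acs1"


def socrata_page_url(page: int, page_size: int = 1000) -> str:
    """URL for one page of the YRBSS dataset, using Socrata $limit/$offset paging."""
    params = {"$limit": page_size, "$offset": page * page_size}
    return f"{YRBSS_ENDPOINT}?{urlencode(params, safe='$')}"


def pages_needed(total_rows: int, page_size: int = 1000) -> int:
    """Number of pages required to cover total_rows."""
    return math.ceil(total_rows / page_size)


def census_url(variable: str, api_key: str, subject: bool = False) -> str:
    """URL for one ACS 1-year variable at state level (subject=True for S-tables)."""
    base = f"{ACS_BASE}/subject" if subject else ACS_BASE
    params = {"get": f"NAME,{variable}", "for": "state:*", "key": api_key}
    return f"{base}?{urlencode(params, safe=',:*')}"


# Example: the 5,990 national rows fit in six pages of 1,000.
urls = [socrata_page_url(p) for p in range(pages_needed(5990))]
```

Building the URLs separately from the HTTP call makes it easy to log exactly which requests produced a given snapshot.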
### 2. Stage inputs as MongoDB collections
The framework expects four collections:
- `connector_data.yrbss_state_need_v1` — state row with `state_abbr`, `sad_hopeless_2wk_pct`
- `connector_data.cms_nppes_youth_serving_v1` — provider row with `state_abbr` (one per NPI with primary youth-serving taxonomy)
- `connector_data.acs_under18_state_v1` — state row with `state_abbr`, `under_18_population`
- `connector_data.acs_uninsured_under19_state_v1` — state row with `state_abbr`, `uninsured_under19_pct`
Adapter scripts for staging these collections are in the DaedArch tool registry (`harturk.*`, `connector.*`).
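Before running the framework, it can help to confirm that every staged document carries the fields listed above. The sketch below works on plain dictionaries, so it is independent of any MongoDB driver. The function name `missing_fields` and the sample rows are hypothetical; the sample values are placeholders, not published figures.

```python
# Required fields per staged collection, as listed in step 2.
REQUIRED_FIELDS = {
    "connector_data.yrbss_state_need_v1": {"state_abbr", "sad_hopeless_2wk_pct"},
    "connector_data.cms_nppes_youth_serving_v1": {"state_abbr"},
    "connector_data.acs_under18_state_v1": {"state_abbr", "under_18_population"},
    "connector_data.acs_uninsured_under19_state_v1": {"state_abbr", "uninsured_under19_pct"},
}


def missing_fields(collection: str, docs: list[dict]) -> list[tuple[int, set[str]]]:
    """Return (index, missing field names) for each document lacking a required field."""
    required = REQUIRED_FIELDS[collection]
    problems = []
    for i, doc in enumerate(docs):
        missing = required - set(doc)
        if missing:
            problems.append((i, missing))
    return problems


# Placeholder rows for illustration only.
sample = [
    {"state_abbr": "AA", "under_18_population": 100},
    {"state_abbr": "BB"},
]
problems = missing_fields("connector_data.acs_under18_state_v1", sample)
```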
### 3. Run the framework
```http
POST /api/v4/execute
{
  "tool_id": "atlas.need_vs_access_framework_v1",
  "inputs": {
    "study_id": "mh_gap_youth_v1",
    "need": {
      "collection": "connector_data.yrbss_state_need_v1",
      "measure_field": "sad_hopeless_2wk_pct",
      "geography_id_field": "state_abbr"
    },
    "access": {
      "collection": "connector_data.cms_nppes_youth_serving_v1",
      "rollup_to": "state",
      "geography_id_field": "state_abbr"
    },
    "population": {
      "collection": "connector_data.acs_under18_state_v1",
      "geography_id_field": "state_abbr",
      "pop_field": "under_18_population"
    },
    "covariate": {
      "collection": "connector_data.acs_uninsured_under19_state_v1",
      "measure_field": "uninsured_under19_pct",
      "geography_id_field": "state_abbr"
    },
    "geography_level": "state",
    "population_threshold": 50000,
    "outlier_threshold_sigma": 1.5,
    "dartboard_n_per_class": 4,
    "regression_grouping": "national"
  }
}
```
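The call above could be prepared from Python roughly as follows. This is a sketch under stated assumptions: `BASE_URL` is a hypothetical placeholder host, the helper names `source` and `build_request` are illustrative, and no authentication is shown because this README does not describe any. The request is built but not sent.

```python
import json
from urllib.request import Request

BASE_URL = "https://daedarch.example"  # hypothetical host, replace with your deployment


def source(collection: str, **fields) -> dict:
    """One input block; every block in this study keys on state_abbr."""
    return {"collection": collection, "geography_id_field": "state_abbr", **fields}


payload = {
    "tool_id": "atlas.need_vs_access_framework_v1",
    "inputs": {
        "study_id": "mh_gap_youth_v1",
        "need": source("connector_data.yrbss_state_need_v1",
                       measure_field="sad_hopeless_2wk_pct"),
        "access": source("connector_data.cms_nppes_youth_serving_v1",
                         rollup_to="state"),
        "population": source("connector_data.acs_under18_state_v1",
                             pop_field="under_18_population"),
        "covariate": source("connector_data.acs_uninsured_under19_state_v1",
                            measure_field="uninsured_under19_pct"),
        "geography_level": "state",
        "population_threshold": 50000,
        "outlier_threshold_sigma": 1.5,
        "dartboard_n_per_class": 4,
        "regression_grouping": "national",
    },
}


def build_request(base_url: str, body: dict) -> Request:
    """Prepare (but do not send) the POST to /api/v4/execute."""
    return Request(
        url=f"{base_url}/api/v4/execute",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(BASE_URL, payload)
# To send: urllib.request.urlopen(req), once host and credentials are in place.
```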
### 4. Verify outputs match published
Your `analysis_outputs.mh_gap_youth_v1_state_v1` collection should match `mh_gap_youth_state_v1.csv` row-for-row and field-for-field, apart from CSV float formatting. The aggregate statistics should reproduce:
- 35 states in published analysis
- Pop-weighted prevalence: **39.4%**
- Provider density range: **~85×**
- 3 positive outliers: **PR, NC, NJ**
- 2 negative outliers: **VT, AK**
If the prevalence diverges by more than ±0.5 percentage points, or if the density range or outlier sets differ, check (a) the YRBSS release version, (b) the NPPES snapshot date, and (c) the ACS year selection.
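The checks above can be scripted along the following lines. This is a sketch, not the package's own verification code. It assumes the CSV columns carry the same names as the staged fields (`sad_hopeless_2wk_pct`, `under_18_population`); consult the codebook for the actual column names. The function names are illustrative, and the two-row CSV is toy data.

```python
import csv
import hashlib
import io


def sha256_hex(data: bytes) -> str:
    """Hex digest to compare against the sha256 recorded in manifest.json."""
    return hashlib.sha256(data).hexdigest()


def pop_weighted_prevalence(rows, measure="sad_hopeless_2wk_pct",
                            pop="under_18_population") -> float:
    """Population-weighted mean of the need measure, in percent."""
    total_pop = sum(float(r[pop]) for r in rows)
    weighted = sum(float(r[measure]) * float(r[pop]) for r in rows)
    return weighted / total_pop


def within_tolerance(observed: float, published: float = 39.4,
                     tol: float = 0.5) -> bool:
    """True if observed is within tol percentage points of the published value."""
    return abs(observed - published) <= tol


# Toy data for illustration only; real input is mh_gap_youth_state_v1.csv.
toy_csv = (
    "state_abbr,sad_hopeless_2wk_pct,under_18_population\n"
    "AA,40.0,100\n"
    "BB,30.0,300\n"
)
rows = list(csv.DictReader(io.StringIO(toy_csv)))
toy_prevalence = pop_weighted_prevalence(rows)  # (40*100 + 30*300) / 400
```

For the real file, read it in binary mode for `sha256_hex` and in text mode for `csv.DictReader`, so the hash covers the exact bytes on disk.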
## Sensitivity tests we ran
See `mh_gap_youth_v1_sensitivity_analysis.md` for full tables.
- `outlier_threshold_sigma` ∈ {1.0, 1.5, 2.0}
- Narrow vs broad taxonomy set
- `population_threshold` ∈ {25K, 50K, 100K}
- `dartboard_n_per_class` ∈ {3, 4, 6}
The headline 39.4% national prevalence, 2.4× adult ratio, 85× supply range, and PR/NC/NJ + VT/AK outlier identification are robust across all combinations.
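The full grid implied by "all combinations" can be enumerated as in this sketch. The key `taxonomy_set` and its values `"narrow"` and `"broad"` are hypothetical names for the taxonomy sweep; the other three keys are the framework inputs shown in step 3.

```python
from itertools import product

SWEEP = {
    "outlier_threshold_sigma": [1.0, 1.5, 2.0],
    "taxonomy_set": ["narrow", "broad"],  # hypothetical parameter name
    "population_threshold": [25_000, 50_000, 100_000],
    "dartboard_n_per_class": [3, 4, 6],
}


def sweep_configs(grid: dict) -> list[dict]:
    """Cartesian product of all parameter values, one dict per combination."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*grid.values())]


configs = sweep_configs(SWEEP)  # 3 * 2 * 3 * 3 = 54 combinations
```

Each dict can be merged into the `inputs` block of the step 3 request to rerun the framework under that configuration.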
## Limitations
The methodology supplement §5 lists 7 state-level-specific limitations. Briefly:
1. YRBSS state coverage is 39 of 50 states + select territories
2. Surveillance instrument mismatch with BRFSS (12-month vs 30-day reference)
3. NPPES doesn't reflect provider-level capacity (hours, network status, accepting under-18)
4. State as the unit of analysis hides within-state variation
5. Single covariate (uninsured rate)
6. No tract-level cross-reference exists for youth
7. Under-reporting in self-reported data from respondents under 18
## How to extend to other access domains
The framework is agnostic to access domain and geography. A state-level analog:
```python
# Maternal access at state geography
inputs = {
    "study_id": "maternal_access_v1",
    "need": {
        "collection": "...",
        "measure_field": "acog_inadequate_pnc",
        "geography_id_field": "state_abbr",
    },
    "access": {
        "collection": "connector_data.nppes_obgyn_state_v1",
        "rollup_to": "state",
    },
    # ... (other inputs elided)
    "geography_level": "state",
    "regression_grouping": "national",
}
```
Framework outputs go to `analysis_outputs.maternal_access_v1_state_v1` + `analysis_outputs.maternal_access_v1_dartboard_v1` — same schema, same dartboard sampling.
## Contact
- Methodology questions: [email protected]
- Data questions: [email protected]
- Press inquiries: [email protected]
- Framework tool registry: DaedArch platform · `atlas.need_vs_access_framework_v1` v1.1.0
## Acknowledgments
CDC YRBSS team for biennial state-level youth surveillance. CMS for the NPPES public registry. Census Bureau for the ACS state-level estimates. The 41.8 million under-18 individuals their numbers represent. The 35 states whose surveillance participation makes this analysis possible.