Data Contract
Canonical ingestion model
- Canonical source of truth: data/canonical/rdu_timeline_data.csv
- Ingestion mechanism: reviewed pull requests only.
- Provenance ledger: data/provenance.csv
Required columns
- user
- field_office
- i-485 receipt date
- interview date
- i-130 approval date
- i-485 approval date
- receipt to interview
- interview to i130
- interview to i485
- i130 to i485
- days since interview
- days total
- case closed
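The required-columns rule can be checked with a few lines of Python. This is a minimal sketch, not the repository's actual validator: the function name `missing_columns` is hypothetical, and the header spellings are assumed to match the column names above exactly.

```python
import csv
import io

# Assumed header spellings, taken from the required-columns list.
REQUIRED_COLUMNS = [
    "user", "field_office", "i-485 receipt date", "interview date",
    "i-130 approval date", "i-485 approval date", "receipt to interview",
    "interview to i130", "interview to i485", "i130 to i485",
    "days since interview", "days total", "case closed",
]


def missing_columns(csv_text):
    """Return required columns absent from the CSV header, in contract order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])  # fieldnames is None for empty input
    return [name for name in REQUIRED_COLUMNS if name not in header]
```

An empty result means the header satisfies the contract; a CI job could fail whenever the returned list is non-empty.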
Validation gates
CI fails on:
- missing required columns
- malformed dates
- status/date inconsistency (case closed vs approval date)
- impossible time direction (approval before receipt/interview)
- derived duration mismatch beyond tolerance
- duplicate user + receipt_date rows
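Several of these gates can be sketched as a single pass over the rows. The sketch below rests on assumptions not stated in this contract: dates are ISO formatted (YYYY-MM-DD), durations are whole days, the tolerance defaults to one day, and the duplicate key uses the i-485 receipt date column. The function names are hypothetical. The status/date gate is left out because the encoding of case closed is not specified here.

```python
from datetime import date


def parse_date(value):
    """Parse an ISO date (assumed format); return None for an empty cell."""
    value = value.strip()
    return date.fromisoformat(value) if value else None


def check_rows(rows, tolerance_days=1):
    """Return (row_index, message) pairs for rows that violate a gate."""
    errors = []
    seen = set()
    for i, row in enumerate(rows):
        try:
            receipt = parse_date(row["i-485 receipt date"])
            interview = parse_date(row["interview date"])
            approval = parse_date(row["i-485 approval date"])
        except ValueError:
            errors.append((i, "malformed date"))
            continue
        # Impossible time direction: approval before receipt or interview.
        for earlier in (receipt, interview):
            if approval and earlier and approval < earlier:
                errors.append((i, "impossible time direction"))
                break
        # Stated duration must match the date difference within tolerance.
        stated = row.get("receipt to interview", "").strip()
        if receipt and interview and stated:
            if abs((interview - receipt).days - int(stated)) > tolerance_days:
                errors.append((i, "derived duration mismatch"))
        # Duplicate user + receipt date pairs.
        key = (row["user"], row["i-485 receipt date"])
        if key in seen:
            errors.append((i, "duplicate user + receipt date"))
        seen.add(key)
    return errors
```

Collecting every violation instead of stopping at the first lets a reviewer fix a pull request in one round; CI would fail whenever the returned list is non-empty.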
Office scope
By default, the analysis pipeline keeps only rows where field_office == Raleigh/Durham.
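One way to express this default while still allowing other offices is a filter with an overridable parameter. This is an illustrative sketch; `filter_office` and `DEFAULT_OFFICE` are hypothetical names, and rows are assumed to be dictionaries keyed by column name.

```python
DEFAULT_OFFICE = "Raleigh/Durham"


def filter_office(rows, office=DEFAULT_OFFICE):
    """Keep only rows whose field_office equals the requested office."""
    return [row for row in rows if row.get("field_office") == office]
```

Calling `filter_office(rows)` applies the default scope; passing a different `office` value widens the analysis without touching the canonical data.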