Four phases, each gated on tests / evals.
Auto: P1 reconciler (eval F1 0.982, crews 4→2, 13 unit tests) · P2 survivorship + provenance + undo · P3 review-queue API + gap detector + a React review screen · P4 live wiring + write-by-canonical-id + 4 migrations applied to the production DB + a rolled-back integration test.
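The "rolled-back integration test" in P4 can be pictured as a test that performs a real write inside a transaction, checks the result, and then rolls back so the database is left unchanged. The sketch below is illustrative only: it uses an in-memory SQLite database as a stand-in, and the `crew` table, its columns, and the helpers `setup_demo_db`, `write_by_canonical_id`, and `run_rolled_back` are all hypothetical names, not the project's actual schema or code.

```python
import sqlite3


def setup_demo_db():
    """Create a throwaway in-memory DB with one hypothetical crew row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE crew (canonical_id TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO crew VALUES ('c1', 'Crew A')")
    conn.commit()
    return conn


def write_by_canonical_id(conn, canonical_id, name):
    """Update a row addressed by its canonical id; return rows affected."""
    cur = conn.execute(
        "UPDATE crew SET name = ? WHERE canonical_id = ?",
        (name, canonical_id),
    )
    return cur.rowcount


def run_rolled_back(conn, check):
    """Run `check` against the connection, then always roll back its writes."""
    try:
        return check(conn)
    finally:
        conn.rollback()


def current_name(conn, canonical_id):
    row = conn.execute(
        "SELECT name FROM crew WHERE canonical_id = ?", (canonical_id,)
    ).fetchone()
    return row[0]


def check_write(conn):
    # Inside the open transaction the write is visible.
    affected = write_by_canonical_id(conn, "c1", "Renamed")
    return affected, current_name(conn, "c1")
```

Calling `run_rolled_back(conn, check_write)` exercises the write path and returns what the test observed, while a later read of the same row shows the original value again. Against a production database the same pattern would need a connection whose transaction settings are known, which this sketch does not cover.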
Gate: human approvals at the seams ("merge to main", "proceed to Phase 4", "use the production environment"), plus a domain unblock (the working DB proxy path) when the agent hit a stale credential.
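One way to picture approvals at the seams is a small gate object that refuses to pass a named seam until a human has approved it. This is a minimal sketch under stated assumptions: the seam names come from the list above, but `ApprovalGate`, `ApprovalRequired`, and their methods are hypothetical and do not describe how the project actually recorded approvals.

```python
from dataclasses import dataclass, field

# Seam names taken from the summary; everything else here is hypothetical.
SEAMS = (
    "merge to main",
    "proceed to Phase 4",
    "use the production environment",
)


class ApprovalRequired(Exception):
    """Raised when the agent reaches a seam no human has approved."""


@dataclass
class ApprovalGate:
    approved: set = field(default_factory=set)

    def approve(self, seam: str) -> None:
        """Record a human approval for a known seam."""
        if seam not in SEAMS:
            raise ValueError(f"unknown seam: {seam}")
        self.approved.add(seam)

    def require(self, seam: str) -> None:
        """Block unless the seam has been approved."""
        if seam not in self.approved:
            raise ApprovalRequired(seam)
```

In this sketch the agent would call `gate.require("merge to main")` before the merge step and stop on `ApprovalRequired` until a person calls `gate.approve("merge to main")`.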