Institutionalize the registry-referential-integrity assertion as the c…

Decision

Institutionalize the registry-referential-integrity assertion as the canonical pre-claim gate for all future model-rename cascades in OpenClaw and in any similar keyed-registry config (Paperclip, Hermes). The gate asserts that every model reference resolves to a registry key (for ref in primary+fallbacks+subagents: assert ref ∈ registry.keys()). It runs in ~100ms, catches both the key-value desync failure mode and the self-reference blindspot in one pass, and is the minimum-work, maximum-signal invariant for this failure class. Any agent (NOVA, Claude Code, or a future agent) must output gate=PASS before claiming that a model rename is done.
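A minimal sketch of the gate in Python. It assumes a hypothetical config layout in which the registry is a dict keyed by model ID, primary is a single model ID, fallbacks is a list, and subagents maps a subagent name to a model ID. The real openclaw.json schema may differ, and the function names missing_refs and gate_line are illustrative only.

```python
def missing_refs(config: dict) -> list[str]:
    """Return every model reference that is not a key in the registry.

    Assumed (hypothetical) layout, not the confirmed openclaw.json schema:
      config["registry"]  -> dict keyed by model id
      config["primary"]   -> a single model id
      config["fallbacks"] -> list of model ids
      config["subagents"] -> dict of subagent name -> model id
    """
    registry_keys = set(config.get("registry", {}))
    refs = [config["primary"]]
    refs += config.get("fallbacks", [])
    refs += list(config.get("subagents", {}).values())
    # Preserve reference order so the report points at the first broken ref.
    return [ref for ref in refs if ref not in registry_keys]


def gate_line(config: dict) -> str:
    """Format the result as the single line an agent must emit."""
    missing = missing_refs(config)
    if not missing:
        return "gate=PASS"
    return f"gate=FAIL registry-miss={len(missing)} missing={missing}"
```

To run it against a real file, load the config with json.load and print gate_line(config). A rename claim would be accepted only when that line reads gate=PASS.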

Rationale

First-principles decomposition of this session's NOVA partial-upgrade failure identified two orthogonal failure modes: (a) structural: model registries have three positional classes (pointee definitions, registry keys, reference arrays), and grep-based audits match all three identically, which masks key-value desync; and (b) cognitive: self-directed agent edits systematically miss the self-file because working memory enumerates the peer set, not the self.

The referential-integrity assertion closes both at once. Key-value desync is caught because a reference cannot resolve when the registry key was not renamed. The self-blindspot is closed because the assertion runs against the top-level config, which is the self-file for NOVA. Evidence: this session's post-cascade check passed with miss=0; run pre-cascade, it would have surfaced moonshot/kimi-k2.6 as missing from the registry.

The rule is higher-leverage than any number of checklist entries because it is executable and deterministic. It is captured in error-playbook.md as the prevention recipe; session-learnings.md holds the behavioral context. The next time any agent claims a model rename is complete, the 5-line Python assertion runs against openclaw.json before the claim is accepted, and registry-miss=0 is the gate. The target is zero future "fallback points to nonexistent registry key" failures.

The pattern generalizes to any keyed-registry config (Paperclip division_ids, Hermes model-mesh). Over 6-12 months, the invariant is expected to prove itself by preventing a recurrence with zero additional tooling.
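The structural failure mode can be illustrated with a small Python sketch. The config below is a hypothetical reconstruction of a pre-cascade state, not the actual openclaw.json: the field names and the stale key "old-model-key" are placeholders, and only the name moonshot/kimi-k2.6 comes from the session. A flat text search for the new name finds it in several places and looks complete, while the positional check shows that no registry key carries it.

```python
import json

NEW_NAME = "moonshot/kimi-k2.6"

# Hypothetical pre-cascade state: the pointee definition and the reference
# arrays were renamed, but the registry key was not (key-value desync).
config = {
    "registry": {
        "old-model-key": {"id": NEW_NAME},  # placeholder key; value renamed
    },
    "primary": NEW_NAME,
    "fallbacks": [NEW_NAME],
}

# Flat match, as a grep-style audit would do: counts occurrences of the new
# name anywhere in the serialized config, regardless of position.
flat_hits = json.dumps(config).count(NEW_NAME)

# Positional check: does each reference resolve to an actual registry key?
refs = [config["primary"], *config["fallbacks"]]
missing = sorted({ref for ref in refs if ref not in config["registry"]})

print(f"flat_hits={flat_hits}")
print(f"missing={missing}")
```

In this sketch the flat match reports three hits (one definition, two references), which reads as a finished rename, while the positional check reports moonshot/kimi-k2.6 as missing from the registry keys.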

Alternatives Rejected

Outcome

Pending