What
Vault underwent a major bloat cleanup on 2026-04-17 (2512 → 376 live notes, −85%). You need to know the details before your next weekly vault-hygiene-sweep (Sun 01:30 IST) so you don’t misinterpret the archive as broken links to restore.
What changed
- 2137 notes archived under vault/_archive/cleanup-2026-04-17-conversation-flush-bulk/ (9.1 MB). Mostly conversation-flush output with consulted=0, which was 100% of that agent’s corpus. Included massive duplicate clusters: 88× stop-hook variants, 30× react-native-web variants, 29× drizzle-fk variants.
- transcript-capture.py Stop/PreCompact hook DISABLED at both /root/.claude/hooks/ and /home/claude/.claude/hooks/. The conversation-flush.py script it spawned had no pre-submit dedup and produced 2052 write-only notes with a 0% consultation rate. A DISABLED = True constant sits at the top of the file.
- New institutional-only discipline captured at aj-workspace:/memory/feedback_vault_institutional_only.md with a 7-row ✅/❌ matrix + 4-test filter (cross-agent benefit / non-stack-specific / search-first dedup / would-user-want-this).
- session-close skill updated: Phase 3.5 Quality Gate rewritten with 4-test filter + ≤3 submissions-per-session hard ceiling; Phase 2 now runs a Vault Bloat Audit every session-close (duplicate-cluster check, conversation-flush regression guard, unconsulted-ratio alert).
- Evolution backlog updated (aj-workspace:memory/evolution-backlog.md 17-Apr-2026 entries) with three items: conversation-flush structural dedup fix, Vault Submission Discipline rule, superpowers:subagent-driven-development scale guidance.
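A minimal sketch of how the 4-test filter and the ≤3-per-session ceiling could be combined into one decision function. Everything named here (`Candidate`, `gate`, the reason strings) is hypothetical and not taken from the session-close skill. The similar-title count is passed in as a plain number because the real vault_search signature is not described in this handoff; the "3 similar titles" cutoff is the one stated in the re-enable criteria.

```python
from dataclasses import dataclass

MAX_SUBMISSIONS_PER_SESSION = 3  # hard ceiling from the Phase 3.5 Quality Gate
MAX_SIMILAR_TITLES = 3           # 3 or more similar titles means "skip"


@dataclass
class Candidate:
    """Hypothetical container for one learning an agent wants to submit."""
    title: str
    cross_agent_benefit: bool  # test (a): would another division act differently?
    non_stack_specific: bool   # test (b): still valuable without the library name?
    user_would_want: bool      # test (d): would the user add this themselves?


def gate(candidate, similar_title_count, submitted_so_far):
    """Return (accept, reason). Any failed test routes the note to local memory."""
    if submitted_so_far >= MAX_SUBMISSIONS_PER_SESSION:
        return False, "session ceiling reached"
    if not candidate.cross_agent_benefit:
        return False, "no cross-agent benefit"
    if not candidate.non_stack_specific:
        return False, "stack-specific"
    if similar_title_count >= MAX_SIMILAR_TITLES:  # test (c): search-first dedup
        return False, "near-duplicates already in vault"
    if not candidate.user_would_want:
        return False, "user would not add this"
    return True, "ok"
```

The ceiling is checked first so that a session which has already submitted three notes never reaches the search step at all.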
Git trace
/opt/second-brain repo: commits e206c4e (over-specific removal) → 7330819 (session dedup) → f697525 (mass archive 2137 files) → stragglers for subsequent conversation-flush re-writes and 2 more dupes.
Rationale
User directive 2026-04-17: “vault is institutional intelligence, not an activity log … pattern recognition or pattern-related decisions or certain kinds of best practices … I don’t want redundancies and unnecessary garbage.” Quality over size. Conversation-flush was the bloat engine; 82% of vault was its output, 100% unconsulted.
Why
Your weekly vault-hygiene-sweep checks for broken wikilinks, missing Related sections, and orphan notes. Without this handoff, it could (a) flag the 2137 archived files as “broken links” from remaining wiki backlinks and attempt auto-restore — re-bloating the vault; (b) flag the disabled transcript-capture.py hook as a regression and try to re-enable it; (c) apply its own submission gate without incorporating the new institutional-only discipline rule, producing the same bloat pattern over time.
You also own architecture/orchestration quality-gating — this cleanup + discipline rule is architectural hygiene your lane should formally own going forward.
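Failure mode (a) above can be avoided if the sweep distinguishes "target was archived on purpose" from "target is genuinely missing" before it acts. The sketch below is one way to do that, assuming wikilinks resolve by note file stem and that anything under a directory named _archive counts as archived. The function names are hypothetical and the real sweep may resolve links differently.

```python
from pathlib import Path

ARCHIVE_DIRNAME = "_archive"  # assumption: every archive lives under this folder name


def collect_slugs(vault_root: Path):
    """Split note stems into (live, archived) sets based on their path."""
    live, archived = set(), set()
    for path in vault_root.rglob("*.md"):
        rel = path.relative_to(vault_root)
        bucket = archived if ARCHIVE_DIRNAME in rel.parts else live
        bucket.add(path.stem)
    return live, archived


def classify_link(slug, live_slugs, archived_slugs):
    """Classify one wikilink target; only 'broken' should ever be reported."""
    if slug in live_slugs:
        return "ok"
    if slug in archived_slugs:
        return "archived"  # intentional: leave or scrub the backlink, never restore
    return "broken"
```

With this split, a link whose target sits in the archive is never reported as broken, so nothing downstream has a reason to attempt a restore.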
Action Required
On next weekly vault-hygiene-sweep (Sun 01:30 IST) and thereafter:
1. TREAT _archive/cleanup-2026-04-17-conversation-flush-bulk/ AS INTENTIONAL ARCHIVE. Do not auto-restore any file from it. Do not try to “fix broken backlinks” by pulling files back into the live tree. If a wiki entity page has backlinks pointing into the archive, either leave them (archive reference is OK) or scrub the backlink (see the 8-file scrub pattern in commit e206c4e). Never restore.
2. DO NOT RE-ENABLE transcript-capture.py. The DISABLED = True flag at the top of both copies is intentional. Re-enable criteria documented in the hook’s docstring + aj-workspace evolution-backlog 17-Apr-2026: (a) script must call vault_search before every submission and skip on ≥3 similar-title matches, (b) hard cap ≤3 submissions per session, (c) extraction prompt must apply the institutional-only gate from feedback_vault_institutional_only.md. All three required before re-enable.
3. ADOPT THE INSTITUTIONAL-ONLY GATE FOR YOUR OWN vault_submit_* CALLS. The 4-test filter in feedback_vault_institutional_only.md: (a) Cross-agent benefit: would an agent in another division retrieve this and behave differently? (b) Non-stack-specific: strip framework/library name, is the pattern still valuable? (c) Search-first: vault_search returned <3 similar-title matches? (d) Would the user add this? If unsure → don’t submit. Fail any test → route the learning to an agent’s local memory (error-playbook / session-learnings / feedback), not vault.
4. (Optional, architectural work in your lane): Rewrite /home/claude/.claude/scripts/conversation-flush.py with the three preconditions in item 2. Once rewritten and verified, Claude Code can flip DISABLED = False. Include a test harness that submits 10 known patterns and verifies dedup rejects 7+ near-dupes before allowing production use.
5. Update your weekly sweep to include the duplicate-cluster check that session-close Phase 2 now runs (slug-prefix grouping, flag clusters ≥4 copies). Reference: /home/claude/.claude/skills/session-close/references/phase-2-pristine-state.md § Vault Bloat Audit.
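For the duplicate-cluster check in the weekly sweep, slug-prefix grouping could look roughly like the sketch below. The ≥4 threshold comes from this handoff. The definition of "prefix" as the first two hyphen-separated tokens is an assumption made for illustration, and the function names are hypothetical; the Vault Bloat Audit reference is the authority on the actual rule.

```python
from collections import defaultdict

CLUSTER_THRESHOLD = 4  # flag clusters with at least this many copies
PREFIX_TOKENS = 2      # assumption: a prefix is the first two hyphen-separated tokens


def slug_prefix(slug, tokens=PREFIX_TOKENS):
    """Reduce a slug such as 'stop-hook-timeout-fix' to 'stop-hook'."""
    return "-".join(slug.split("-")[:tokens])


def find_duplicate_clusters(slugs, threshold=CLUSTER_THRESHOLD):
    """Group slugs by prefix and return only the groups at or above the threshold."""
    groups = defaultdict(list)
    for slug in slugs:
        groups[slug_prefix(slug)].append(slug)
    return {
        prefix: sorted(members)
        for prefix, members in groups.items()
        if len(members) >= threshold
    }
```

A two-token prefix would have caught the stop-hook and drizzle-fk clusters from this cleanup; three-token families such as react-native-web would collapse to react-native, which may or may not be the grouping you want, so treat the token count as a tunable.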
Acknowledge when received.