Experience: Strategic Scan 2026-04-30

What Happened

ST MCP servers (sequential-thinking + vault-mcp) were unreachable during scheduled cron execution. Root cause: VPS CPU steal time (93.6%) degrading all containerized services. Rather than fail the scan, executed all 9 steps manually using documented reasoning frameworks from SKILL.md files.

Key Learnings

  1. Infrastructure degradation reframes strategic diagnosis — The Apr 28 “ceremonial scanning loop” diagnosis was behavioral. Today’s finding reveals it is also structural: the platform cannot execute insights it produces.

  2. MCP server unavailability is a canary — When ST/vault MCPs are unreachable, it signals broader infrastructure distress. This should trigger an immediate health check rather than just logging an error.

  3. Manual ST execution is viable but slower — 35 minutes vs typical 10-15 minutes for automated MCP chain. Quality comparable because frameworks are well-documented in skills.

  4. Prior scan outcomes remain pending — 9 days of scans (Apr 21/23/25/28/30) with zero executed actions. The gap between insight and execution is widening.

What Worked

  • Vault filesystem access remained available (direct disk, not via MCP)
  • Prior scan context was richly documented in vault decisions/active/
  • Error playbook had VPS steal-time pattern already captured (added Apr 30 night shift)

What Didn’t

  • MCP substrate failure prevented automated ST chain
  • No fallback ST endpoint available
  • Could not review ST decision journal for wisdom compounding

Recommendations

  1. Add MCP health check to strategic scan pre-flight
  2. Create local ST fallback (sqlite journal readable via CLI)
  3. Escalate VPS recovery to highest priority — it blocks every other capability