CRO heartbeat drift alert — research cadence + graph integrity

What

Heartbeat check at 2026-05-03T15:25:31Z surfaced three CRO cadence/integrity gaps:

  1. Daily signal-capture output missing for 2026-05-03: no file dated today was found in /home/openclaw/.openclaw/workspace/night-shift/; the only recent relevant files were draft-2026-05-02.md and long-form-2026-05-05-ai-infrastructure-gap.md, modified around 15:09–15:11 UTC.
  2. Sunday weekly deep-check artifacts not found: vault searches for "Weekly Intel Brief" and "steelman" both returned no matches; the workspace search found only old material (cmo-brief-2026-04-21-gartner-tiered-architecture.md).
  3. Graphiti freshness/integrity risk: graphiti-mcp.get_status is OK, but get_episodes returned no episodes for the default group plus the main, st-wisdom, openclaw, vault, and researcher groups. Vault has many files modified in the last 24h, so the vault↔Graphiti delta is materially above the 10% alert threshold. Follow-up at 2026-05-03T23:23Z found the concrete failure mode: add_memory queues successfully, but the Graphiti worker fails processing with OpenAI 401 (Incorrect API key provided: not-need...auth), while the embeddings call to gemini-embedding-2 succeeds.
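
The threshold comparison in item 3 can be sketched as a simple ratio check. This is a minimal sketch, assuming the delta means the share of recent vault writes that have no matching Graphiti episode; the source states only the 10% threshold, so the formula and both function names are illustrative assumptions rather than the heartbeat's actual logic.

```python
ALERT_THRESHOLD = 0.10  # the 10% alert threshold named in the heartbeat check


def vault_graphiti_delta(vault_writes: int, indexed_episodes: int) -> float:
    """Share of recent vault writes with no matching episode (assumed definition)."""
    if vault_writes <= 0:
        return 0.0  # nothing was written, so nothing can be missing
    missing = max(vault_writes - indexed_episodes, 0)
    return missing / vault_writes


def delta_breaches_threshold(vault_writes: int, indexed_episodes: int,
                             threshold: float = ALERT_THRESHOLD) -> bool:
    """True when the unindexed share is strictly above the alert threshold."""
    return vault_graphiti_delta(vault_writes, indexed_episodes) > threshold
```

Under this definition, any active vault with zero episodes yields a delta of 1.0, which is why an empty get_episodes result breaches the threshold regardless of the exact write count.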

Why it matters

CRO operating contract says research that is not indexed decays. Missing daily signal capture + missing weekly Intel Brief/steelman means the research cadence is not producing decision-ready output. Graphiti showing no episodes while vault is active means dual-write cannot be trusted until group routing / emitter health is verified.

Evidence

  • find /home/openclaw/.openclaw/workspace/night-shift ... today returned no 2026-05-03 file.
  • obs search content "Weekly Intel Brief" --limit 10 → No matches found.
  • obs search content "steelman" --limit 10 → No matches found.
  • mcporter call "https://graphiti-mcp.arjtech.in/mcp.get_status" → OK / connected to falkordb.
  • mcporter call "https://graphiti-mcp.arjtech.in/mcp.get_episodes" → No episodes found across tested groups.
  • mcporter call "https://graphiti-mcp.arjtech.in/mcp.add_memory" ... group_id="researcher" → queued, then docker logs oracle-graphiti-mcp showed worker failure: OpenAI Authentication Error: 401 Incorrect API key provided: not-need...auth; gemini-embedding-2:batchEmbedContents returned HTTP 200 during search, so the current blocker appears to be Graphiti LLM client auth/config, not embeddings.
  • find /opt/second-brain/vault -mtime -1 showed active vault writes through 2026-05-03 15:22 UTC.
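
The vault-activity check in the last bullet can be approximated without shelling out. The following is a rough sketch that approximates find -mtime -1 by collecting files whose modification time falls within the last 24 hours; the function name and its parameters are assumptions for illustration, not part of the existing tooling.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def files_modified_within(root: str, hours: float = 24.0,
                          now: Optional[float] = None) -> List[Path]:
    """Return files under root whose mtime is within the last `hours` hours."""
    cutoff = (time.time() if now is None else now) - hours * 3600
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.stat().st_mtime >= cutoff:
                    recent.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(recent)
```

Calling len(files_modified_within("/opt/second-brain/vault")) would give a 24-hour write count of the kind reported in the follow-ups, assuming the process can read the vault directory.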

Follow-up — 2026-05-04T07:23Z / 12:53 IST

The drift is still active:

  • May 4 signal capture missing after 09:00 IST — no 2026-05-04 / 04-May-2026 / 20260504 file found under /home/openclaw/.openclaw/workspace/night-shift/.
  • Graphiti still stale: get_status remains OK, but get_episodes for researcher still returns no episodes.
  • AEO cadence cannot be verified: https://arjtech.in/sitemap.xml, https://www.arjtech.in/sitemap.xml, and sitemap index variants fail normal TLS validation with a self-signed certificate; with curl -k, all tested sitemap URLs return 404. robots.txt also failed. This means the heartbeat’s “backed by arjtech.in sitemap count” check currently has no usable source of truth.
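
The filename search in the first bullet covers three date spellings (2026-05-04, 04-May-2026, 20260504). A minimal sketch of that match is shown here, assuming a signal-capture file is recognizable by one of these tokens appearing in its name; date_tokens and find_dated_files are hypothetical helper names.

```python
from datetime import date
from pathlib import Path
from typing import List

# Explicit month names so the result does not depend on the system locale.
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def date_tokens(day: date) -> List[str]:
    """The three date spellings searched for in the follow-up."""
    return [
        day.strftime("%Y-%m-%d"),
        f"{day.day:02d}-{MONTH_ABBR[day.month - 1]}-{day.year}",
        day.strftime("%Y%m%d"),
    ]


def find_dated_files(directory: str, day: date) -> List[Path]:
    """Files directly under directory whose name contains any date token."""
    tokens = date_tokens(day)
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and any(tok in p.name for tok in tokens)
    )
```

A name match only shows that a dated file exists; whether it is actually the daily signal-capture artifact still needs a content check.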

Follow-up — 2026-05-04T15:23Z / 20:53 IST

Root cause narrowed for Graphiti:

  • Live config intends to use Claude OAuth proxy via OpenAI-compatible endpoint: /opt/oracle/graphiti/config.yaml has llm.provider: openai, model: claude-haiku-4-5-20251001, and providers.openai.api_url: http://host.docker.internal:18208/v1.
  • The proxy is reachable from both host and oracle-graphiti-mcp: /health returns backend: claude-oauth-proxy; /v1/models lists claude-haiku-4-5-20251001.
  • Code inspection shows the likely bug: /opt/oracle/graphiti/repo/mcp_server/src/services/factories.py OpenAI LLM branch builds CoreLLMConfig(api_key, model, small_model, temperature, max_tokens) but does not pass base_url=config.providers.openai.api_url. Embedder and Groq branches do pass base URLs. Result: Graphiti falls back to real OpenAI with placeholder OPENAI_API_KEY=not-need...auth, causing the observed 401.
  • Reversible fix candidate: add base_url=config.providers.openai.api_url to the OpenAI LLM CoreLLMConfig, rebuild/restart oracle-graphiti-mcp, then run an add_memory → get_episodes verification.
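
The shape of the fix candidate can be illustrated in isolation. This is a hedged sketch, not the contents of factories.py: LLMConfigSketch and OpenAIProviderSettings are hypothetical stand-ins for CoreLLMConfig and config.providers.openai, the default values are invented, and the API key is a placeholder. Only the field names and the proxy URL come from the notes above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMConfigSketch:
    """Hypothetical stand-in for the CoreLLMConfig built in factories.py."""
    api_key: str
    model: str
    small_model: Optional[str] = None
    temperature: float = 0.0      # assumed default, for illustration only
    max_tokens: int = 4096        # assumed default, for illustration only
    base_url: Optional[str] = None


@dataclass
class OpenAIProviderSettings:
    """Hypothetical stand-in for config.providers.openai."""
    api_key: str
    api_url: Optional[str] = None


def build_openai_llm_config(provider: OpenAIProviderSettings, model: str,
                            small_model: Optional[str] = None) -> LLMConfigSketch:
    """Build the LLM config and forward the provider's endpoint as base_url."""
    return LLMConfigSketch(
        api_key=provider.api_key,
        model=model,
        small_model=small_model,
        base_url=provider.api_url,  # the argument the OpenAI branch omits today
    )


provider = OpenAIProviderSettings(
    api_key="placeholder-key",
    api_url="http://host.docker.internal:18208/v1",
)
cfg = build_openai_llm_config(provider, model="claude-haiku-4-5-20251001")
```

With base_url left as None, an OpenAI-compatible client would typically fall back to its default endpoint, which matches the observed 401 against real OpenAI.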

Follow-up — 2026-05-04T23:23Z / 05-May 04:53 IST

Graphiti remains stale (get_status OK; get_episodes(["researcher"]) empty). No new auth errors appeared because no fresh ingestion attempt ran. I captured a ready patch candidate and verification checklist in workspace: /home/openclaw/.openclaw/workspace/night-shift/graphiti-root-cause-2026-05-05.md.

Follow-up — 2026-05-05T07:23Z / 12:53 IST

Still unresolved. Vault had 44 file writes in the last 24h, while get_episodes(["researcher"]) remains empty. The OpenAI LLM branch is still unpatched: factories.py has no base_url=config.providers.openai.api_url in the OpenAI CoreLLMConfig block (only the embedder branch has it). I created a rollback-safe maintenance runbook at /home/openclaw/.openclaw/workspace/night-shift/graphiti-remediation-runbook-2026-05-05.md.

Today’s night-shift candidates exist (long-form-2026-05-05-ai-infrastructure-gap.md, cco-cadence-handoff-2026-05-05.md, graphiti-root-cause-2026-05-05.md, cco-cadence-escalation-2026-05-05.md), but none is clearly the CRO daily deep-read signal artifact. arjtech.in sitemap/robots still returns 404 with TLS bypass.

Follow-up — 2026-05-05T15:23Z / 20:53 IST

Still unresolved:

  • Graphiti OpenAI LLM branch is still unpatched (base_url=config.providers.openai.api_url absent); oracle-graphiti-mcp has been up 2 days, so no maintenance restart occurred.
  • Graphiti health remains green but get_episodes(["researcher"]) is still empty while vault had 49 file writes in the last 24h.
  • arjtech.in / www.arjtech.in sitemap and robots.txt still return 404 with TLS bypass.
  • I added an AEO apex-route/sitemap remediation note at /home/openclaw/.openclaw/workspace/night-shift/aeo-apex-route-runbook-2026-05-05.md. It identifies likely missing apex/www Traefik routing and missing sitemap.xml; local crawl files found only under Mission Control (robots.txt, llms.txt).

Follow-up — 2026-05-05T23:23Z / 06-May 04:53 IST

Still unresolved; no user interruption is warranted beyond this handoff:

  • Handoff remains pending.
  • Graphiti health remains OK, but get_episodes(["researcher"]) remains empty.
  • OpenAI LLM branch remains unpatched; oracle-graphiti-mcp has now been up 3 days.
  • Vault had 59 file writes in the last 24h, so vault↔Graphiti delta remains materially above threshold.
  • https://arjtech.in/sitemap.xml and https://arjtech.in/robots.txt still return 404 with TLS bypass.
  • I created a ready-to-apply patch artifact at /home/openclaw/.openclaw/workspace/night-shift/graphiti-openai-base-url.patch for the Graphiti base_url fix.

Follow-up — 2026-05-06T07:23Z / 12:53 IST

Still unresolved:

  • Graphiti health remains OK, but get_episodes(["researcher"]) remains empty.
  • OpenAI LLM branch remains unpatched; oracle-graphiti-mcp has been up 3 days.
  • Vault had 69 file writes in the last 24h, so vault↔Graphiti delta remains materially above threshold.
  • https://arjtech.in/sitemap.xml and https://arjtech.in/robots.txt still return 404 with TLS bypass.
  • Today’s workspace had inbox-2026-05-06.md and cco-cadence-escalation-2026-05-06.md; I added an explicit CRO operational signal capture at /home/openclaw/.openclaw/workspace/night-shift/cro-operational-signal-2026-05-06.md, with the caveat that it is not a full daily deep-read research block.

Follow-up — 2026-05-06T15:23Z / 20:53 IST

Still unresolved. I added a reusable non-mutating verifier so future checks do not depend on ad hoc commands:

  • Script: /home/openclaw/.openclaw/workspace/night-shift/cro-heartbeat-verifier.sh
  • Latest output: /home/openclaw/.openclaw/workspace/night-shift/cro-heartbeat-verifier-latest.txt
  • Current verifier result: Graphiti patch absent; oracle-graphiti-mcp healthy but researcher episodes empty; vault writes last 24h = 69; arjtech.in sitemap/robots = 404; today’s artifacts include inbox/CCO escalation/CRO operational signal but still no full CRO deep-read artifact.

Action required

NOVA / CRO cadence owner:

  1. Confirm whether May 3 and May 4 deep-read outputs exist under different paths; if not, backfill short signal-capture notes.
  2. Produce or explicitly waive this week’s 5-signal Intel Brief and one steelman exercise.
  3. Fix Graphiti LLM client auth/config (or switch its extraction LLM to a valid configured provider), then verify episode group routing and ST/Vault mirror emitters. If Graphiti intentionally uses another group, document canonical group(s) in CRO heartbeat instructions.
  4. Fix or document the AEO cadence source of truth: valid TLS + sitemap/robots for arjtech.in, or an alternate canonical publish log.
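
Once item 4 is resolved, the sitemap-backed count could be checked along these lines. This is a sketch under assumptions: it expects a standard sitemaps.org urlset document, relies on urllib's default certificate verification (so a self-signed certificate is reported as unreachable rather than bypassed), and uses hypothetical function names.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def count_sitemap_urls(xml_bytes: bytes) -> int:
    """Count <url> entries in a sitemaps.org urlset document."""
    root = ET.fromstring(xml_bytes)
    return len(root.findall(f"{SITEMAP_NS}url"))


def fetch_sitemap_count(url: str, timeout: float = 10.0) -> Tuple[Optional[int], str]:
    """Fetch a sitemap with TLS verification on; return (count, status)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return count_sitemap_urls(resp.read()), "ok"
    except urllib.error.HTTPError as exc:   # e.g. the 404 seen in the follow-ups
        return None, f"http {exc.code}"
    except urllib.error.URLError as exc:    # includes certificate failures
        return None, f"unreachable: {exc.reason}"
    except ET.ParseError:
        return None, "invalid sitemap XML"
```

Keeping verification on is deliberate: a count obtained only with a TLS bypass would not be a trustworthy source of truth for the AEO cadence check.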

Suggested kill criterion

Close this handoff only when: (a) May 3/May 4 signal artifacts are found/backfilled, (b) weekly brief + steelman are found/backfilled/waived, (c) get_episodes against the canonical Graphiti group returns at least one episode less than 24h old or heartbeat instructions are corrected to use the actual source of truth, and (d) the AEO cadence check can count weekly Q&A posts from a valid sitemap or canonical publish log.
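
Criterion (c) can be expressed as a small freshness check. This is a minimal sketch, assuming episode creation times have already been extracted from the get_episodes response as ISO 8601 strings; the response format is not documented here, so that extraction step and the function name are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def has_fresh_episode(episode_timestamps: Iterable[str], now: datetime,
                      max_age: timedelta = timedelta(hours=24)) -> bool:
    """True if at least one episode timestamp is less than max_age old."""
    for raw in episode_timestamps:
        # Accept a trailing "Z" on older Python versions of fromisoformat.
        created = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if created.tzinfo is None:
            created = created.replace(tzinfo=timezone.utc)  # assume UTC
        age = now - created
        if timedelta(0) <= age < max_age:
            return True
    return False
```

An empty result, as in every follow-up so far, returns False, so criterion (c) stays open until ingestion actually produces a recent episode.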