systemd socket activation for non-LISTEN_FDS daemons with cold-start guard

Context: brainstorm.service crash-looped for 2 weeks (380+ restarts) because Restart=always fought the upstream server’s built-in 30-minute idle exit: systemd restarted the daemon after every clean idle shutdown. Fixed 2026-05-16 via socket activation.

Pattern

For any vendor daemon that binds its own port (i.e. doesn’t speak the systemd LISTEN_FDS protocol) but whose upstream contract includes “I exit cleanly when idle”, the canonical socket-activation setup under systemd uses three units:

  1. <svc>.socket — persistent listener on the user-visible port

    • NO PartOf=<svc>.service on the socket (it must outlive the proxy’s lifecycle)
    • Just [Socket] + ListenStream=127.0.0.1:<port> + [Install] WantedBy=sockets.target
  2. <svc>.service — socket-activated proxy

    • ExecStart=/lib/systemd/systemd-socket-proxyd 127.0.0.1:<backend-port>
    • Requires=<svc>.socket <svc>-backend.service
    • After= both
    • BindsTo=<svc>-backend.service (proxy follows backend down)
    • Cold-start guard (mandatory): ExecStartPre=/bin/bash -c 'for i in {1..50}; do ss -tln | grep -q "127.0.0.1:<backend-port> " && exit 0; sleep 0.1; done; exit 1'. This polls for up to about 5s (50 tries, 0.1s apart) until the backend has actually bound its port, and only then lets systemd-socket-proxyd start. Without it, the first request after each backend idle-exit fails (curl reports HTTP 000, i.e. no response) because the proxy starts before node finishes listen().
  3. <svc>-backend.service — actual daemon on internal backend port

    • ExecStart= real binary, env points it to internal port (NOT the user-visible port)
    • Restart=no — let upstream idle-exit terminate the process cleanly
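
The three units above can be sketched as a small generator that renders them into a scratch directory for review before anything is copied to /etc/systemd/system. This is a minimal sketch, not the deployed units: the function name render_units, the default backend binary path, and the PORT environment variable are assumptions for illustration, since the real launch command and port setting depend on the vendor daemon.

```shell
# Sketch: render <svc>.socket, <svc>.service and <svc>-backend.service
# into a directory. Nothing here talks to systemd.
# Hypothetical: BACKEND_BIN default and the PORT variable name.
render_units() {
  local svc=$1 port=$2 backend_port=$3 outdir=$4
  local backend_bin=${BACKEND_BIN:-/usr/local/bin/vendor-daemon}  # assumed path

  mkdir -p "$outdir"

  # 1. Persistent listener on the user-visible port (no dependency on the proxy).
  cat > "$outdir/$svc.socket" <<EOF
[Unit]
Description=$svc listener

[Socket]
ListenStream=127.0.0.1:$port

[Install]
WantedBy=sockets.target
EOF

  # 2. Socket-activated proxy with the cold-start guard from the list above.
  cat > "$outdir/$svc.service" <<EOF
[Unit]
Description=$svc socket-activated proxy
Requires=$svc.socket $svc-backend.service
After=$svc.socket $svc-backend.service
BindsTo=$svc-backend.service

[Service]
ExecStartPre=/bin/bash -c 'for i in {1..50}; do ss -tln | grep -q "127.0.0.1:$backend_port " && exit 0; sleep 0.1; done; exit 1'
ExecStart=/lib/systemd/systemd-socket-proxyd 127.0.0.1:$backend_port
EOF

  # 3. The real daemon on the internal port; idle exit is left alone.
  cat > "$outdir/$svc-backend.service" <<EOF
[Unit]
Description=$svc backend daemon

[Service]
Environment=PORT=$backend_port
ExecStart=$backend_bin
Restart=no
EOF
}
```

A call such as render_units mysvc 8080 18080 ./units (example values only) writes the three files; after review they would be copied into place, followed by systemctl daemon-reload and systemctl enable --now mysvc.socket.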

Verification matrix

State                                  Request                             Expected
-------------------------------------  ----------------------------------  -------------------------------------
Cold (only socket up, backend down)    curl :/                             HTTP 200, ~140ms (backend spawn cost)
Warm (everything active)               curl :/                             HTTP 200, ~2ms
Force-stop backend, request again      curl :/                             HTTP 200, ~140ms (re-spawn cycle)
(always)                               systemctl is-active <svc>.socket    active (listening)
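
The matrix can be walked through with a short helper. This is a sketch under assumptions: the function names probe and verify_matrix are made up for illustration, the URL is whatever the socket unit listens on, and verify_matrix needs enough privilege to stop the backend unit.

```shell
# probe URL: print "<http_code> <total_ms>ms" for a single request.
# curl reports 000 as the code when no HTTP response arrives.
probe() {
  local url=$1
  curl -s -o /dev/null -m 10 -w '%{http_code} %{time_total}\n' "$url" \
    | awk '{ printf "%s %.0fms\n", $1, $2 * 1000 }'
}

# verify_matrix SVC URL: cold, warm, re-spawn, then the socket state.
verify_matrix() {
  local svc=$1 url=$2
  systemctl stop "$svc-backend.service"   # force the cold state
  echo "cold:     $(probe "$url")"
  echo "warm:     $(probe "$url")"
  systemctl stop "$svc-backend.service"   # force-stop, then request again
  echo "re-spawn: $(probe "$url")"
  echo "socket:   $(systemctl is-active "$svc.socket")"
}
```

The cold and re-spawn lines should show a noticeably higher latency than the warm line; the absolute numbers depend on how long the backend takes to start.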

Why PartOf= on the socket would be wrong

If <svc>.socket has PartOf=<svc>.service, then when the proxy service stops (because the backend died), the socket also stops and the port goes dead. That breaks the “socket always listening, daemon on demand” contract: the first connection after an idle exit gets ECONNREFUSED at the OS level, so systemd never gets a chance to re-activate the service.
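
A quick guard against reintroducing this mistake is to ask systemd what the socket unit declares. The sketch below assumes a systemd recent enough to support systemctl show --value; the function name check_socket_independent is hypothetical.

```shell
# check_socket_independent SVC: fail if <svc>.socket declares any PartOf=.
check_socket_independent() {
  local svc=$1 partof
  # Prints the PartOf= value only; empty when the property is unset.
  partof=$(systemctl show -p PartOf --value "$svc.socket")
  if [ -n "$partof" ]; then
    echo "FAIL: $svc.socket has PartOf=$partof"
    return 1
  fi
  echo "OK: $svc.socket has no PartOf="
}
```

Running it after every unit change keeps the socket's lifecycle decoupled from the proxy's.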

Generalizes to

Any vendor daemon you can’t modify but want socket-activated under systemd — Node servers, Python ASGI servers, Go binaries that don’t read LISTEN_FDS, anything that binds its own port and has an idle-shutdown contract.

Reference deployment

  • Units: /etc/systemd/system/brainstorm.{socket,service} + brainstorm-backend.service
  • Runbook: /root/aj-workspace/docs/runbooks/brainstorm-socket-activation.md
  • Backup of pre-change unit: /etc/systemd/system/brainstorm.service.bak-20260516
  • Upstream daemon: Superpowers visual brainstorming companion (server.cjs from claude-plugins-official/superpowers)
  • Date deployed: 2026-05-16