Rewrote file-manager (files.arjtech.in / dl.arjtech.in) archive download path from blocking execFileSync + /tmp materialization + readFileSync-buffer model to streaming archiver→Hono Response model.
Decision
Rewrote the file-manager (files.arjtech.in / dl.arjtech.in) archive download path, replacing the blocking model (execFileSync, /tmp materialization, readFileSync into a buffer) with a streaming model that pipes archiver into a Hono Response. Added HTTP Range/resume support to single-file downloads (createReadStream, 206 Partial Content, 416 on a bad range). Raised the container memory limit from 2G to 4G. Added a non-regular-entry filter (skips sockets, FIFOs, and devices) to the archiver directory callbacks in both /api/download and /api/bulk-download to prevent ENTRYNOTSUPPORTED queue stalls.
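The Range/resume behaviour described above (200 for a full download, 206 for a satisfiable range, 416 for a bad one) can be sketched as a pure function. This is a minimal illustration, not the deployed handler: the names resolveRange, rangeHeaders, and RangeResult are hypothetical, only single ranges are handled, and the choice to serve malformed headers as a plain 200 is an assumption.

```typescript
// Result of evaluating a Range header against a file of known size.
type RangeResult =
  | { status: 200 } // no usable Range header: send the whole file
  | { status: 206; start: number; end: number } // inclusive byte offsets
  | { status: 416 }; // range cannot be satisfied

// Hypothetical helper: decides which response a single-file download should get.
function resolveRange(header: string | undefined, size: number): RangeResult {
  if (!header) return { status: 200 };
  const m = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  // Assumption: malformed or multi-range headers are ignored and served as a full 200.
  if (!m || (m[1] === "" && m[2] === "")) return { status: 200 };

  let start: number;
  let end: number;
  if (m[1] === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffix = Number(m[2]);
    if (suffix === 0 || size === 0) return { status: 416 };
    start = Math.max(size - suffix, 0);
    end = size - 1;
  } else {
    start = Number(m[1]);
    // Open-ended "bytes=N-" runs to EOF; an explicit end is clamped to the last byte.
    end = m[2] === "" ? size - 1 : Math.min(Number(m[2]), size - 1);
  }
  // This sketch answers 416 for a start past EOF and for an inverted range.
  if (start >= size || start > end) return { status: 416 };
  return { status: 206, start, end };
}

// Headers that accompany each outcome.
function rangeHeaders(r: RangeResult, size: number): Record<string, string> {
  if (r.status === 206) {
    return {
      "Accept-Ranges": "bytes",
      "Content-Range": `bytes ${r.start}-${r.end}/${size}`,
      "Content-Length": String(r.end - r.start + 1),
    };
  }
  if (r.status === 416) return { "Content-Range": `bytes */${size}` };
  return { "Accept-Ranges": "bytes", "Content-Length": String(size) };
}
```

For a 206 result, the start and end offsets can be passed straight to fs.createReadStream(path, { start, end }), since Node treats both bounds as inclusive.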
Rationale
The original implementation had five stacked failure modes that would corrupt any large download:
- (1) A 60s hard timeout on tar/zip killed any operation that ran past that threshold, mid-write.
- (2) execFileSync blocked the Node event loop, so the healthcheck would fail during the operation and risk a container restart mid-stream.
- (3) fs.readFileSync(tmpZip) materialized the entire archive in RAM, so the 2GB cgroup cap would OOMKill any download larger than ~1.5GB.
- (4) bulk-download copied everything to /tmp with cpSync before zipping, doubling I/O and adding transient disk pressure.
- (5) zlib compression ran single-threaded and blocking.
The new path streams via the archiver npm package, piped through Readable.toWeb() into a Hono Response: zero /tmp staging, ~250MB peak RAM for a 1.9GB zip (vs ~1GB+ on the old path), and the healthcheck remained responsive throughout the operation (zlib runs in the libuv threadpool, leaving the main thread free). Range/resume on the single-file path supports the same flawless-large-file mandate (verified 200/206/416 responses on a test fixture). The non-regular-entry filter was added after a live integration test revealed that archiver stalled on /root/.claude/remote/rpc.sock; sockets, FIFOs, and char/block devices are now skipped with an info-level log instead of stalling the queue. Reversible: to roll back, redeploy the previous image SHA afdcd244d70c.
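The non-regular-entry filter can be sketched as a small predicate in the shape archiver's directory() data callback expects: return the entry to keep it, or false to skip it. This is a hedged illustration rather than the deployed code. The names keepRegularEntry, StatLike, and EntryLike are hypothetical, and keeping entries that arrive without stats is an assumption.

```typescript
// Minimal structural view of fs.Stats: only the checks this filter needs.
interface StatLike {
  isFile(): boolean;
  isDirectory(): boolean;
  isSymbolicLink(): boolean;
}

// Minimal structural view of the entry data handed to the directory callback.
interface EntryLike {
  name: string;
  stats?: StatLike;
}

// Hypothetical filter: keeps files, directories, and symlinks; skips everything
// else (sockets, FIFOs, char/block devices) with an info-level log line.
function keepRegularEntry<T extends EntryLike>(
  entry: T,
  log: (msg: string) => void = (msg) => console.info(msg),
): T | false {
  const s = entry.stats;
  // Assumption: an entry without stats is passed through unchanged.
  if (!s || s.isFile() || s.isDirectory() || s.isSymbolicLink()) return entry;
  log(`skipping non-regular entry: ${entry.name}`);
  return false;
}

// Possible wiring (srcDir and destPrefix are placeholders):
//   archive.directory(srcDir, destPrefix, (entry) => keepRegularEntry(entry));
```

Returning false drops the entry before it reaches the archiver queue, which is what avoids the stall; the log callback is injectable so that skipped paths such as rpc.sock remain visible in whatever logger the service uses.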
Alternatives Rejected
Outcome
Pending
Related
- fix-sfdescribereporttype-hang-via-two-layer-timeout-enforcem
- two-layer-salesforce-mcp-timeout-enforcement-final-session-l
- ccvs-gate-script-codex-gatesh-corrected-post-pristine-sweep
- two-layer-timeout-enforcement-for-salesforce-mcp-session-lay
- add-userpromptsubmit-hook-codex-findings-surfacepy-to-surfac
- codex-cli-is-set-up-as-first-class-peer-to-claude-code-in-aj
- onboard-new-aws-account-292600392118-runwal-enterprises-limi