MCP tool hangs Claude session — asyncio.to_thread(blocking_io) without timeout anywhere
Diagnosis
A single MCP tool call hangs indefinitely; Claude session freezes. /health still returns 200 (container alive; FastMCP event loop stuck on one worker thread waiting on a blocked HTTP call). Root cause: asyncio.to_thread(conn.restful, …) wraps a blocking I/O call with no asyncio.wait_for cap; the underlying HTTP session never passes timeout= kwarg → requests defaults to None → infinite wait. Contract.yaml may DOCUMENT a guardrail timeout but if the code never reads/applies it, you have contract-vs-code drift. FastMCP serializes tool dispatch on a small worker pool, so one hung call freezes the entire MCP transport. Empirically surfaced on salesforce-mcp v7.1.0 with sf_describe_report_type on heavy custom report type Booking_With_or_W_o_Loan__c stalling Analytics API past 2 min.
Fix
Two-layer defense in depth: (1) Session-layer backstop in the auth-refreshing HTTP session wrapper: kwargs.setdefault(“timeout”, (10.0, 300.0)) on every oauth_session.request(…). Covers ALL tools as backstop, overridable per-call for bulk/deploy. (2) Per-tool tight cap: wrap each interactive describe/list/read tool’s asyncio.to_thread(…) in asyncio.wait_for(…, 120.0) and return structured {status: error, error_code: TOOL_NAME_TIMEOUT, …} payload on TimeoutError. Add enforcement_layers: field to contract entry naming exact code sites — resolves contract-vs-code drift permanently. Applied: salesforce-mcp v7.1.1 patch 2026-05-11 — RefreshingSalesforceSession.request in core/api_client.py + sf_describe_report_type in server.py. Audit rule: every asyncio.to_thread(blocking_io, …) site MUST have either (a) sibling asyncio.wait_for OR (b) documented session-layer timeout backstop covering the call path.