The reconnect + events.list backfill in c283d88 is correct but never ran:
the previous v2 Node Function was killed at ~27 s (well before the 20 min
reconnect budget could matter), so streams always died after the first MCP
tool batch.
Move the proxy to a Netlify Edge Function (Deno runtime) which has no
streaming-duration cap as long as we keep writing to the response body.
Same reconnect / backfill / dedupe-by-event-id pattern; same NDJSON wire
protocol to the browser. Implemented with plain fetch() against the
Anthropic REST API (npm packages on Edge are beta) so we have no SDK
runtime dependency.
Frontend now POSTs to /api/agent-proxy. The Anthropic SDK is removed
from the package; @netlify/edge-functions is added for ambient types.
Co-Authored-By: alex <alex@semipublic.co>
The Anthropic Managed-Agents SSE stream (`/v1/sessions/{id}/events/stream`)
is not guaranteed to stay open for the life of a multi-turn session.
Inter-turn gaps (e.g. while MCP tools or the next model request run) of
~25s+ are common and can trigger upstream / proxy read timeouts; when
that happens, the SDK's `for await` iterator just ends and the proxy
silently closed the downstream response, leaving the browser with only
the events from the first tool batch.
Implement the official 'consolidation' pattern: treat any upstream
end-of-stream that isn't a terminal `session.status_*` as a transient
drop, fetch what we missed via `sessions.events.list()` (deduped by
event.id), reopen the live stream, and keep going — up to a 20 minute
wall-clock budget so a wedged session can't pin the function forever.
Also switch the wire format from inline-markdown-with-status-comments to
NDJSON, so the React side can render in-flight tool activity in a
separate status banner from the streaming brief. The existing zero-width
heartbeat is replaced by an explicit `{"type":"heartbeat"}` line.
Drop the `agent.tool_use` write-tool special case — the agent now
streams the brief directly via `agent.message` text blocks.
Co-Authored-By: alex <alex@semipublic.co>