fix(agent-proxy): conservative segment budget + drop unnecessary upstream.cancel()

Two related fixes after re-testing on the deploy preview: 1. SEGMENT_BUDGET_MS: 54_000 -> 40_000. Empirically, Netlify Edge Function responses get cut anywhere between ~49 s and ~60 s wall-clock — significantly more variance than the docs imply. With a 54 s budget, our segment_end write + controller.close() sometimes lands AFTER the platform has already killed the response (curl: 'HTTP/2 stream 1 was not closed cleanly before end of the underlying stream'). 40 s gives ~10 s of headroom on the low end of the observed range while still keeping segment overhead reasonable (~1 segment per 40 s of brief). 2. HEARTBEAT_MS: 10_000 -> 5_000. Tighter keep-alive so the connection stays warm even during long model-thinking gaps where the agent isn't emitting events. 3. Drop the 'await upstream.cancel()' between backfill and tail. The upstream fetch's body is already cleaned up by segmentAbort.abort() (or will be by the runtime once we drop our reference). On Deno Edge, awaiting cancel() on an already-aborted body can hang, which prevents segment_end from being written. Co-Authored-By: alex <alex@semipublic.co>
2026-05-13 13:16:44 +00:00
parent df3c8e7d1c
commit 25c4e2442c
1 changed files with 18 additions and 13 deletions
@@ -59,13 +59,19 @@ const ANTHROPIC_VERSION = '2023-06-01'
 const ANTHROPIC_API_BASE = 'https://api.anthropic.com'

 // Downstream keep-alive cadence. The browser's fetch reader appears
-// stuck if no bytes flow for too long.
-const HEARTBEAT_MS = 10_000
+// stuck if no bytes flow for too long. We keep this tighter than the
+// segment budget so the connection stays warm even during long
+// model-thinking gaps with no agent events.
+const HEARTBEAT_MS = 5_000

-// Close the current response just before Netlify's empirical ~60 s
-// streaming cap. Leaves a few seconds for a final `segment_end` write
-// + `controller.close()` to flush cleanly.
-const SEGMENT_BUDGET_MS = 54_000
+// Close the current response well before Netlify's empirical streaming
+// cap. We've observed responses cut anywhere between ~49 s and ~60 s,
+// so a 54 s budget is too aggressive — sometimes the platform pulls
+// the rug out before our final `segment_end` write + `controller.close()`
+// can flush. 40 s gives ~10 s of headroom on the low end of the
+// observed range while still amortising session setup (~1 segment per
+// ~40 s of brief) across a typical 3-5 min brief.
+const SEGMENT_BUDGET_MS = 40_000

 // Total wall-clock allowance across ALL segments for a single user
 // brief. Past this we emit a final `error` line and stop reopening.
@@ -660,13 +666,12 @@ Please follow your instructions to produce the funding outlook brief.`
          }

          if (done || segmenting) {
-            // Drain the live-stream body we opened above so we don't
-            // leak the connection.
-            try {
-              await upstream.cancel()
-            } catch {
-              /* ignore */
-            }
+            // The upstream fetch is already cleaned up by
+            // `segmentAbort.abort()` (or will be by the runtime once
+            // we drop our reference). Don't `await upstream.cancel()`
+            // here — on an already-aborted body that call can hang
+            // on Deno Edge, which would prevent us from writing the
+            // final `segment_end` line.
            break
          }