feat: SSE keepalive for chat completions streaming + fix cancellation propagation #1745

Open
spitfire55 wants to merge 2 commits into exo-explore:main from spitfire55:feat/sse-keepalive-and-cancellation-fix

@spitfire55 spitfire55 commented Mar 16, 2026

Summary

Two related fixes for the OpenAI chat completions SSE streaming endpoint:

1. SSE keepalive during silent streaming periods

During thinking phases, the model generates tokens but the parser consumes them, so no data flows on the SSE stream. HTTP connections can drop after prolonged silence.

This PR wraps the chat completions SSE output with sse_with_keepalive(), which sends an SSE comment (: keepalive\n\n) every 15 seconds while no data chunks are available. SSE comments are part of the spec and are ignored by compliant clients.
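
A minimal sketch of what such a wrapper could look like, assuming a plain asyncio queue plus a producer task. The signature, the `interval` parameter, and the internal `("data" | "error" | "done", payload)` tuples are illustrative assumptions, not the code in this PR's diff:

```python
import asyncio
from collections.abc import AsyncIterator

KEEPALIVE_COMMENT = ": keepalive\n\n"  # SSE comment line, ignored by clients


async def sse_with_keepalive(
    source: AsyncIterator[str], interval: float = 15.0
) -> AsyncIterator[str]:
    """Yield chunks from `source`, plus a comment whenever it is silent for `interval` seconds."""
    queue: asyncio.Queue = asyncio.Queue()

    async def produce() -> None:
        # Runs in its own task so the consumer loop can time out independently.
        try:
            async for chunk in source:
                await queue.put(("data", chunk))
        except Exception as exc:  # cancellation is a BaseException and passes through
            await queue.put(("error", exc))
        else:
            await queue.put(("done", None))

    task = asyncio.create_task(produce())
    try:
        while True:
            try:
                kind, payload = await asyncio.wait_for(queue.get(), timeout=interval)
            except asyncio.TimeoutError:
                # Nothing arrived within `interval`: keep the connection warm.
                yield KEEPALIVE_COMMENT
                continue
            if kind == "data":
                yield payload
            elif kind == "error":
                raise payload
            else:
                return
    finally:
        # On client disconnect (or normal completion) stop the producer.
        # Cancelling it raises asyncio.CancelledError inside `source`.
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
```

In this sketch the producer is cancelled from the `finally` block, which is why the wrapped stream sees `asyncio.CancelledError` when the client goes away.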

2. Catch asyncio.CancelledError in stream cleanup

_token_chunk_stream caught only anyio.get_cancelled_exc_class() for cancellation cleanup. Because the keepalive wrapper runs its producer with asyncio.create_task, a client disconnect arrives as asyncio.CancelledError and bypasses the cleanup that sends TaskCancelled to the worker. The worker was therefore left generating indefinitely after the client disconnected.

Now catches both anyio.get_cancelled_exc_class() and asyncio.CancelledError.
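
The shape of the cleanup can be sketched as follows. `token_chunk_stream`, `cancellation_excs`, and the `send_task_cancelled` callback are hypothetical stand-ins for the real `_token_chunk_stream` and its worker messaging. The anyio import is made optional only so the sketch runs without it:

```python
import asyncio
from collections.abc import AsyncIterator, Callable


def cancellation_excs() -> tuple[type[BaseException], ...]:
    """Exception classes treated as cancellation (call from inside a running loop)."""
    try:
        import anyio  # optional in this sketch only
    except ImportError:
        return (asyncio.CancelledError,)
    return (anyio.get_cancelled_exc_class(), asyncio.CancelledError)


async def token_chunk_stream(
    chunks: AsyncIterator[str], send_task_cancelled: Callable[[], None]
) -> AsyncIterator[str]:
    """Forward chunks; on cancellation, notify the worker and re-raise."""
    try:
        async for chunk in chunks:
            yield chunk
    except cancellation_excs():
        # Hypothetical hook standing in for "send TaskCancelled to the worker".
        send_task_cancelled()
        raise  # never swallow cancellation
```

Re-raising after the cleanup keeps cancellation semantics intact for whichever task or scope initiated it.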

Test plan

  • SSE keepalive comments sent during long thinking phases
  • Worker stops generating within seconds of client disconnect
  • Normal streaming unaffected
  • Stream terminates cleanly with data: [DONE] on normal completion
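
One way the first and last items could be checked from the client side is to parse a captured response body and count comment lines separately from data payloads. The `parse_sse` helper below is a hypothetical, simplified parser (it assumes `\n` line endings and only handles `data:` fields and comments), not part of this PR:

```python
def parse_sse(raw: str) -> tuple[list[str], int]:
    """Return (data payloads, number of comment lines) from raw SSE text."""
    payloads: list[str] = []
    comments = 0
    for event in raw.split("\n\n"):  # events are separated by a blank line
        data_lines = []
        for line in event.split("\n"):
            if line.startswith(":"):
                comments += 1  # comment line, e.g. ": keepalive"
            elif line.startswith("data:"):
                # Drop the field name and at most one leading space.
                data_lines.append(line[5:].removeprefix(" "))
        if data_lines:
            payloads.append("\n".join(data_lines))
    return payloads, comments


captured = 'data: {"x": 1}\n\n: keepalive\n\n: keepalive\n\ndata: [DONE]\n\n'
payloads, keepalives = parse_sse(captured)
```

With the sample body above, `payloads` ends in `[DONE]` and the two keepalive comments are counted but never surface as data.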

🤖 Generated with Claude Code

spitfire55 and others added 2 commits March 16, 2026 17:35
anyio task groups can't be used with yield (crossing task boundaries).
Instead, wrap the final SSE string stream at the StreamingResponse
level using a plain asyncio queue + task pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The sse_with_keepalive wrapper uses asyncio.create_task for its
producer. When the HTTP client disconnects, the producer task gets
cancelled with asyncio.CancelledError. But _token_chunk_stream only
catches anyio's cancellation exception, so it never sends
TaskCancelled to the worker — leaving the model generating
indefinitely after the client is gone.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@spitfire55 spitfire55 changed the title from "feat: SSE keepalive during streaming + fix cancellation propagation" to "feat: SSE keepalive for chat completions streaming + fix cancellation propagation" Mar 17, 2026
@spitfire55 spitfire55 marked this pull request as ready for review March 17, 2026 19:32