feat: SSE keepalive for chat completions streaming + fix cancellation propagation #1745
Open
spitfire55 wants to merge 2 commits into exo-explore:main
anyio task groups can't be used with yield (crossing task boundaries). Instead, wrap the final SSE string stream at the StreamingResponse level using a plain asyncio queue + task pattern. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sse_with_keepalive wrapper uses asyncio.create_task for its producer. When the HTTP client disconnects, the producer task gets cancelled with asyncio.CancelledError. But _token_chunk_stream only catches anyio's cancellation exception, so it never sends TaskCancelled to the worker — leaving the model generating indefinitely after the client is gone. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
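The cancellation problem described in the second commit message can be illustrated with a minimal, stdlib-only sketch. The names `token_chunk_stream` and `notify_cancelled` are hypothetical stand-ins (the latter for whatever sends TaskCancelled to the worker), not the project's actual code. The real handler reportedly also catches `anyio.get_cancelled_exc_class()`; that is left out here so the example needs only `asyncio`.

```python
import asyncio
from collections.abc import AsyncIterator, Callable


async def token_chunk_stream(
    chunks: AsyncIterator[str],
    notify_cancelled: Callable[[], None],
) -> AsyncIterator[str]:
    """Yield chunks; if the consuming task is cancelled, notify the worker."""
    try:
        async for chunk in chunks:
            yield chunk
    except asyncio.CancelledError:
        # Hypothetical stand-in for sending TaskCancelled to the worker,
        # so generation stops once the client is gone.
        notify_cancelled()
        # Always re-raise so the cancellation keeps propagating.
        raise
```

If the `except` clause does not match the exception that actually arrives, `notify_cancelled` is never called, which is the failure mode the commit describes.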
Summary
Two related fixes for the OpenAI chat completions SSE streaming endpoint:
1. SSE keepalive during silent streaming periods
During thinking phases, the model generates tokens but the parser consumes them — no data flows on the SSE stream. HTTP connections can drop after prolonged silence.
Wraps the chat completions SSE output with sse_with_keepalive(), which sends SSE comments (: keepalive\n\n) every 15 seconds when no data chunks are available. SSE comments are part of the spec and are ignored by compliant clients.
2. Catch asyncio.CancelledError in stream cleanup
_token_chunk_stream only caught anyio.get_cancelled_exc_class() for cancellation cleanup. Since the keepalive wrapper uses asyncio.create_task, client disconnects arrive as asyncio.CancelledError, bypassing the cleanup that sends TaskCancelled to the worker. This left the worker generating indefinitely after the client disconnected.
Now catches both anyio.get_cancelled_exc_class() and asyncio.CancelledError.
Test plan
data: [DONE] on normal completion
🤖 Generated with Claude Code
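For readers who want to see the shape of the keepalive wrapper from item 1, here is a minimal sketch of a queue + task pattern using only `asyncio`. It is an illustration under assumptions, not the PR's implementation: the function signature, the `interval` parameter, and the `_DONE` sentinel are invented for this example. Only the function name, the `: keepalive\n\n` comment, and the 15-second default come from the description above.

```python
import asyncio
from collections.abc import AsyncIterator

KEEPALIVE_COMMENT = ": keepalive\n\n"  # SSE comment line, ignored by clients
_DONE = object()  # sentinel marking the end of the source stream (assumed)


async def sse_with_keepalive(
    source: AsyncIterator[str], interval: float = 15.0
) -> AsyncIterator[str]:
    """Forward SSE strings from source, emitting a comment during silence."""
    queue: asyncio.Queue = asyncio.Queue()

    async def produce() -> None:
        try:
            async for chunk in source:
                queue.put_nowait(chunk)
        finally:
            # Signal completion even if the source fails or is cancelled.
            queue.put_nowait(_DONE)

    producer = asyncio.create_task(produce())
    try:
        while True:
            try:
                item = await asyncio.wait_for(queue.get(), timeout=interval)
            except asyncio.TimeoutError:
                # Nothing arrived within the interval: send a keepalive.
                item = KEEPALIVE_COMMENT
            if item is _DONE:
                break
            yield item
    finally:
        # On client disconnect or normal completion, stop the producer.
        producer.cancel()
        await asyncio.gather(producer, return_exceptions=True)
```

Because the producer runs in its own task, cancelling it in the `finally` block is what delivers `asyncio.CancelledError` to the wrapped stream when the consumer goes away. In this sketch, a source that ends with `data: [DONE]\n\n` has that string forwarded as the last item, with no keepalive after it.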