feat: token usage tracking in responses api (#11302)
Conversation
Important: Review skipped. Auto incremental reviews are disabled on this repository; check the settings in the CodeRabbit UI.
Walkthrough: This pull request introduces per-response usage metrics (input_tokens, output_tokens, total_tokens) throughout the codebase.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant OpenAIAPI as OpenAI API
    participant Handler as OpenAI Response Handler
    participant Schema as Message Properties
    participant Response as OpenAI Response Object
    Client->>OpenAIAPI: Send prompt (streaming or non-streaming)
    OpenAIAPI->>Handler: Return response with usage metadata
    alt Streaming Path
        loop For each chunk
            OpenAIAPI-->>Handler: Stream chunk with usage_metadata
            Handler->>Schema: Extract usage (input, output, total tokens)
            Handler->>Handler: Accumulate usage_data
        end
        Handler->>Handler: Message reaches "complete" state
        Handler->>Response: Emit completion event with usage_data
        Response-->>Client: Send response.completed with usage
    else Non-Streaming Path
        Handler->>Schema: Extract usage from result outputs
        Handler->>Handler: Construct usage_data dict
        Handler->>Response: Populate OpenAIResponsesResponse with usage field
        Response-->>Client: Send response with usage_data
    end
    Response-->>Client: Final response includes usage metrics
```
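The streaming path above can be sketched as a small accumulator. This is a hypothetical illustration of the per-chunk accumulation the diagram describes, not the PR's actual handler code; the chunk shape and function name are assumptions.

```python
# Hypothetical sketch of the streaming path: accumulate usage_metadata
# from each streamed chunk into a single usage_data dict that is attached
# to the final response.completed event.

def accumulate_usage(chunks):
    """Sum usage_metadata across streamed chunks (field names illustrative)."""
    usage_data = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for chunk in chunks:
        meta = chunk.get("usage_metadata") or {}
        usage_data["input_tokens"] += meta.get("input_tokens", 0)
        usage_data["output_tokens"] += meta.get("output_tokens", 0)
        usage_data["total_tokens"] += meta.get("total_tokens", 0)
    return usage_data

chunks = [
    {"delta": "Hel", "usage_metadata": {"input_tokens": 12, "output_tokens": 1, "total_tokens": 13}},
    {"delta": "lo", "usage_metadata": {"input_tokens": 0, "output_tokens": 1, "total_tokens": 1}},
]
print(accumulate_usage(chunks))
# → {'input_tokens': 12, 'output_tokens': 2, 'total_tokens': 14}
```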
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 5 | ❌ 2
❌ Failed checks (1 error, 1 warning)
✅ Passed checks (5 passed)
Codecov Report: ❌ Patch coverage is 28.23%. Your patch status has failed because the patch coverage (28.23%) is below the target coverage (40.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #11302      +/-  ##
==========================================
+ Coverage   35.38%   35.45%   +0.07%
==========================================
  Files        1527     1527
  Lines       73484    73563      +79
  Branches    11041    11059      +18
==========================================
+ Hits        25999    26080      +81
+ Misses      46073    46068       -5
- Partials     1412     1415       +3
```

Flags with carried forward coverage won't be shown.
edwinjosechittilappilly left a comment:
Tested in LF. LGTM
Update test_openai_error_propagation to ignore response.completed events emitted by the stream and only validate response.chunk entries. The stream can send a response.completed event containing a nested response id, but per OpenAI spec only response.chunk messages have top-level id/object/delta. Filter chunks accordingly, require at least one response.chunk, and assert the presence of id/object fields.
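A minimal sketch of the filtering the comment above asks for, assuming dict-shaped stream events; the event field names follow the comment, but the helper and its exact shape are illustrative, not the actual test code.

```python
# Illustrative filter for the suggested test_openai_error_propagation update:
# ignore response.completed events (which carry only a nested response id)
# and validate only top-level response.chunk entries.

def validate_stream_events(events):
    """Keep response.chunk events and assert their required top-level fields."""
    chunks = [e for e in events if e.get("object") == "response.chunk"]
    assert chunks, "stream must contain at least one response.chunk"
    for chunk in chunks:
        assert "id" in chunk, "response.chunk must have a top-level id"
        assert "object" in chunk and "delta" in chunk
    return chunks

events = [
    {"id": "chunk_1", "object": "response.chunk", "delta": "Hi"},
    {"object": "response.completed", "response": {"id": "resp_1"}},
]
assert len(validate_stream_events(events)) == 1
```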
@HimavarshaVS @RamGopalSrikar Please coordinate with QA and merge this PR as soon as it is available.
Token Usage Tracking Implementation
- New Usage class with input_tokens, output_tokens, total_tokens fields
- Added usage field to Properties class
- New extract_usage() method to extract token usage from AIMessage response_metadata
- Supports OpenAI format (token_usage.prompt_tokens/completion_tokens/total_tokens)
- Supports Anthropic format (usage.input_tokens/output_tokens)
- Modified _get_chat_result() to set usage on returned Message for non-streaming
- Modified _stream_message() to return usage data from final chunk
- Modified send_message() to set usage on message properties after streaming
- Non-streaming: Extract usage from component_output.results Message properties
- Streaming: Capture usage from add_message event with state=complete
- Added response.completed event with usage in streaming responses
- Added programmatic model configuration for LanguageModelComponent
- Set API key directly in template (workaround for global variable lookup issue in tests)
- Added usage validation assertions
Summary by CodeRabbit
Release Notes