Respect caller timestamp options in batched transcribe#1428

Open
ponpaku wants to merge 1 commit into SYSTRAN:master from ponpaku:codex/pr1-batched-caller-options
Conversation

@ponpaku ponpaku commented Mar 16, 2026

Summary

This PR restores caller-provided condition_on_previous_text and max_initial_timestamp in BatchedInferencePipeline.transcribe().

Related issue: #1427

Previously, the batched path hardcoded these two values instead of using the caller's arguments, making it behaviorally non-equivalent to WhisperModel.transcribe() for timestamped decoding.

What this PR changes

  • Pass through condition_on_previous_text in the batched transcription options.
  • Pass through max_initial_timestamp in the batched transcription options.
  • Add regression tests verifying that both caller-provided values are preserved in the batched transcription options.
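The change described above boils down to forwarding the caller's values instead of overwriting them. The sketch below is illustrative only: TranscriptionOptions and build_batched_options are stand-in names, not faster-whisper's actual internals, and the hardcoded values shown for the pre-fix behavior are an assumption.

```python
from dataclasses import dataclass


# Stand-in for the options object the batched pipeline builds
# (hypothetical, not faster-whisper's real class).
@dataclass
class TranscriptionOptions:
    condition_on_previous_text: bool
    max_initial_timestamp: float


def build_batched_options(
    condition_on_previous_text: bool = True,
    max_initial_timestamp: float = 1.0,
) -> TranscriptionOptions:
    # Before the fix, the batched path would override these regardless of
    # what the caller passed (e.g. forcing condition_on_previous_text=False).
    # After the fix, the caller's values flow through unchanged:
    return TranscriptionOptions(
        condition_on_previous_text=condition_on_previous_text,
        max_initial_timestamp=max_initial_timestamp,
    )


opts = build_batched_options(
    condition_on_previous_text=True,
    max_initial_timestamp=0.5,
)
print(opts.condition_on_previous_text, opts.max_initial_timestamp)
```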

What this PR does not try to solve

This PR is intentionally narrow.

It does not attempt to fix all direct/batched quality gaps such as:

  • prompt/history update design
  • fallback behavior differences
  • VAD chunk packing strategy

Validation

  • Added lightweight regression tests for option propagation.
  • Verified the new tests pass locally with pytest.
  • Confirmed on a public FLEURS Japanese fixture that upstream batched transcription reports the overridden values, while the patched branch preserves the caller's values.
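The regression tests mentioned above can be sketched roughly as follows. This is a hedged illustration of the testing pattern, not the PR's actual test code; make_options is a hypothetical stand-in for the batched option-building path under test.

```python
from types import SimpleNamespace


def make_options(**overrides):
    # Hypothetical stand-in for the batched option-building code path:
    # defaults mirror the single-pass transcribe() signature, and any
    # caller-supplied values must win over the defaults.
    defaults = {
        "condition_on_previous_text": True,
        "max_initial_timestamp": 1.0,
    }
    defaults.update(overrides)
    return SimpleNamespace(**defaults)


def test_batched_options_preserve_caller_values():
    # Regression check: caller-provided values must survive into the
    # final options object rather than being silently overridden.
    opts = make_options(
        condition_on_previous_text=False,
        max_initial_timestamp=0.2,
    )
    assert opts.condition_on_previous_text is False
    assert opts.max_initial_timestamp == 0.2


test_batched_options_preserve_caller_values()
print("ok")
```

Under pytest the final direct call is unnecessary; it is included here so the sketch runs standalone.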

@ponpaku force-pushed the codex/pr1-batched-caller-options branch from 2b3f513 to 519481d on March 16, 2026 04:50
@ponpaku force-pushed the codex/pr1-batched-caller-options branch from 519481d to 58d1fe9 on March 16, 2026 05:01