Skip to content

[Fix][Generator] Correct chat history length for retoknenize codepath with env.init#214

Merged
CharlieFRuan merged 2 commits intoNovaSky-AI:mainfrom
CharlieFRuan:fix-0828-init-len
Aug 28, 2025
Merged

[Fix][Generator] Correct chat history length for retoknenize codepath with env.init#214
CharlieFRuan merged 2 commits intoNovaSky-AI:mainfrom
CharlieFRuan:fix-0828-init-len

Conversation

@CharlieFRuan
Copy link
Member

@CharlieFRuan CharlieFRuan commented Aug 28, 2025

This pull request addresses a bug in the retokenize_chat_history codepath within SkyRLGymGenerator. The issue arose when env.init() altered the length of the chat_history, as the original length was incorrectly assumed. The fix introduces initial_chat_history_length to correctly capture the length of the chat history after the env.init() call. This new variable is then used to accurately slice the chat history, separating the initial prompt from the generated turns.

This PR fixes this issue and closes #213

… with env.init

Co-authored-by: Alexander Conway <alex-dr@users.noreply.github.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a bug in the retokenize_chat_history codepath within SkyRLGymGenerator. The issue arose when env.init() altered the length of the chat_history, as the original length was incorrectly assumed. The fix introduces initial_chat_history_length to correctly capture the length of the chat history after the env.init() call. This new variable is then used to accurately slice the chat history, separating the initial prompt from the generated turns. The change is correct and effectively resolves the bug. The implementation is clean and I have no further suggestions.

@CharlieFRuan CharlieFRuan merged commit dabf1ee into NovaSky-AI:main Aug 28, 2025
3 checks passed
@CharlieFRuan CharlieFRuan deleted the fix-0828-init-len branch August 28, 2025 18:17
dzorlu referenced this pull request in fleet-ai/SkyRL Feb 4, 2026
… with env.init (#214)

This pull request addresses a bug in the `retokenize_chat_history`
codepath within SkyRLGymGenerator. The issue arose when env.init()
altered the length of the chat_history, as the original length was
incorrectly assumed. The fix introduces initial_chat_history_length to
correctly capture the length of the chat history after the env.init()
call. This new variable is then used to accurately slice the chat
history, separating the initial prompt from the generated turns.

---------

Co-authored-by: Alexander Conway <alex-dr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Incorrect computation of response_ids with custom chat template and multi-turn conversation when init() modifies chat history

2 participants