[train] Enable custom chat template for get_response_ids_and_loss_mask_from_messages#981
Conversation
Code Review
This pull request introduces an optional chat_template parameter to several utility functions to allow for custom tokenization, which is crucial for on-policy training with custom agents. The changes are logical and well-implemented, propagating the chat_template through get_generation_prompt_ids, encode_messages_subset, and get_response_ids_and_loss_mask_from_messages. The accompanying tests are thorough, covering both default and custom template behaviors, including edge cases with Qwen3's thinking blocks.
My review focuses on minor code simplifications for improved readability and maintainability. I've suggested simplifying how the chat_template is passed to tokenizer.apply_chat_template and using pathlib for more readable path construction in the tests. Overall, this is a solid contribution.
We add an optional `chat_template` kwarg to `get_response_ids_and_loss_mask_from_messages()`, which is used to tokenize the messages into token IDs for custom agents. The motivation is that if you used a custom chat template to perform the rollout, you should use the same custom chat template to tokenize it.
For more motivation, see the PR description here: mlfoundations#12
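The propagation pattern can be sketched as follows. This is a minimal, hypothetical illustration using a toy tokenizer class, not the actual SkyRL implementation: the key point is that the optional `chat_template` kwarg defaults to `None` and is threaded straight through to `apply_chat_template`, which falls back to the tokenizer's built-in template when no custom one is supplied (mirroring the Hugging Face convention).

```python
from typing import Optional


class ToyTokenizer:
    """Hypothetical stand-in for an HF-style tokenizer (not the real API)."""

    # The tokenizer's built-in default template.
    default_template = "<default>{role}: {content}</default>"

    def apply_chat_template(self, messages, chat_template: Optional[str] = None) -> str:
        # Fall back to the tokenizer's own template when no custom one is given.
        template = chat_template or self.default_template
        return "".join(template.format(**m) for m in messages)


def get_response_ids_and_loss_mask_from_messages(
    tokenizer, messages, chat_template: Optional[str] = None
) -> str:
    # Sketch of the propagation: the optional kwarg is passed straight
    # through to apply_chat_template. The real function would go on to
    # compute token IDs and a loss mask; here we just return the text.
    return tokenizer.apply_chat_template(messages, chat_template=chat_template)


msgs = [{"role": "user", "content": "hi"}]
tok = ToyTokenizer()

# Default behavior: the tokenizer's built-in template is used.
default_out = get_response_ids_and_loss_mask_from_messages(tok, msgs)

# Custom behavior: the caller's template overrides the default, so the
# same template used during rollout can be reused for tokenization.
custom_out = get_response_ids_and_loss_mask_from_messages(
    tok, msgs, chat_template="<custom>{role}: {content}</custom>"
)
```

The design choice worth noting is that `chat_template=None` leaves all existing call sites unchanged, so the kwarg is purely opt-in.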
Note
Token-in-token-out is supported in SkyRLGymGenerator, so this PR is irrelevant to that codepath. For custom agents, truly on-policy training would require step-wise training (support coming soon). But empirically, tokenizing at the end is not catastrophic for many tasks.