Skip to content

[skyrl-train] assert that the policy loss type is regular/dual clip for tis#546

Merged
erictang000 merged 2 commits intoNovaSky-AI:mainfrom
erictang000:tis_check
Oct 21, 2025
Merged

[skyrl-train] assert that the policy loss type is regular/dual clip for tis#546
erictang000 merged 2 commits intoNovaSky-AI:mainfrom
erictang000:tis_check

Conversation

@erictang000
Copy link
Collaborator

TIS is currently only enabled for policy loss types that use the ppo_policy_loss code path

@erictang000 erictang000 requested a review from SumanthRH October 21, 2025 19:27
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds a validation check to ensure that Truncated Importance Sampling (TIS) is only used with compatible policy loss types, which is a good safeguard. My feedback suggests a small improvement to make this validation more robust by using a ValueError instead of an assert, aligning with the existing validation patterns in the file.

raise ValueError(
"Gneration with `trainer.algorithm.use_tis` needs to be batched with only single turn generation"
)
assert cfg.trainer.algorithm.policy_loss_type in ["regular", "dual_clip"], "TIS is only implemented for regular and dual_clip policy loss types"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For configuration validation, it's better to raise a ValueError instead of using assert. assert statements can be disabled when Python is run with the -O (optimize) flag, which would cause this important validation to be skipped. Using ValueError ensures the check is always performed and is consistent with other checks in this function.

Suggested change
assert cfg.trainer.algorithm.policy_loss_type in ["regular", "dual_clip"], "TIS is only implemented for regular and dual_clip policy loss types"
if cfg.trainer.algorithm.policy_loss_type not in ["regular", "dual_clip"]:
raise ValueError(
f"TIS is only implemented for 'regular' and 'dual_clip' policy loss types, but got '{cfg.trainer.algorithm.policy_loss_type}'"
)

@erictang000 erictang000 merged commit b714003 into NovaSky-AI:main Oct 21, 2025
3 checks passed
@erictang000 erictang000 deleted the tis_check branch October 21, 2025 19:33
atemaguer pushed a commit to atemaguer/SkyRL that referenced this pull request Oct 24, 2025
…or tis (NovaSky-AI#546)

TIS is currently only enabled for policy loss types that use the
`ppo_policy_loss` code path
li-boxuan pushed a commit to li-boxuan/SkyRL that referenced this pull request Nov 23, 2025
…or tis (NovaSky-AI#546)

TIS is currently only enabled for policy loss types that use the
`ppo_policy_loss` code path
dzorlu pushed a commit to fleet-ai/SkyRL that referenced this pull request Feb 4, 2026
…or tis (NovaSky-AI#546)

TIS is currently only enabled for policy loss types that use the
`ppo_policy_loss` code path
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants