
[skyrl-train] Enforce eager by default #569

Merged
SumanthRH merged 2 commits into NovaSky-AI:main from SumanthRH:enforce-eager
Oct 29, 2025

Conversation

@SumanthRH
Member

What does this PR do?

Set `enforce_eager` to true by default.

We've seen issues with CUDA graphs affecting convergence on some training runs. This is likely drift from a mismatch between training and rollout logprobs, but we need to investigate this further.

Until then, it is best to turn off CUDA graphs by default. Training runs are then stable out of the box, and users can opt back in for better performance.
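For reference, a minimal sketch of what the changed default could look like in `ppo_base_config.yaml`. The surrounding key structure here is an assumption for illustration, not the file's actual layout:

```yaml
# Excerpt (hypothetical layout) from ppo_base_config.yaml
generator:
  # Disable CUDA graphs by default: prioritize convergence stability over
  # inference throughput. Users who need the extra performance can set
  # this back to false explicitly.
  enforce_eager: true
```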

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
@SumanthRH
Member Author

The only example scripts where we explicitly set `enforce_eager=false` are the FlashRL scripts, which IMO are fine to stay as-is due to TIS.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request changes the default configuration to enforce eager execution by setting enforce_eager: true in ppo_base_config.yaml. This is a sensible change to prioritize training stability over performance by default, as explained in the description. My review includes a suggestion to add a comment to the configuration file to document this trade-off for future reference. Additionally, I noticed that the tests in tests/gpu/gpu_ci/test_engine_generation.py use a hardcoded configuration for creating inference engines. It would be beneficial to refactor the test setup to use the configuration from the YAML file to ensure tests remain aligned with the application's behavior, improving maintainability.
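As a rough illustration of the reviewer's point about hardcoded test configuration: a test helper could derive engine kwargs from the shared config rather than duplicating values inline. Everything below (the helper name, the dependency-free parsing, the config excerpt) is a hypothetical sketch, not code from the repository; the real test suite would use the project's own config loader.

```python
# Hypothetical sketch: derive engine kwargs from the shared YAML config
# instead of hardcoding them in test setup, so tests stay aligned with
# the application's defaults.

def load_default_engine_kwargs(config_text: str) -> dict:
    """Extract flat boolean keys from a YAML-like config excerpt.

    A real test would load ppo_base_config.yaml with the project's config
    machinery; this dependency-free version only handles `key: true/false`
    lines, which is enough to illustrate the idea.
    """
    kwargs = {}
    for line in config_text.splitlines():
        line = line.strip()
        if ":" in line and not line.startswith("#"):
            key, _, value = line.partition(":")
            value = value.strip()
            if value in ("true", "false"):
                kwargs[key.strip()] = value == "true"
    return kwargs


config_text = """
# excerpt mirroring the changed default (key name assumed)
enforce_eager: true
"""
print(load_default_engine_kwargs(config_text))  # {'enforce_eager': True}
```

With a helper like this, flipping the default in the YAML file automatically propagates to the engine-creation tests, which is the maintainability win the review is suggesting.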

x
Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
@SumanthRH merged commit 5157c5f into NovaSky-AI:main Oct 29, 2025
3 checks passed
li-boxuan pushed a commit to li-boxuan/SkyRL that referenced this pull request Nov 23, 2025
# What does this PR do?

Set `enforce_eager` to true by default. 

We've seen issues with CUDA graphs affecting convergence on some
training runs. This is likely drift from a mismatch between training and
rollout logprobs, but we need to investigate this further.

Until then, it is best to turn off CUDA graphs by default. Training runs
are then stable out of the box, and users can opt back in for better
performance.

---------

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
dzorlu pushed a commit to fleet-ai/SkyRL that referenced this pull request Feb 4, 2026


2 participants