[skyrl-train] Enforce eager by default by SumanthRH · Pull Request #569 · NovaSky-AI/SkyRL

SumanthRH · 2025-10-24T23:48:19Z

What does this PR do?

Set enforce_eager to true by default.

We've seen CUDA graphs affect convergence on some training runs. This is likely drift caused by a mismatch between train and rollout logprobs, but we need to investigate this further.
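One way to quantify the train <> rollout mismatch described above is to compare the per-token logprobs that the trainer and the inference engine assign to the same sampled tokens. The sketch below is illustrative only: `logprob_mismatch` and its inputs are hypothetical names, not SkyRL APIs, and the choice of statistics is an assumption.

```python
from typing import Dict, Sequence


def logprob_mismatch(
    train_logprobs: Sequence[float],
    rollout_logprobs: Sequence[float],
) -> Dict[str, float]:
    """Summarize the per-token gap between trainer and rollout logprobs.

    Both sequences are assumed to cover the same sampled tokens, in order.
    """
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("logprob sequences must have the same length")
    if not train_logprobs:
        raise ValueError("logprob sequences must not be empty")

    # Absolute per-token difference between the two policies' logprobs.
    diffs = [abs(t - r) for t, r in zip(train_logprobs, rollout_logprobs)]
    return {
        "mean_abs_diff": sum(diffs) / len(diffs),
        "max_abs_diff": max(diffs),
    }
```

Tracking a summary like this with and without CUDA graphs on the same prompts would be one way to check whether the drift hypothesis holds.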

Until then, it is best to turn off CUDA graphs by default. Training runs are then stable by default, and users can opt back into CUDA graphs by setting enforce_eager to false if they want more performance.
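The "stable by default, faster on request" behavior can be sketched as a default that a user override flips. This is a minimal illustration, not the project's config loader: the `generator.enforce_eager` key path, the `DEFAULTS` dict, and the `apply_override` helper are all assumptions made for the example.

```python
import copy
from typing import Any, Dict

# Hypothetical stand-in for the relevant part of ppo_base_config.yaml.
# The nesting under "generator" is an assumption for illustration.
DEFAULTS: Dict[str, Any] = {"generator": {"enforce_eager": True}}


def apply_override(config: Dict[str, Any], override: str) -> Dict[str, Any]:
    """Return a copy of `config` with one `dotted.key=value` override applied."""
    key_path, _, raw_value = override.partition("=")
    # Map the strings "true"/"false" to booleans; leave anything else as text.
    value: Any = {"true": True, "false": False}.get(raw_value.strip().lower(), raw_value)

    updated = copy.deepcopy(config)  # never mutate the shared defaults
    node = updated
    *parents, leaf = key_path.strip().split(".")
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value
    return updated


# Opting back into CUDA graphs for a single run.
fast_config = apply_override(DEFAULTS, "generator.enforce_eager=false")
```

With this shape, a run that passes no override keeps eager execution, while the defaults themselves are left untouched by any individual run.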

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>

SumanthRH · 2025-10-24T23:49:06Z

The only example scripts that explicitly set enforce_eager=false are the FlashRL scripts, which IMO are fine to leave as is because of TIS.
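For context, TIS here presumably refers to truncated importance sampling, which reweights each token by the ratio of train to rollout probability and caps that ratio, so a logprob mismatch is corrected rather than ignored. The sketch below assumes that reading; `tis_weights` and the default cap of 2.0 are illustrative choices, not values taken from the FlashRL scripts.

```python
import math
from typing import List, Sequence


def tis_weights(
    train_logprobs: Sequence[float],
    rollout_logprobs: Sequence[float],
    cap: float = 2.0,  # illustrative truncation threshold (assumption)
) -> List[float]:
    """Per-token truncated importance weights: min(pi_train / pi_rollout, cap)."""
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("logprob sequences must have the same length")
    return [
        # exp(logp_train - logp_rollout) is the probability ratio;
        # truncating it at `cap` bounds the variance of the correction.
        min(math.exp(t - r), cap)
        for t, r in zip(train_logprobs, rollout_logprobs)
    ]
```

When the two policies agree exactly, every weight is 1 and the correction has no effect; the cap only matters for tokens the trainer finds much more likely than the rollout engine did.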

gemini-code-assist

Code Review

This pull request changes the default configuration to enforce eager execution by setting enforce_eager: true in ppo_base_config.yaml. This is a sensible change to prioritize training stability over performance by default, as explained in the description. My review includes a suggestion to add a comment to the configuration file to document this trade-off for future reference. Additionally, I noticed that the tests in tests/gpu/gpu_ci/test_engine_generation.py use a hardcoded configuration for creating inference engines. It would be beneficial to refactor the test setup to use the configuration from the YAML file to ensure tests remain aligned with the application's behavior, improving maintainability.

skyrl-train/skyrl_train/config/ppo_base_config.yaml


Enforce eager by default

f37c85c


gemini-code-assist bot reviewed Oct 24, 2025


x

27ee6ad


SumanthRH assigned tyler-griggs Oct 25, 2025

SumanthRH requested a review from tyler-griggs October 25, 2025 00:02

SumanthRH merged commit 5157c5f into NovaSky-AI:main Oct 29, 2025
3 checks passed

SamD770 mentioned this pull request Oct 31, 2025

[skyrl-train] New enforce_eager: true default makes LoRA generation 2x slower #607

Open

SumanthRH merged 2 commits into NovaSky-AI:main from SumanthRH:enforce-eager
