[skyrl-train] Enforce eager by default #569
Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
The only example scripts where we explicitly set |
Code Review
This pull request changes the default configuration to enforce eager execution by setting enforce_eager: true in ppo_base_config.yaml. This is a sensible change to prioritize training stability over performance by default, as explained in the description. My review includes a suggestion to add a comment to the configuration file to document this trade-off for future reference. Additionally, I noticed that the tests in tests/gpu/gpu_ci/test_engine_generation.py use a hardcoded configuration for creating inference engines. It would be beneficial to refactor the test setup to use the configuration from the YAML file to ensure tests remain aligned with the application's behavior, improving maintainability.
# What does this PR do? Set `enforce_eager` to true by default. We've seen issues with cuda graphs affecting convergence on some training runs. This is likely drift due to train <> rollout logprobs mismatch, but we need to investigate this better. Until then, it is best to turn off cuda graphs by default. Training runs are stable by default, more performant if the user chooses so. --------- Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
What does this PR do?
Set `enforce_eager` to true by default.
We've seen issues with cuda graphs affecting convergence on some training runs. This is likely drift due to train <> rollout logprobs mismatch, but we need to investigate this better.
Until then, it is best to turn off cuda graphs by default. Training runs are stable by default, more performant if the user chooses so.
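One way to investigate the suspected train <> rollout logprobs mismatch is to compare, token by token, the log-probabilities recorded by the inference engine with those recomputed by the trainer for the same sampled tokens. The snippet below is a minimal sketch of such a check, not part of skyrl-train: the function name `logprob_mismatch` and its inputs are hypothetical, and it assumes both sequences are already aligned per token.

```python
import math
from typing import Dict, Sequence


def logprob_mismatch(
    rollout_logprobs: Sequence[float],
    train_logprobs: Sequence[float],
) -> Dict[str, float]:
    """Summarize per-token disagreement between rollout and trainer logprobs.

    Hypothetical helper: both inputs are assumed to hold the log-probability
    of the same sampled tokens, in the same order.
    """
    if len(rollout_logprobs) != len(train_logprobs):
        raise ValueError("logprob sequences must have the same length")
    if not rollout_logprobs:
        raise ValueError("logprob sequences must be non-empty")

    # Per-token difference: log pi_train(token) - log pi_rollout(token).
    diffs = [t - r for r, t in zip(rollout_logprobs, train_logprobs)]
    n = len(diffs)

    # exp(diff) is the per-token probability ratio pi_train / pi_rollout;
    # it equals 1.0 when the two engines agree exactly.
    ratios = [math.exp(d) for d in diffs]

    return {
        "mean_abs_diff": sum(abs(d) for d in diffs) / n,
        "max_abs_diff": max(abs(d) for d in diffs),
        "mean_ratio": sum(ratios) / n,
    }
```

Logging these values once with `enforce_eager` set to true and once with it set to false, on the same prompts, would indicate whether cuda graphs widen the gap. Any threshold for what counts as too large is left to the user.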