[agent] Migrate skyrl-agent to use the new skyrl package#1235
[agent] Migrate skyrl-agent to use the new skyrl package#1235
skyrl-agent to use the new skyrl package#1235Conversation
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request successfully migrates skyrl-agent from the deprecated skyrl-train package to the new skyrl package. The changes include updating dependency configurations, shell script arguments, and Python import paths and API usage. The modifications are logical and align with the migration goal. My review includes suggestions to improve dependency management for better clarity and reproducibility.
| # TODO (sumanthrh): Improve dependency management here. We should use the skyrl package from the local version of the repo. | ||
| skyrl = { git = "https://github.com/NovaSky-AI/SkyRL", rev = "main" } |
There was a problem hiding this comment.
Pinning the skyrl dependency to the main branch is not recommended as it can lead to non-reproducible builds. Any new commit to main could introduce breaking changes. It's much safer to pin to a specific commit hash or a tag.
The author's TODO comment acknowledges this, and this comment serves to emphasize the importance of addressing it for production stability. Please replace main with a specific commit hash or tag.
| # TODO (sumanthrh): Improve dependency management here. We should use the skyrl package from the local version of the repo. | |
| skyrl = { git = "https://github.com/NovaSky-AI/SkyRL", rev = "main" } | |
| # TODO (sumanthrh): Improve dependency management here. We should use the skyrl package from the local version of the repo. | |
| skyrl = { git = "https://github.com/NovaSky-AI/SkyRL", rev = "<SPECIFIC_COMMIT_OR_TAG>" } |
|
|
||
| [project.optional-dependencies] | ||
| skyrl-train = ["skyrl-train[vllm]", 'torchdata'] | ||
| skyrl-train = ["skyrl[fsdp]", 'torchdata'] |
There was a problem hiding this comment.
For clarity and consistency, it would be better to rename this optional dependency extra from skyrl-train to skyrl, since it now installs the skyrl package. This will make the project's dependencies easier to understand.
You would also need to update the conflicts section under [tool.uv] to use { extra = "skyrl" }.
| skyrl-train = ["skyrl[fsdp]", 'torchdata'] | |
| skyrl = ["skyrl[fsdp]", 'torchdata'] |
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
| generator_input, uids = prepare_generator_input( | ||
| rand_prompts, | ||
| self.cfg.generator.n_samples_per_prompt, | ||
| get_sampling_params_for_backend(self.cfg.generator.backend, self.cfg.generator.sampling_params), | ||
| get_sampling_params_for_backend( | ||
| self.cfg.generator.inference_engine.backend, self.cfg.generator.sampling_params | ||
| ), | ||
| self.cfg.environment.env_class, | ||
| "train", | ||
| self.global_step, |
There was a problem hiding this comment.
🔴 asyncio.run() called inside async def train() crashes with RuntimeError
The SkyRLAgentPPOTrainer.train() method was changed from a sync def train(self) to async def train(self) (line 282), matching the parent class signature. However, the method body still uses asyncio.run() in many places (lines 301, 307, 312, 342, 353, 421, 427, 438, 458) instead of await.
Root Cause and Impact
Since BasePPOExp.run() at skyrl/train/entrypoints/main_base.py:424 invokes the training loop via asyncio.run(trainer.train()), there is already a running event loop when train() executes. Calling asyncio.run() from within a running event loop raises:
RuntimeError: asyncio.run() cannot be called from a running event loop
The parent class RayPPOTrainer.train() at skyrl/train/trainer.py:173 correctly uses await throughout (e.g., await self.dispatch.save_weights_for_sampler(), await self.eval(), await self.generate(...), await self.inference_engine_client.sleep()). When this method was migrated to be async, the asyncio.run(...) calls should have been converted to await as well.
Impact: The entire training loop will crash on the first asyncio.run() call at line 301, making training completely non-functional.
(Refers to lines 282-342)
Prompt for agents
In skyrl-agent/skyrl_agent/integrations/skyrl_train/trainer.py, the train() method (line 282) is declared async but uses asyncio.run() throughout instead of await. All asyncio.run() calls inside the async def train() method need to be replaced with await:
1. Line 301: asyncio.run(self.inference_engine_client.wake_up(tags=["weights"])) -> await self.inference_engine_client.wake_up(tags=["weights"])
2. Line 307: asyncio.run(self.inference_engine_client.wake_up(tags=["kv_cache"])) -> await self.inference_engine_client.wake_up(tags=["kv_cache"])
3. Line 312: eval_metrics = asyncio.run(self.eval()) -> eval_metrics = await self.eval()
4. Line 342: generator_output = asyncio.run(self.generate(generator_input)) -> generator_output = await self.generate(generator_input)
5. Line 353: asyncio.run(self.inference_engine_client.sleep()) -> await self.inference_engine_client.sleep()
6. Line 421: asyncio.run(self.inference_engine_client.wake_up(tags=["weights"])) -> await self.inference_engine_client.wake_up(tags=["weights"])
7. Line 427: asyncio.run(self.inference_engine_client.wake_up(tags=["kv_cache"])) -> await self.inference_engine_client.wake_up(tags=["kv_cache"])
8. Line 438: eval_metrics = asyncio.run(self.eval()) -> eval_metrics = await self.eval()
9. Line 458: asyncio.run(self.inference_engine_client.sleep()) -> await self.inference_engine_client.sleep()
This matches the pattern used in the parent class RayPPOTrainer.train() in skyrl/train/trainer.py.
Was this helpful? React with 👍 or 👎 to provide feedback.
# What does this PR do? Fixes `SkyRLAgentPPOTrainer` after #1235 . Previously the `SkyRLAgentPPOTrainer.train` was a sync function, even though we switched to making the base class's method `RayPPOTrainer.train` async in #868 . Training still progressed as usual but it would have errored out at the end of training when the return value would be evaluated by `asyncio.run(...)` This PR is a follow-up to #1235 to transition the `SkyRLAgentPPOTrainer.train` to an async function. <!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/novasky-ai/skyrl/pull/1237" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open with Devin"> </picture> </a> <!-- devin-review-badge-end --> --------- Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
What does this PR do?
Migrates skyrl-agent to use the new
skyrlpackage for the skyrl inference backend. It was still using the deprecated skyrl-train package.Note that dependency management is a bit more messy now, given that skyrl-agent depends on skyrl and skyrl is a root-level package. The current hack I have is to simply pull the latest skyrl package from Git, but obviously we should move towards using the local version of the repo (and the editable installation should be compatible with the uv + ray integration)