55 changes: 55 additions & 0 deletions skyrl-train/docs/getting-started/overview.rst
@@ -0,0 +1,55 @@
SkyRL System Overview
=====================

SkyRL breaks the RL stack into modular components and provides public APIs for each of them.

Specifically, as shown in the figure below, SkyRL separates training into two major components, **Trainer** and **Generator**. The Generator is further divided into **InferenceEngine** and **Environment**, and a single **Controller** manages the setup and execution of each component.

.. figure:: images/system-overview.png
   :alt: SkyRL System Overview
   :align: center
   :width: 80%

The components' responsibilities are as follows:

Trainer
~~~~~~~
Performs the optimization steps of the configured RL algorithm, updating model parameters based on the generated trajectories and their assigned rewards.

- `Trainer Worker interface <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/workers/worker.py#L162>`_

What do you think about adding the Trainer class here?


Ok that is the controller. Got it.

Isn't the main abstraction the ActorGroup and not the Worker? Can we add ActorGroup here?

- `FSDP Worker <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/workers/fsdp/fsdp_worker.py>`_
- `DeepSpeed Worker <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/workers/deepspeed/deepspeed_worker.py>`_
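
To make "updates model parameters based on generated trajectories and their assigned rewards" concrete, here is a minimal sketch of one policy-gradient update. It assumes a REINFORCE-style rule with a mean-reward baseline on a tiny softmax policy; this algorithm choice and the names ``softmax`` and ``reinforce_step`` are illustrative assumptions, not SkyRL's worker API or its configured algorithm.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, batch, lr=0.5):
    """One hypothetical update step.

    logits: parameters of a toy softmax policy over discrete actions.
    batch:  list of (action_index, reward) pairs from generated trajectories.
    """
    probs = softmax(logits)
    # Mean reward as a baseline (an assumption made for this sketch).
    baseline = sum(reward for _, reward in batch) / len(batch)
    grads = [0.0] * len(logits)
    for action, reward in batch:
        advantage = reward - baseline
        for k in range(len(logits)):
            indicator = 1.0 if k == action else 0.0
            # Gradient of log pi(action) w.r.t. logit k is (indicator - probs[k]).
            grads[k] += advantage * (indicator - probs[k]) / len(batch)
    # Gradient ascent on expected reward.
    return [x + lr * g for x, g in zip(logits, grads)]


new_logits = reinforce_step([0.0, 0.0], [(0, 1.0), (1, 0.0)])
```

In this toy batch, action 0 earned the higher reward, so its logit rises relative to action 1.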

Generator
~~~~~~~~~
Generates complete trajectories and computes their rewards. The Generator encompasses both the InferenceEngine (to get model completions) and the Environment (to execute actions), as well as custom agentic or data generation logic built around model inference, such as context management, sampling methods, or tree search.

- `Base Generator interface <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/generators/base.py>`_
- `Generator built for SkyRL-Gym <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/generators/skyrl_gym_generator.py>`_
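
As a rough illustration of how a Generator ties the InferenceEngine and Environment together, the sketch below rolls out one multi-turn episode and records actions and rewards. ``Trajectory``, ``ToyEngine``, ``ToyEnv``, and ``generate_trajectory`` are hypothetical stand-ins written for this example; the real interfaces are in the files linked above.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """Hypothetical container for one episode's actions and rewards."""
    prompt: str
    actions: list = field(default_factory=list)
    rewards: list = field(default_factory=list)


class ToyEngine:
    """Stand-in for an InferenceEngine: maps a context string to an action."""

    def generate(self, context: str) -> str:
        return "guess:" + str(len(context) % 3)


class ToyEnv:
    """Stand-in for an Environment: rewards 'guess:1' and ends after two turns."""

    def __init__(self):
        self.turns = 0

    def step(self, action: str):
        self.turns += 1
        reward = 1.0 if action == "guess:1" else 0.0
        done = self.turns >= 2
        observation = f"turn {self.turns} reward {reward}"
        return observation, reward, done


def generate_trajectory(engine, env, prompt: str, max_turns: int = 8) -> Trajectory:
    """Alternate between model inference and environment execution."""
    traj = Trajectory(prompt=prompt)
    context = prompt
    for _ in range(max_turns):
        action = engine.generate(context)              # model completion
        observation, reward, done = env.step(action)   # execute the action
        traj.actions.append(action)
        traj.rewards.append(reward)
        # Simple context management: append the latest turn to the context.
        context = context + "\n" + action + "\n" + observation
        if done:
            break
    return traj


traj = generate_trajectory(ToyEngine(), ToyEnv(), "abcd")
```

Custom logic such as context truncation or tree search would replace the single line that extends ``context`` in this sketch.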

InferenceEngine
~~~~~~~~~~~~~~~
Executes inference on the policy model to produce model outputs (i.e., the RL agent's actions). Typically, multiple InferenceEngines are deployed to process prompts in parallel.

- `Base InferenceEngine interface <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/inference_engines/base.py>`_
- `InferenceEngine client to manage multiple engines <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/inference_engines/inference_engine_client.py>`_
- `vLLM backend <https://github.com/NovaSky-AI/SkyRL/tree/main/skyrl-train/skyrl_train/inference_engines/vllm>`_
- `SGLang backend <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-train/skyrl_train/inference_engines/sglang/sglang_server.py>`_
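
One way to picture several engines processing prompts in parallel is a client that shards a batch round-robin and restores the original order afterwards. This is a sketch under that assumption only: ``EchoEngine`` and ``generate_parallel`` are hypothetical names, and the actual routing strategy of the InferenceEngine client linked above may differ.

```python
from concurrent.futures import ThreadPoolExecutor


class EchoEngine:
    """Hypothetical engine that tags each prompt with its own name."""

    def __init__(self, name: str):
        self.name = name

    def generate(self, prompts):
        return [f"{self.name}:{p}" for p in prompts]


def generate_parallel(engines, prompts):
    """Shard prompts round-robin across engines and run the shards concurrently."""
    n = len(engines)
    # Engine i receives prompts i, i + n, i + 2n, ...
    shards = [prompts[i::n] for i in range(n)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(
            pool.map(lambda pair: pair[0].generate(pair[1]), zip(engines, shards))
        )
    # Write each shard's outputs back to its original positions.
    outputs = [None] * len(prompts)
    for i, shard_outputs in enumerate(results):
        outputs[i::n] = shard_outputs
    return outputs


outputs = generate_parallel([EchoEngine("e0"), EchoEngine("e1")], ["a", "b", "c"])
```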


Environment
~~~~~~~~~~~
Presents a task for the policy model to solve, and provides the logic for executing the policy's actions (i.e., model outputs) and computing the resulting observations and rewards.

- `Base Environment interface <https://github.com/NovaSky-AI/SkyRL/blob/main/skyrl-gym/skyrl_gym/core.py>`_
- `SkyRL-Gym <https://github.com/NovaSky-AI/SkyRL/tree/main/skyrl-gym>`_, our ready-built library of tool-use environments

  - `Example environments <https://github.com/NovaSky-AI/SkyRL/tree/main/skyrl-gym/skyrl_gym/envs>`_
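
The description above (present a task, execute the policy's actions, return observations and rewards) can be sketched with a toy environment. ``AdditionEnv`` and its ``reset``/``step`` methods are hypothetical and written only for this example; they do not subclass or reproduce the SkyRL-Gym base interface linked above.

```python
class AdditionEnv:
    """Hypothetical single-task environment: the model must answer a + b."""

    def __init__(self, a: int, b: int, max_turns: int = 3):
        self.a = a
        self.b = b
        self.max_turns = max_turns
        self.turns = 0

    def reset(self) -> str:
        """Present the task to the policy as an initial prompt."""
        self.turns = 0
        return f"What is {self.a} + {self.b}?"

    def step(self, action: str) -> dict:
        """Execute one model output and compute observation, reward, and done."""
        self.turns += 1
        correct = action.strip() == str(self.a + self.b)
        done = correct or self.turns >= self.max_turns
        observation = "Correct." if correct else "Incorrect, try again."
        return {
            "observation": observation,
            "reward": 1.0 if correct else 0.0,
            "done": done,
        }


env = AdditionEnv(2, 3)
prompt = env.reset()
first = env.step("4")
second = env.step("5")
```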


Controller
~~~~~~~~~~
Manages physical placement, initialization, and control flow of training execution for each of the above components.

- The training control loop currently sits in `trainer.py <https://github.com/NovaSky-AI/SkyRL/blob/5a82809e218b2e0c3dd431377fb672e35ecc4a84/skyrl-train/skyrl_train/trainer.py#L194>`_
- Moving the control loop into a separate component, for even greater flexibility, is a work in progress.
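
For intuition about the control flow the Controller manages, the following sketch alternates generation, a training step, and a weight sync back to the generator. It is not the loop in ``trainer.py``: ``StubGenerator``, ``StubTrainer``, and ``control_loop`` are hypothetical, and placement and initialization of components are left out entirely.

```python
class StubGenerator:
    """Hypothetical generator that records which policy version it sampled from."""

    def __init__(self):
        self.policy_version = 0

    def generate(self, prompts):
        return [
            {
                "prompt": p,
                "reward": float(len(p) % 2),  # placeholder reward
                "policy_version": self.policy_version,
            }
            for p in prompts
        ]

    def load_weights(self, version: int):
        self.policy_version = version


class StubTrainer:
    """Hypothetical trainer: each step bumps the version and reports mean reward."""

    def __init__(self):
        self.version = 0

    def train_step(self, trajectories):
        self.version += 1
        mean_reward = sum(t["reward"] for t in trajectories) / len(trajectories)
        return {"version": self.version, "mean_reward": mean_reward}


def control_loop(generator, trainer, prompts, num_steps: int):
    """Generate, train, then sync updated weights to the generator each step."""
    logs = []
    for _ in range(num_steps):
        trajectories = generator.generate(prompts)
        metrics = trainer.train_step(trajectories)
        generator.load_weights(trainer.version)  # keep sampling on-policy
        logs.append(metrics)
    return logs


generator = StubGenerator()
logs = control_loop(generator, StubTrainer(), ["ab", "abc"], num_steps=3)
```
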
1 change: 1 addition & 0 deletions skyrl-train/docs/index.rst
@@ -10,6 +10,7 @@ SkyRL is a full-stack RL library designed for modularity and extensibility.

   getting-started/installation
   getting-started/quickstart
   getting-started/overview

.. toctree::
   :maxdepth: 2