[train] Pythonic Configs 1/N - Introduce configuration dataclasses for the current YAML and migrate tests by SumanthRH · Pull Request #1001 · NovaSky-AI/SkyRL

SumanthRH · 2026-01-31T20:17:50Z

What does this PR do?

1/2 PRs for introducing pythonic configurations in SkyRL-train.

A short design doc is here: https://docs.google.com/document/d/1YzLPCmVZpBauRyh4l-17AUhXiXALmOX-N1gXZkA9slA

The changes to the configuration hierarchy are intentionally lightweight so that the user experience stays similar.

Summary:

Introduce dataclasses corresponding to the current configuration hierarchy: PolicyConfig, ModelConfig, AlgorithmConfig, PlacementConfig, etc.
Preserve backward compatibility: the internal config structure matches the YAML 1:1.
Entrypoint scripts continue to use the generic DictConfig; only the tests are migrated to the new dataclasses.
Existing user scripts don't break.
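
As a rough illustration of the approach summarized above, here is a minimal sketch of dataclasses that mirror a nested YAML layout 1:1, together with a helper that builds them from a plain dict. The class names PolicyConfig and ModelConfig come from this PR, but every field name, every default value, and the from_dict helper are assumptions made for illustration; the actual SkyRL definitions may differ.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Mapping, Type, TypeVar, get_type_hints

T = TypeVar("T")


@dataclass
class ModelConfig:
    path: str = "my-org/my-model"  # hypothetical field and default


@dataclass
class PolicyConfig:
    model: ModelConfig = field(default_factory=ModelConfig)
    learning_rate: float = 1e-6  # hypothetical field and default


def from_dict(cls: Type[T], data: Mapping[str, Any]) -> T:
    """Build dataclass `cls` from a nested mapping (hypothetical helper)."""
    known = {f.name for f in fields(cls)}
    unknown = set(data) - known
    if unknown:
        # Fail loudly on typos instead of silently dropping keys.
        raise KeyError(f"Unknown keys for {cls.__name__}: {sorted(unknown)}")
    hints = get_type_hints(cls)
    kwargs = {}
    for name, value in data.items():
        target = hints[name]
        # Recurse when the field is itself a dataclass and the value is a mapping.
        if is_dataclass(target) and isinstance(value, Mapping):
            value = from_dict(target, value)
        kwargs[name] = value
    return cls(**kwargs)


# A dict shaped like the parsed YAML; omitted keys fall back to defaults.
yaml_like = {"model": {"path": "some/model"}, "learning_rate": 1e-5}
policy = from_dict(PolicyConfig, yaml_like)
print(policy.model.path)
```

Because the dataclass tree has the same shape as the YAML, attribute access such as policy.model.path lines up with the existing dotted keys.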

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

SumanthRH · 2026-01-31T20:47:55Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces typed dataclasses for configuration, moving away from generic DictConfig objects, which is a significant step towards improving type safety, code clarity, and maintainability. However, it introduces a critical security regression: sensitive information from environment variables (such as API keys) may now be leaked into logs. This occurs because the new configuration system resolves interpolations upfront and logs these resolved values, whereas the previous implementation logged the unresolved configuration. Additionally, a potential bug in fsdp_utils.py and areas for future configuration management improvements have been identified.

skyrl-train/skyrl_train/distributed/fsdp_utils.py

gemini-code-assist · 2026-01-31T20:51:54Z

skyrl-train/skyrl_train/config/config.py

+    def from_dict_config(cls, cfg: DictConfig) -> "BaseConfig":
+        """Construct a typed BaseConfig from a Hydra DictConfig."""
+        raw = OmegaConf.to_container(cfg, resolve=True)
+        return build_nested_dataclass(cls, raw)


The from_dict_config method resolves configuration interpolations by calling OmegaConf.to_container(cfg, resolve=True). This means that any sensitive information stored in environment variables and referenced in the configuration (e.g., ${env:WANDB_API_KEY}) will be expanded to plaintext when the configuration dataclass is initialized. If these dataclasses are subsequently logged or serialized, the sensitive information will be leaked.

Recommendation: Consider delaying resolution until the values are actually needed for execution, or ensure that sensitive fields are masked before logging.
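
One way to act on the masking half of this recommendation, sketched with the standard library only: mark the sensitive field with field(repr=False) so that printing or logging the dataclass omits it. TrackingConfig and its fields are hypothetical names, not classes from this PR.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class TrackingConfig:  # hypothetical config class
    project: str = "my-project"
    # repr=False keeps the value out of repr()/str() of the dataclass.
    api_key: str = field(default="", repr=False)


cfg = TrackingConfig(api_key="secret-value")
shown = repr(cfg)        # does not contain the key
as_dict = asdict(cfg)    # still contains the key
print(shown)
```

Note that asdict(cfg) still returns the raw value, so this only protects repr-based logging; any code that serializes through a dict would need separate masking.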

gemini-code-assist · 2026-01-31T20:51:54Z

skyrl-train/skyrl_train/entrypoints/main_base.py

+    def get_cfg_as_str(cfg: Union[SkyRLConfig, DictConfig]) -> str:
+        return pprint.pformat(get_config_as_dict(cfg))



The get_cfg_as_str method now logs the resolved configuration by using get_config_as_dict(cfg), which contains resolved interpolations. This is a regression from the previous implementation using OmegaConf.to_yaml(dict_cfg), which did not resolve interpolations by default. This change leads to sensitive information (like API keys) being written to logs in plaintext.

Recommendation: Use a non-resolving method for logging the configuration, or implement masking for sensitive keys.
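
A minimal sketch of the masking option, assuming the config is already available as a plain nested dict (for example, the result of get_config_as_dict). The mask_sensitive helper, the key pattern, and the sample keys are assumptions for illustration and would need tuning for a real config.

```python
import pprint
import re
from typing import Any

# Assumed heuristic: mask keys that END with one of these words.
# Anchoring at the end avoids masking keys such as "max_tokens".
SENSITIVE_KEY = re.compile(r"(api_key|secret|password|token)$", re.IGNORECASE)


def mask_sensitive(obj: Any) -> Any:
    """Return a copy of a nested dict/list with sensitive values replaced."""
    if isinstance(obj, dict):
        return {
            key: "***" if isinstance(key, str) and SENSITIVE_KEY.search(key)
            else mask_sensitive(value)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [mask_sensitive(item) for item in obj]
    return obj


resolved = {
    "trainer": {"wandb_api_key": "abc123", "epochs": 3},
    "generator": {"max_tokens": 512},
}
masked = mask_sensitive(resolved)
log_text = pprint.pformat(masked)
print(log_text)
```

The original dict is left untouched, so execution can keep using the real values while only the masked copy is written to logs.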

gemini-code-assist · 2026-01-31T20:51:54Z

skyrl-train/skyrl_train/config/config.py

+@dataclass
+class GSM8kLLMJudgeEnvConfig(BaseConfig):
+    model: str = "gpt-4o-mini"
+    base_url: Optional[str] = None


I see that GSM8kLLMJudgeEnvConfig is redefined here, and the comment explains why. While this might be a necessary workaround for now to avoid circular dependencies with the examples directory, it introduces code duplication which can be a maintenance concern. The definition in skyrl-train/examples/llm_as_a_judge/llm_judge_env.py also doesn't inherit from BaseConfig. It would be good to consolidate these into a single definition in a shared location in a future refactoring, perhaps by moving the example environment into the core library if it's meant to be generally available.

gemini-code-assist · 2026-01-31T20:51:54Z

skyrl-train/skyrl_train/workers/worker.py

            # Create a copy of the config and apply overrides
-            loss_config = OmegaConf.merge(loss_config, OmegaConf.create(loss_fn_config))
+            # NOTE (sumanthrh): This doesn't work like OmegaConf for nested dataclasses - only top level fields are merged
+            # TODO: Fix nested overrides
+            if isinstance(loss_config, DictConfig):
+                loss_config = OmegaConf.merge(loss_config, OmegaConf.create(loss_fn_config))
+            else:
+                assert isinstance(loss_config, AlgorithmConfig)
+                new_loss_config_as_dict = asdict(loss_config)
+                new_loss_config_as_dict.update(loss_fn_config)
+                loss_config = AlgorithmConfig(**new_loss_config_as_dict)


The comment here correctly points out that the current implementation for updating the dataclass config performs a shallow merge. This could lead to unexpected behavior if a user tries to override a nested configuration object (e.g., kl_ctrl), as the entire nested object would be replaced rather than merged.

To address the TODO, you might consider implementing a recursive update function for dataclasses to mimic the deep merge behavior of OmegaConf.merge. This would make the configuration overrides more intuitive and robust.
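
A minimal sketch of such a recursive update, using only the standard library. AlgorithmConfig and kl_ctrl are names from this PR, but the KLCtrlConfig class, all field names and defaults, and the deep_update helper are hypothetical stand-ins rather than the real SkyRL definitions.

```python
from dataclasses import dataclass, field, fields, is_dataclass, replace
from typing import Any, Mapping


@dataclass
class KLCtrlConfig:  # hypothetical nested config
    type: str = "fixed"
    kl_target: float = 0.1


@dataclass
class AlgorithmConfig:  # hypothetical stand-in for the real class
    policy_loss_type: str = "regular"
    kl_ctrl: KLCtrlConfig = field(default_factory=KLCtrlConfig)


def deep_update(cfg: Any, overrides: Mapping[str, Any]) -> Any:
    """Return a copy of dataclass `cfg` with `overrides` merged recursively."""
    valid = {f.name for f in fields(cfg)}
    changes = {}
    for key, value in overrides.items():
        if key not in valid:
            raise KeyError(f"Unknown config field: {key!r}")
        current = getattr(cfg, key)
        if is_dataclass(current) and isinstance(value, Mapping):
            # Merge into the nested dataclass instead of replacing it.
            changes[key] = deep_update(current, value)
        else:
            changes[key] = value
    # replace() builds a new instance; untouched fields are carried over.
    return replace(cfg, **changes)


base = AlgorithmConfig()
merged = deep_update(base, {"kl_ctrl": {"kl_target": 0.05}})
print(merged.kl_ctrl)
```

Only kl_target changes here; kl_ctrl.type keeps its previous value and base is not mutated. Separately, dataclasses.asdict recurses into nested dataclasses and returns plain dicts, so rebuilding with AlgorithmConfig(**asdict(cfg)) can leave nested fields as dicts unless the class converts them back, which may be worth checking in the shallow-merge branch.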

gemini-code-assist

Code Review

This pull request introduces a significant and valuable refactoring by defining typed dataclasses for configurations, moving away from DictConfig. This greatly improves code clarity, maintainability, and developer experience by enabling static analysis and auto-completion. The strategy of preserving backward compatibility and migrating tests first is excellent. The implementation of the configuration dataclasses and the conversion logic is well-executed. I've identified a few minor areas for improvement, mainly around ensuring consistent handling of both DictConfig and the new dataclasses during the transition period to prevent potential runtime errors. Overall, this is a high-quality contribution that modernizes the codebase.

skyrl-train/skyrl_train/distributed/megatron/optimizer.py

skyrl-gym/skyrl_gym/envs/aime/env.py

skyrl-train/skyrl_train/config/config.py

skyrl-gym/skyrl_gym/envs/searchcode/env.py

tyler-griggs

thanks!


SumanthRH · 2026-02-02T19:23:09Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a significant and valuable refactoring by replacing YAML-based DictConfig with typed Python dataclasses for configuration. This greatly improves type safety, code clarity, and developer experience. The migration is handled thoughtfully, with backward compatibility in mind by using Union types during the transition. The changes are consistently applied across the codebase, including core logic, examples, and tests. The addition of unit tests for the new config system is also a great touch.

I've found a few areas for improvement, mainly related to potential issues with config merging and a code smell due to a duplicated config definition. Please see my detailed comments.

skyrl-train/skyrl_train/workers/worker.py

skyrl-train/skyrl_train/config/config.py

skyrl-train/skyrl_train/distributed/fsdp_utils.py


SumanthRH added 13 commits January 30, 2026 09:37

Initial commit with new pythonic config

5d3d028

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

lint

94c61b5

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

lint

f1d5c66

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

initial commit

425a9e5

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

df8af57

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

fixes

0e1d13c

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

c31b4ed

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

4fb299a

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

add back support for dictconfig

42a9cde

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

dc37a3b

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

update type hints

0ff7835

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

de81200

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

aead932

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

vercel bot deployed to Preview January 31, 2026 20:30

x

104510f

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

vercel bot deployed to Preview January 31, 2026 20:46

x

8d947ba

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

SumanthRH marked this pull request as ready for review January 31, 2026 20:47

SumanthRH assigned tyler-griggs Jan 31, 2026

vercel bot deployed to Preview January 31, 2026 20:48

gemini-code-assist bot reviewed Jan 31, 2026

skyrl-train/skyrl_train/distributed/megatron/optimizer.py (outdated, resolved)

skyrl-gym/skyrl_gym/envs/aime/env.py (outdated, resolved)

tyler-griggs reviewed Feb 1, 2026

skyrl-train/skyrl_train/config/config.py (resolved)

tyler-griggs reviewed Feb 1, 2026

skyrl-gym/skyrl_gym/envs/searchcode/env.py (outdated, resolved)

tyler-griggs approved these changes Feb 1, 2026

x

bdf8bf9

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

vercel bot deployed to Preview February 2, 2026 09:36

SumanthRH added 2 commits February 2, 2026 09:50

Merge remote-tracking branch 'upstream/main' into pythonic-configs-v0

bcd0817

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

de0f085

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

SumanthRH added 11 commits February 2, 2026 11:26

x

bb23487

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

22fe245

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

f3dbf68

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

Merge remote-tracking branch 'upstream/main' into pythonic-configs-v0

33bae56

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

831cf10

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

f604a0f

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

546be82

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

8a3b7c8

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

a378675

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

x

40e4392

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

Merge remote-tracking branch 'upstream/main' into pythonic-configs-v0

e89d2fa

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

gemini-code-assist bot reviewed Feb 2, 2026

skyrl-train/skyrl_train/workers/worker.py (outdated, resolved)

skyrl-train/skyrl_train/config/config.py (resolved)

skyrl-train/skyrl_train/distributed/fsdp_utils.py (resolved)

SumanthRH added 2 commits February 2, 2026 19:31

x

f13ae42

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

fix megatron test

dd597b6

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>

SumanthRH merged commit d1e1b29 into main Feb 2, 2026
1 check failed

SumanthRH deleted the pythonic-configs-v0 branch February 26, 2026 23:48

SumanthRH mentioned this pull request Mar 11, 2026

[AsyncRL] Use keep mode for pause and resume #1179

Merged

3 tasks

This was referenced Mar 23, 2026

Make SkyRL configuration and initialization more pythonic #748

Closed

SkyRL H1 2026 Roadmap #1391

Open

2 participants