Skip to content

[train] Pythonic Configs 1/N - Introduce configuration dataclasses for the current YAML and migrate tests#1001

Merged
SumanthRH merged 31 commits intomainfrom
pythonic-configs-v0
Feb 2, 2026
Merged

[train] Pythonic Configs 1/N - Introduce configuration dataclasses for the current YAML and migrate tests#1001
SumanthRH merged 31 commits intomainfrom
pythonic-configs-v0

Conversation

@SumanthRH
Copy link
Member

What does this PR do?

1/2 PRs for introducing pythonic configurations in SkyRL-train.

A short design doc is here: https://docs.google.com/document/d/1YzLPCmVZpBauRyh4l-17AUhXiXALmOX-N1gXZkA9slA

This is going to be lightweight in terms of configuration hierarchy changes to keep user experience similar.

Summary:

  • Introduce dataclasses corresponding to the current configuration hierarchy - PolicyConfig, ModelConfig, AlgorithmConfig, PlacementConfig, etc.
  • Preserves backwards compatibility - internal config structure matches YAML 1:1
  • Entrypoint scripts continue to use generic DictConfig - only migrate tests to new dataclasses.
  • Existing user scripts don’t break

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH marked this pull request as ready for review January 31, 2026 20:47
@SumanthRH
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces typed dataclasses for configuration, moving away from generic DictConfig objects, which is a significant step towards improving type safety, code clarity, and maintainability. However, it introduces a critical security regression: sensitive information from environment variables (such as API keys) may now be leaked into logs. This occurs because the new configuration system resolves interpolations upfront and logs these resolved values, whereas the previous implementation logged the unresolved configuration. Additionally, a potential bug in fsdp_utils.py and areas for future configuration management improvements have been identified.

Comment on lines +28 to +31
def from_dict_config(cls, cfg: DictConfig) -> "BaseConfig":
"""Construct a typed BaseConfig from a Hydra DictConfig."""
raw = OmegaConf.to_container(cfg, resolve=True)
return build_nested_dataclass(cls, raw)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The from_dict_config method resolves configuration interpolations by calling OmegaConf.to_container(cfg, resolve=True). This means that any sensitive information stored in environment variables and referenced in the configuration (e.g., ${env:WANDB_API_KEY}) will be expanded to plaintext when the configuration dataclass is initialized. If these dataclasses are subsequently logged or serialized, the sensitive information will be leaked.

Recommendation: Consider delaying resolution until the values are actually needed for execution, or ensure that sensitive fields are masked before logging.

Comment on lines +131 to 133
def get_cfg_as_str(cfg: Union[SkyRLConfig, DictConfig]) -> str:
return pprint.pformat(get_config_as_dict(cfg))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The get_cfg_as_str method now logs the resolved configuration by using get_config_as_dict(cfg), which contains resolved interpolations. This is a regression from the previous implementation using OmegaConf.to_yaml(dict_cfg), which did not resolve interpolations by default. This change leads to sensitive information (like API keys) being written to logs in plaintext.

Recommendation: Use a non-resolving method for logging the configuration, or implement masking for sensitive keys.

Comment on lines +377 to +380
@dataclass
class GSM8kLLMJudgeEnvConfig(BaseConfig):
model: str = "gpt-4o-mini"
base_url: Optional[str] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

I see that GSM8kLLMJudgeEnvConfig is redefined here, and the comment explains why. While this might be a necessary workaround for now to avoid circular dependencies with the examples directory, it introduces code duplication which can be a maintenance concern. The definition in skyrl-train/examples/llm_as_a_judge/llm_judge_env.py also doesn't inherit from BaseConfig. It would be good to consolidate these into a single definition in a shared location in a future refactoring, perhaps by moving the example environment into the core library if it's meant to be generally available.

Comment on lines +755 to +764
# Create a copy of the config and apply overrides
loss_config = OmegaConf.merge(loss_config, OmegaConf.create(loss_fn_config))
# NOTE (sumanthrh): This doesn't work like OmegaConf for nested dataclasses - only top level fields are merged
# TODO: Fix nested overrides
if isinstance(loss_config, DictConfig):
loss_config = OmegaConf.merge(loss_config, OmegaConf.create(loss_fn_config))
else:
assert isinstance(loss_config, AlgorithmConfig)
new_loss_config_as_dict = asdict(loss_config)
new_loss_config_as_dict.update(loss_fn_config)
loss_config = AlgorithmConfig(**new_loss_config_as_dict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The comment here correctly points out that the current implementation for updating the dataclass config performs a shallow merge. This could lead to unexpected behavior if a user tries to override a nested configuration object (e.g., kl_ctrl), as the entire nested object would be replaced rather than merged.

To address the TODO, you might consider implementing a recursive update function for dataclasses to mimic the deep merge behavior of OmegaConf.merge. This would make the configuration overrides more intuitive and robust.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and valuable refactoring by defining typed dataclasses for configurations, moving away from DictConfig. This greatly improves code clarity, maintainability, and developer experience by enabling static analysis and auto-completion. The strategy of preserving backward compatibility and migrating tests first is excellent. The implementation of the configuration dataclasses and the conversion logic is well-executed. I've identified a few minor areas for improvement, mainly around ensuring consistent handling of both DictConfig and the new dataclasses during the transition period to prevent potential runtime errors. Overall, this is a high-quality contribution that modernizes the codebase.

Copy link
Member

@tyler-griggs tyler-griggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and valuable refactoring by replacing YAML-based DictConfig with typed Python dataclasses for configuration. This greatly improves type safety, code clarity, and developer experience. The migration is handled thoughtfully, with backward compatibility in mind by using Union types during the transition. The changes are consistently applied across the codebase, including core logic, examples, and tests. The addition of unit tests for the new config system is also a great touch.

I've found a few areas for improvement, mainly related to potential issues with config merging and a code smell due to a duplicated config definition. Please see my detailed comments.

x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH merged commit d1e1b29 into main Feb 2, 2026
1 check failed
@SumanthRH SumanthRH deleted the pythonic-configs-v0 branch February 26, 2026 23:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants