[cleanup] Make agent_loop output a dataclass #194

Merged
tyler-griggs merged 2 commits into NovaSky-AI:main from tyler-griggs:agent_output
Aug 25, 2025

@tyler-griggs
Member

The tuple return was getting out of control

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the agent_loop method to return a dataclass instead of a long tuple, which significantly improves code readability and maintainability. The changes are well-implemented, and the tests have been updated accordingly. I have one suggestion to make the new dataclass immutable for better robustness.

from skyrl_train.generators.utils import get_custom_chat_template, get_generation_prompt_ids, apply_overlong_filtering


@dataclass

Contributor

medium

Consider making the AgentLoopOutput dataclass frozen. Since it's used as a data transfer object for return values, making it immutable would prevent accidental modifications and make the program's state more predictable and robust.

Suggested change
- @dataclass
+ @dataclass(frozen=True)
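
To illustrate what the frozen suggestion above would change in practice, here is a minimal sketch. `FrozenOutput` and its two fields are hypothetical stand-ins, since the actual field list of AgentLoopOutput is not shown in this thread.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import List


# Hypothetical stand-in for AgentLoopOutput; the real fields are not shown in this thread.
@dataclass(frozen=True)
class FrozenOutput:
    response_ids: List[int]
    reward: float


out = FrozenOutput(response_ids=[1, 2, 3], reward=0.5)

# Rebinding an attribute on a frozen instance raises FrozenInstanceError.
try:
    out.reward = 1.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# replace() builds a modified copy and leaves the original instance untouched.
updated = replace(out, reward=1.0)

# Caveat: frozen only blocks attribute rebinding; a mutable field can still change in place.
out.response_ids.append(4)
```

One consequence worth noting: any caller that currently post-processes the output by assigning to its fields would need to switch to `replace()` or build a new instance.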

Member

@CharlieFRuan CharlieFRuan left a comment

Thank you, much cleaner! Only one nit

from skyrl_train.generators.utils import get_custom_chat_template, get_generation_prompt_ids, apply_overlong_filtering


@dataclass

Member

nit: InferenceEngineInput/Output and GeneratorInput/Output are all TypedDicts; perhaps we can make AgentLoopOutput a TypedDict as well to stay uniform? Functionally there is not much difference, though.

Member Author

Yeah, what do you think is better here? I actually like the dataclass more because of the automatic runtime checks. I believe we originally went with TypedDict for the other interface definitions because dict-like structures are generally easier to deal with at API boundaries (e.g., because they're easier to serialize to JSON), so I thought that'd be helpful as we separate the major components (generator, trainer). But admittedly, beyond that, I don't remember all the trade-offs we considered when we initially went with TypedDict.
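
The serialization point above can be sketched as follows. Both type names and their fields are hypothetical and chosen only for illustration; they are not the project's actual definitions.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, TypedDict


# Hypothetical fields for illustration only.
@dataclass
class OutputAsDataclass:
    response_ids: List[int]
    reward: float


class OutputAsTypedDict(TypedDict):
    response_ids: List[int]
    reward: float


as_dc = OutputAsDataclass(response_ids=[1, 2], reward=0.5)
as_td: OutputAsTypedDict = {"response_ids": [1, 2], "reward": 0.5}

# A TypedDict value is a plain dict at runtime, so json.dumps accepts it directly.
td_json = json.dumps(as_td)

# A dataclass instance needs one conversion step before it can be dumped.
dc_json = json.dumps(asdict(as_dc))
```

The extra `asdict()` call is small, so the difference mostly matters when many values cross a process or network boundary.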

Member

Yeah that's fair! @gemini-code-assist what's your take

Contributor

That's a great question, @CharlieFRuan, and a good discussion point between you and @tyler-griggs. Both TypedDict and dataclass are excellent tools in Python for defining structured data with type hints, but they serve slightly different primary purposes.

  • TypedDict: This is primarily for static type checking (e.g., with mypy) to ensure that dictionaries conform to a specific schema. As @tyler-griggs mentioned, they are very convenient for representing data that will be serialized to/from dictionary-like formats, such as JSON, especially at API boundaries. They don't inherently provide runtime validation.

  • dataclass: These are designed to create lightweight data objects that behave like regular classes. They automatically generate methods like __init__, __repr__, and __eq__, making them very convenient for defining data structures. The key benefit here, as @tyler-griggs pointed out, is the improved readability and attribute-based access (obj.field instead of obj['field']). While dataclass itself doesn't provide runtime validation out of the box (unless combined with libraries like pydantic), the structured nature and attribute access can make it easier to reason about the data at runtime and catch issues earlier during development.
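
A short sketch of the runtime behavior described in the two bullets above, using a hypothetical two-field schema. It shows that a TypedDict performs no checks at runtime, while a dataclass rejects a missing field in its generated `__init__` but does not check value types.

```python
from dataclasses import dataclass
from typing import TypedDict


# Hypothetical two-field schema for illustration.
class RewardDict(TypedDict):
    reward: float
    stop_reason: str


@dataclass
class RewardData:
    reward: float
    stop_reason: str


# TypedDict: a missing key is only flagged by a static checker, never at runtime.
incomplete: RewardDict = {"reward": 0.5}  # type: ignore[typeddict-item]
is_plain_dict = type(incomplete) is dict

# dataclass: the generated __init__ rejects a missing field at runtime...
try:
    RewardData(reward=0.5)  # type: ignore[call-arg]
    missing_field_rejected = False
except TypeError:
    missing_field_rejected = True

# ...but it does not check the annotated types of the values it receives.
wrong_type = RewardData(reward="not a float", stop_reason="stop")  # type: ignore[arg-type]
```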

For AgentLoopOutput, which is an internal return value from a function, using a dataclass significantly improves the readability and maintainability of the code compared to a raw tuple. It allows you to access elements by name (e.g., output.reward) rather than by index (e.g., output[1]), which is less error-prone and self-documenting. While TypedDict offers uniformity with other Input/Output types that might be closer to API boundaries, the benefits of named attribute access and the class-like behavior of dataclass often make it a more suitable choice for internal structured data that isn't primarily being used for direct dictionary serialization.
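
The index-versus-name contrast in the paragraph above could look like this. The function names and fields are hypothetical; only `output.reward` and `output[1]` come from the comment itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


def loop_as_tuple() -> Tuple[List[int], float, str]:
    # The caller has to remember that index 1 holds the reward.
    return [1, 2, 3], 0.5, "stop"


# Hypothetical fields; the real AgentLoopOutput definition is not shown here.
@dataclass
class LoopOutput:
    response_ids: List[int]
    reward: float
    stop_reason: str


def loop_as_dataclass() -> LoopOutput:
    return LoopOutput(response_ids=[1, 2, 3], reward=0.5, stop_reason="stop")


reward_by_index = loop_as_tuple()[1]
reward_by_name = loop_as_dataclass().reward
```

With the dataclass, adding a field later does not shift the positions that existing callers rely on.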

So, in this context, the choice of dataclass for AgentLoopOutput seems well-justified for enhancing code clarity and maintainability.

Member

Makes sense, let's go with dataclass!

@tyler-griggs tyler-griggs merged commit 7b1d176 into NovaSky-AI:main Aug 25, 2025
3 checks passed
dzorlu referenced this pull request in fleet-ai/SkyRL Feb 4, 2026
The tuple return was getting out of control
