[cleanup] Make agent_loop output a dataclass by tyler-griggs · Pull Request #194 · NovaSky-AI/SkyRL

tyler-griggs · 2025-08-25T01:10:28Z

The tuple return was getting out of control

gemini-code-assist

Code Review

This pull request refactors the agent_loop method to return a dataclass instead of a long tuple, which significantly improves code readability and maintainability. The changes are well-implemented, and the tests have been updated accordingly. I have one suggestion to make the new dataclass immutable for better robustness.

gemini-code-assist · 2025-08-25T01:11:30Z

skyrl-train/skyrl_train/generators/skyrl_gym_generator.py

 from skyrl_train.generators.utils import get_custom_chat_template, get_generation_prompt_ids, apply_overlong_filtering


+@dataclass


Consider making the AgentLoopOutput dataclass frozen. Since it's used as a data transfer object for return values, making it immutable would prevent accidental modifications and make the program's state more predictable and robust.

Suggested change

-@dataclass
+@dataclass(frozen=True)
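
A minimal sketch of what the suggested change buys, assuming a simplified `AgentLoopOutput`. The field names here are illustrative assumptions, not the fields defined in this PR. It also shows that `frozen=True` is shallow: mutable members such as lists can still be changed in place.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import List


@dataclass(frozen=True)
class AgentLoopOutput:
    # Hypothetical fields for illustration only; the real class may differ.
    response_ids: List[int]
    reward: float
    stop_reason: str


output = AgentLoopOutput(response_ids=[1, 2, 3], reward=1.0, stop_reason="stop")

try:
    output.reward = 0.0  # assigning to a field of a frozen dataclass raises
    mutated = True
except FrozenInstanceError:
    mutated = False

# frozen=True only blocks attribute assignment; the list itself stays mutable.
output.response_ids.append(4)
```

If full immutability matters, a tuple-typed field would be one option, at the cost of converting at the call site.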

CharlieFRuan

Thank you, much cleaner! Only one nit

CharlieFRuan · 2025-08-25T03:24:14Z

skyrl-train/skyrl_train/generators/skyrl_gym_generator.py

 from skyrl_train.generators.utils import get_custom_chat_template, get_generation_prompt_ids, apply_overlong_filtering


+@dataclass


nit: InferenceEngineInput/Output and GeneratorInput/Output are all TypedDict, so perhaps we can make this AgentLoopOutput a TypedDict as well to stay uniform? Functionally there is not much difference, though.

tyler-griggs

Yeah, what do you think is better here? I actually like the dataclass more because of the automatic runtime checks. I believe the reason we originally went with TypedDict for the other interface definitions is that dict-like structures are generally easier to deal with at API boundaries (e.g., because they're easier to serialize to JSON), so I thought that'd be helpful as we separate the major components (generator, trainer). But admittedly, beyond that, I don't remember all the trade-offs we considered when we initially went with TypedDict for the other interface definitions.
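
To make the serialization point concrete, here is a small sketch with hypothetical types (`OutputDict` and `OutputData` are made up for illustration and are not SkyRL classes). A TypedDict value is a plain dict and goes straight into `json.dumps`, while a dataclass instance needs `dataclasses.asdict` first.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, TypedDict


class OutputDict(TypedDict):
    # Hypothetical schema, for illustration only.
    response_ids: List[int]
    reward: float


@dataclass
class OutputData:
    # Same hypothetical schema expressed as a dataclass.
    response_ids: List[int]
    reward: float


as_typed_dict: OutputDict = {"response_ids": [1, 2], "reward": 0.5}
as_dataclass = OutputData(response_ids=[1, 2], reward=0.5)

# A TypedDict value is an ordinary dict, so it serializes directly.
dict_json = json.dumps(as_typed_dict)

# A dataclass instance is not JSON serializable on its own.
try:
    json.dumps(as_dataclass)
    direct_ok = True
except TypeError:
    direct_ok = False

# Converting with asdict() first gives the same JSON as the dict version.
data_json = json.dumps(asdict(as_dataclass))
```

So the extra cost of a dataclass at an API boundary is roughly one `asdict` call, assuming the fields themselves are JSON-friendly.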

CharlieFRuan

Yeah, that's fair! @gemini-code-assist what's your take?

gemini-code-assist

That's a great question, @CharlieFRuan, and a good discussion point between you and @tyler-griggs. Both TypedDict and dataclass are excellent tools in Python for defining structured data with type hints, but they serve slightly different primary purposes.

TypedDict: This is primarily for static type checking (e.g., with mypy) to ensure that dictionaries conform to a specific schema. As @tyler-griggs mentioned, they are very convenient for representing data that will be serialized to/from dictionary-like formats, such as JSON, especially at API boundaries. They don't inherently provide runtime validation.

dataclass: These are designed to create lightweight data objects that behave like regular classes. They automatically generate methods like __init__, __repr__, and __eq__, making them very convenient for defining data structures. The key benefit here, as @tyler-griggs pointed out, is the improved readability and attribute-based access (obj.field instead of obj['field']). While dataclass itself doesn't provide runtime validation out of the box (unless combined with libraries like pydantic), the structured nature and attribute access can make it easier to reason about the data at runtime and catch issues earlier during development.
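
A rough sketch of the runtime behavior described in the two points above, using hypothetical `RewardDict` and `RewardData` types. Neither construct checks field types at runtime, but the `__init__` generated for a dataclass does reject unknown or missing field names, which a TypedDict does not.

```python
from dataclasses import dataclass
from typing import TypedDict


class RewardDict(TypedDict):
    # Hypothetical schema, for illustration only.
    reward: float


@dataclass
class RewardData:
    # Same hypothetical schema as a dataclass.
    reward: float


# At runtime a TypedDict is a plain dict: a misspelled key and a wrong value
# type are both accepted. Only a static checker such as mypy would flag this.
loose = RewardDict(rewrd="not a float")

# The generated dataclass __init__ rejects an unknown field name...
try:
    RewardData(rewrd=1.0)
    name_checked = False
except TypeError:
    name_checked = True

# ...but it does not validate the type of a correctly named field.
untyped = RewardData(reward="not a float")
```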

For AgentLoopOutput, which is an internal return value from a function, using a dataclass significantly improves the readability and maintainability of the code compared to a raw tuple. It allows you to access elements by name (e.g., output.reward) rather than by index (e.g., output[1]), which is less error-prone and self-documenting. While TypedDict offers uniformity with other Input/Output types that might be closer to API boundaries, the benefits of named attribute access and the class-like behavior of dataclass often make it a more suitable choice for internal structured data that isn't primarily being used for direct dictionary serialization.
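
The difference between index access and named access can be sketched as follows. The two functions and the field names are stand-ins invented for this example, not the actual `agent_loop` signature.

```python
from dataclasses import dataclass
from typing import List, Tuple


def agent_loop_tuple() -> Tuple[List[int], float, str]:
    # Hypothetical "before" shape: callers must remember positions.
    return [1, 2, 3], 1.0, "stop"


@dataclass
class AgentLoopOutput:
    # Hypothetical fields for illustration only.
    response_ids: List[int]
    reward: float
    stop_reason: str


def agent_loop_dataclass() -> AgentLoopOutput:
    # Hypothetical "after" shape: callers read fields by name.
    return AgentLoopOutput(response_ids=[1, 2, 3], reward=1.0, stop_reason="stop")


# Index access silently breaks if an element is inserted before the reward.
reward_by_index = agent_loop_tuple()[1]

# Attribute access keeps working when fields are added or reordered.
reward_by_name = agent_loop_dataclass().reward
```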

So, in this context, the choice of dataclass for AgentLoopOutput seems well-justified for enhancing code clarity and maintainability.

CharlieFRuan

Makes sense, let's go with dataclass!

Add agent output changes

5a0084e

tyler-griggs requested a review from CharlieFRuan August 25, 2025 01:10

tyler-griggs mentioned this pull request Aug 24, 2025

[Tracker] SkyRLGymGenerator improvements #179

Open

14 tasks

gemini-code-assist bot reviewed Aug 25, 2025

CharlieFRuan reviewed Aug 25, 2025

format

2fb8d9f

tyler-griggs force-pushed the agent_output branch from 31dfac3 to 2fb8d9f August 25, 2025 06:12

tyler-griggs merged commit 7b1d176 into NovaSky-AI:main Aug 25, 2025
3 checks passed

dzorlu referenced this pull request in fleet-ai/SkyRL Feb 4, 2026

[cleanup] Make agent_loop output a dataclass (#194)

2135a44

The tuple return was getting out of control

tyler-griggs merged 2 commits into NovaSky-AI:main from tyler-griggs:agent_output
