[cleanup] Make agent_loop output a dataclass#194
[cleanup] Make agent_loop output a dataclass#194tyler-griggs merged 2 commits intoNovaSky-AI:mainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the agent_loop method to return a dataclass instead of a long tuple, which significantly improves code readability and maintainability. The changes are well-implemented, and the tests have been updated accordingly. I have one suggestion to make the new dataclass immutable for better robustness.
| from skyrl_train.generators.utils import get_custom_chat_template, get_generation_prompt_ids, apply_overlong_filtering | ||
|
|
||
|
|
||
| @dataclass |
There was a problem hiding this comment.
CharlieFRuan
left a comment
There was a problem hiding this comment.
Thank you, much cleaner! Only one nit
| from skyrl_train.generators.utils import get_custom_chat_template, get_generation_prompt_ids, apply_overlong_filtering | ||
|
|
||
|
|
||
| @dataclass |
There was a problem hiding this comment.
nit: InferenceEngineInput/Output and GeneratorInput/Output are all TypedDict, perhaps we can make this AgentLoopOutput a TypedDict as well to stay uniformed? Though functionally not much difference.
There was a problem hiding this comment.
Yeah, what do you think is better here? I actually like the dataclass more because of the automatic runtime checks. I believe the reason we went with TypedDict for the other interface definitions originally is because dict-like structures are generally easier to deal with at API boundaries (eg, because they're easier to serialize to json), so I thought that'd be helpful as we separate the different major components (generator, trainer). But admittedly, beyond that, I don't remember all trade-offs considered when we initially went with TypedDict for other interface definitions.
There was a problem hiding this comment.
Yeah that's fair! @gemini-code-assist what's your take
There was a problem hiding this comment.
That's a great question, @CharlieFRuan, and a good discussion point between you and @tyler-griggs. Both TypedDict and dataclass are excellent tools in Python for defining structured data with type hints, but they serve slightly different primary purposes.
-
TypedDict: This is primarily for static type checking (e.g., withmypy) to ensure that dictionaries conform to a specific schema. As @tyler-griggs mentioned, they are very convenient for representing data that will be serialized to/from dictionary-like formats, such as JSON, especially at API boundaries. They don't inherently provide runtime validation. -
dataclass: These are designed to create lightweight data objects that behave like regular classes. They automatically generate methods like__init__,__repr__, and__eq__, making them very convenient for defining data structures. The key benefit here, as @tyler-griggs pointed out, is the improved readability and attribute-based access (obj.fieldinstead ofobj['field']). Whiledataclassitself doesn't provide runtime validation out of the box (unless combined with libraries likepydantic), the structured nature and attribute access can make it easier to reason about the data at runtime and catch issues earlier during development.
For AgentLoopOutput, which is an internal return value from a function, using a dataclass significantly improves the readability and maintainability of the code compared to a raw tuple. It allows you to access elements by name (e.g., output.reward) rather than by index (e.g., output[1]), which is less error-prone and self-documenting. While TypedDict offers uniformity with other Input/Output types that might be closer to API boundaries, the benefits of named attribute access and the class-like behavior of dataclass often make it a more suitable choice for internal structured data that isn't primarily being used for direct dictionary serialization.
So, in this context, the choice of dataclass for AgentLoopOutput seems well-justified for enhancing code clarity and maintainability.
There was a problem hiding this comment.
Makes sense, let's go with dataclass!
31dfac3 to
2fb8d9f
Compare
The tuple return was getting out of control
The tuple return was getting out of control