[Cleanup] Many small fixes and improvements #64

Merged
tyler-griggs merged 4 commits into main from tgriggs/small-fixes
Jul 7, 2025

@tyler-griggs tyler-griggs commented Jul 7, 2025

Iterating on several small pieces of feedback from users, especially around clarity of documentation for creating a new environment.

This PR also changes the default setting of generator.batched to false. Reasoning: a couple of users have been surprised by low performance on tool-use and multi-turn tasks. The root cause was simply that batched was left at its default setting of true. Because we focus on multi-turn, tool-use agentic tasks (rather than single-turn tasks), the default should be false.
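
For anyone running single-turn tasks who wants the previous behavior back, here is a minimal sketch of overriding the new default. Only the `generator.batched` key comes from this PR; the `DEFAULTS` dict and the `apply_override` helper are hypothetical stand-ins for illustration and are not the project's actual config loader.

```python
from copy import deepcopy

# Hypothetical defaults mirroring the change in this PR. Only the
# `generator.batched` key is taken from the PR description.
DEFAULTS = {"generator": {"batched": False}}


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one `dotted.key=value` override applied."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    updated = deepcopy(config)
    node = updated
    # Walk (and create, if missing) the nested sections named by the dotted key.
    for name in parents:
        node = node.setdefault(name, {})
    # Interpret "true"/"false" as booleans; keep any other value as a string.
    lowered = raw_value.strip().lower()
    node[leaf] = {"true": True, "false": False}.get(lowered, raw_value)
    return updated


# Single-turn workloads can opt back in to batched generation explicitly.
single_turn_config = apply_override(DEFAULTS, "generator.batched=true")
```

The override returns a new dict, so the multi-turn default stays untouched for other runs.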

Wandb link for gsm8k: https://wandb.ai/sky-posttraining-uc-berkeley/gsm8k/runs/navm8ull?nw=nwuserskyposttraining

@erictang000 erictang000 self-requested a review July 7, 2025 21:00
@erictang000 erictang000 self-assigned this Jul 7, 2025

@erictang000 erictang000 left a comment

nice, everything makes sense to me, thanks!

Could you double check/link to a training run (multiply or gsm8k with the latest changes) just to make sure nothing is breaking?

@tyler-griggs tyler-griggs merged commit 956533a into main Jul 7, 2025
3 checks passed
@SumanthRH SumanthRH deleted the tgriggs/small-fixes branch July 16, 2025 23:20
fannie1208 pushed a commit to vinid/SkyRL that referenced this pull request Aug 19, 2025