Unify Megatron and FSDP training interfaces with forward_backward + optim_step#901
Merged
erictang000 merged 9 commits into NovaSky-AI:main on Jan 31, 2026
Conversation
…ptim_step

- Add forward_backward() and optim_step() methods to MegatronPolicyWorkerBase
- Update trainer to use unified interface for both strategies
- Remove strategy branching in train_critic_and_policy()
- Mark ppo_train() as deprecated (kept for backward compatibility)
- Update test_megatron_worker.py to use new interface

Co-Authored-By: Eric Tang <erictang000@gmail.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Contributor
Code Review
This pull request successfully unifies the training interfaces for Megatron and FSDP strategies by introducing forward_backward and optim_step methods in MegatronPolicyWorkerBase. The trainer.py file is updated to use this unified interface, removing strategy-specific branching, which significantly improves code maintainability and clarity. The ppo_train method is appropriately marked as deprecated, and the tests are updated to reflect these changes. Overall, this is a well-executed refactoring that aligns with the goal of creating a more consistent training pipeline.
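To illustrate the unified interface the review describes, here is a hedged sketch of what a strategy-agnostic training step could look like once both worker types expose forward_backward() and optim_step(). The DummyWorker class, train_mini_batch helper, and all metric names are hypothetical stand-ins, not the actual SkyRL worker API.

```python
# Sketch of a strategy-agnostic mini-batch step, assuming both Megatron and
# FSDP workers expose forward_backward() + optim_step(). Names other than
# those two methods are illustrative only.

class DummyWorker:
    """Stand-in for a policy worker implementing the unified interface."""

    def __init__(self):
        self.steps = 0

    def forward_backward(self, micro_batch):
        # Accumulate gradients for one micro-batch; return its loss.
        return sum(micro_batch) / len(micro_batch)

    def optim_step(self):
        # Apply the accumulated gradients once per mini-batch.
        self.steps += 1
        return {"lr": 1e-5, "step": self.steps}


def train_mini_batch(worker, micro_batches):
    # The same loop serves both strategies: no isinstance()/strategy branching.
    losses = [worker.forward_backward(mb) for mb in micro_batches]
    metrics = worker.optim_step()
    metrics["loss"] = sum(losses) / len(losses)
    return metrics


worker = DummyWorker()
metrics = train_mini_batch(worker, [[1.0, 2.0], [3.0, 5.0]])
```

The design choice being reviewed is exactly this: the trainer calls the two methods in sequence and never needs to know which distributed-training backend sits behind the worker.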
…nterface

- Remove ppo_train from MegatronPolicyWorkerBase and WorkerDispatch
- Update test_megatron_dp, test_megatron_offload to use forward_backward + optim_step
- Update test_save_load_model.py and test_save_load_checkpoint.py for unified interface
- Simplify _normalize_mini_batch_size (no longer needs policy_mini_batch_size_per_gpu)

Both FSDP and Megatron now use the same forward_backward + optim_step interface.

Co-Authored-By: Eric Tang <erictang000@gmail.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The method just set _micro_batches_accumulated = 0, which can be done directly in __init__. This removes unnecessary indirection and the vestigial mesh_rank guard. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
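A minimal before/after sketch of the simplification described in this commit, with illustrative class names (the real worker class is not shown here):

```python
# Before: a helper method that only performed a single assignment.
class Before:
    def __init__(self):
        self._reset_accumulation()  # indirection for one assignment

    def _reset_accumulation(self):
        self._micro_batches_accumulated = 0


# After: the counter is initialized directly; no helper method needed.
class After:
    def __init__(self):
        self._micro_batches_accumulated = 0
```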
erictang000 reviewed on Jan 20, 2026
…ggs/megatron_refactor
Collaborator


Summary

- Add forward_backward() and optim_step() methods to MegatronPolicyWorkerBase to match the FSDP worker interface
- Mark ppo_train() as deprecated (kept for backward compatibility)
- Update test_megatron_worker.py to use the new interface
- Add get_lr and set_lr to the Megatron worker, in line with the behavior from "Add set_lr() for dynamic learning rate updates from Tinker" (#978)

This brings Megatron up to parity with FSDP following the refactoring in PR #859.
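The get_lr/set_lr parity mentioned above can be sketched as follows. This is a hedged stand-in, not the actual Megatron worker implementation; only the method names mirror the interface being added.

```python
# Illustrative sketch of a worker exposing learning-rate get/set, assuming
# the interface added for parity with the FSDP worker (names only; the real
# Megatron optimizer wiring is more involved).

class LRHolder:
    """Minimal stand-in for a worker with dynamic LR control."""

    def __init__(self, lr=1e-5):
        self._lr = lr

    def get_lr(self):
        return self._lr

    def set_lr(self, lr):
        # Supports externally driven LR updates (the PR #978 use case).
        self._lr = lr


w = LRHolder()
w.set_lr(3e-6)
```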
Test plan

- Ran test_megatron_worker.py to verify forward_backward + optim_step works correctly

Co-Authored-By: Eric Tang <erictang000@gmail.com>