feat: Add SGLang rollout backend and tests #1674

RolaoDenthu · 2025-12-21T02:58:50Z

What does this PR do?

Add comprehensive test coverage for SGLang generation backend, including functional tests, unit tests, and nightly tests.

Functional Test (tests/functional/grpo_sglang.sh): Quick validation of SGLang-based GRPO training
Unit Tests (tests/unit/models/generation/test_sglang_generation.py), covering:
- Basic configuration validation
- Policy generation and tensor parallelism
- Worker seed behavior for RLHF diversity
- HTTP server direct API access
- Weight updates with DTensor policy (colocated mode)
- Prefix cache reset after weight updates
Nightly Test (tests/test_suites/llm/grpo-qwen3-0.6b-1n8g-sglang.sh): End-to-end convergence test for SGLang backend

Usage

# Run functional test
uv add coverage
bash tests/functional/grpo_sglang.sh

# Run unit tests
uv sync --extra sglang --group test
uv run python -m pytest tests/unit/models/generation/test_sglang_generation.py -v --sglang-only
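
The `--sglang-only` flag in the command above is a custom pytest option. The PR's actual conftest is not shown on this page, so the following is only a minimal sketch of how such a flag could be wired up with standard pytest hooks. The marker name `sglang` and the helper `split_by_marker` are assumptions made for illustration.

```python
# conftest.py (sketch): opt-in filtering for SGLang tests.
# Assumption: SGLang tests carry a marker named "sglang"; the real marker
# name and hook layout in the repository may differ.


def split_by_marker(items, marker_name):
    """Partition collected test items by whether they carry the marker."""
    selected, deselected = [], []
    for item in items:
        if item.get_closest_marker(marker_name) is not None:
            selected.append(item)
        else:
            deselected.append(item)
    return selected, deselected


def pytest_addoption(parser):
    # Register the command-line flag used in the usage example.
    parser.addoption(
        "--sglang-only",
        action="store_true",
        default=False,
        help="Run only tests marked with the sglang marker",
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "sglang: tests that need the SGLang backend")


def pytest_collection_modifyitems(config, items):
    # Without the flag, leave the collected items untouched.
    if not config.getoption("--sglang-only"):
        return
    selected, deselected = split_by_marker(items, "sglang")
    if deselected:
        # Report the dropped tests as deselected rather than silently hiding them.
        config.hook.pytest_deselected(items=deselected)
    items[:] = selected
```

Keeping the partitioning in a plain function makes the selection logic testable without starting a pytest session.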

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Summary by CodeRabbit

Release Notes

New Features
- Distributed generation engine using SGLang backend with HTTP weight streaming and multi-GPU support.
Configuration
- New YAML configuration templates for SGLang-based experiments with customizable generation parameters.
Tests
- Comprehensive test coverage for SGLang generation, including tensor parallelism, batching, and dynamic weight updates.

- Convert SGLangConfig from regular class to TypedDict inheriting GenerationConfig
- Align structure with VllmConfig pattern for consistency
- Mark all fields as NotRequired for backward compatibility
- Add sglang_kwargs field for additional ServerArgs parameters
- Add type casting in grpo.py for type safety

This maintains backward compatibility while aligning with the existing generation config structure pattern.

Signed-off-by: Zhuoran Yin <[email protected]>
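
The config change described in this commit message can be pictured with a small sketch. The actual `GenerationConfig` and `SGLangConfig` definitions are not shown on this page, so every field name below except `sglang_kwargs` is hypothetical, as is the helper `server_args_from_config`.

```python
from typing import Any, TypedDict, cast

try:  # NotRequired lives in typing from Python 3.11 onward
    from typing import NotRequired
except ImportError:  # older interpreters need the backport
    from typing_extensions import NotRequired


class GenerationConfig(TypedDict):
    # Hypothetical stand-in for the shared base generation config.
    backend: str
    max_new_tokens: int
    temperature: float


class SGLangConfig(GenerationConfig):
    # Every SGLang-specific field is NotRequired, so older configs that
    # omit them still type-check. Field names other than sglang_kwargs
    # are illustrative only.
    model_path: NotRequired[str]
    mem_fraction: NotRequired[float]
    sglang_kwargs: NotRequired[dict[str, Any]]


def server_args_from_config(cfg: SGLangConfig) -> dict[str, Any]:
    """Collect optional fields, then let sglang_kwargs add extra entries."""
    args: dict[str, Any] = {}
    if "model_path" in cfg:
        args["model_path"] = cfg["model_path"]
    if "mem_fraction" in cfg:
        args["mem_fraction"] = cfg["mem_fraction"]
    args.update(cfg.get("sglang_kwargs", {}))
    return args


# A plain dict (for example, loaded from YAML) is narrowed with cast(),
# which is a no-op at runtime and only informs the type checker.
raw = {
    "backend": "sglang",
    "max_new_tokens": 512,
    "temperature": 1.0,
    "mem_fraction": 0.7,
    "sglang_kwargs": {"example_option": 1},  # illustrative key, not a real flag
}
cfg = cast(SGLangConfig, raw)
print(server_args_from_config(cfg))
```

Because `cast` performs no validation, a sketch like this relies on the static checker; any runtime validation would have to be added separately.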

guyueh1 · 2026-01-05T05:09:55Z

quote lint check

File nemo_rl/models/generation/sglang/__init__.py has zero errors but is not in pyrefly.toml in the 'project-includes' list. Please add it to this whitelist.

guyueh1 · 2026-01-05T05:13:53Z

quote build-container job

The lockfile at uv.lock needs to be updated, but --locked was provided. To update the lockfile, run uv lock.
You need to build the Docker image locally first to make sure it works (check docs/docker.md for instructions). During the process you will find that you need to update the uv.lock file; add that change to the PR.

RolaoDenthu · 2026-01-06T17:42:58Z

quote build-container job

The lockfile at uv.lock needs to be updated, but --locked was provided. To update the lockfile, run uv lock.
You need to build the Docker image locally first to make sure it works (check docs/docker.md for instructions). During the process you will find that you need to update the uv.lock file; add that change to the PR.

I have updated the uv.lock and pyrefly.toml.

guyueh1 · 2026-01-06T17:50:04Z

@RolaoDenthu lint seems to be complaining that a new file was not added to pyrefly.toml.
Please check the test log for the shell command and run it locally to make sure it passes, or use the pre-commit hook.

github-actions · 2026-01-06T17:50:48Z

⚠️ File Consistency Check

Check based on commit: 0513cbf (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters:
These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.
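
The rule behind this warning can be sketched in a few lines. The workflow that produces the comment is not shown on this page, so the function name `find_unsynced_pairs` and the `PAIRED_FILES` table are assumptions; only the two file paths come from the warning itself.

```python
# Sketch of a paired-file consistency check.
# Assumption: the check receives the list of files changed in the PR,
# for example from `git diff --name-only` against the base branch.

PAIRED_FILES = [
    (
        "nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py",
        "nemo_rl/models/policy/workers/dtensor_policy_worker.py",
    ),
]


def find_unsynced_pairs(changed_files, pairs=PAIRED_FILES):
    """Return (modified, not_modified) tuples where exactly one side changed."""
    changed = set(changed_files)
    warnings = []
    for first, second in pairs:
        # A pair is flagged only when one file changed and the other did not.
        if (first in changed) != (second in changed):
            if first in changed:
                warnings.append((first, second))
            else:
                warnings.append((second, first))
    return warnings


changed_in_pr = [
    "nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py",
    "nemo_rl/models/generation/sglang/config.py",
]
for modified, untouched in find_unsynced_pairs(changed_in_pr):
    print(f"Modified: {modified}")
    print(f"Not modified: {untouched}")
```

A check like this only flags the asymmetry; whether the two workers should really stay in sync is still a reviewer decision, which is why the bot asks for an explanatory comment.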

RolaoDenthu · 2026-01-07T19:34:09Z

@guyueh1 Hi, I’ve fixed the environment and it should now run with the current uv.lock. Could you please restart the test?

github-actions · 2026-01-08T00:02:33Z

⚠️ File Consistency Check

Check based on commit: 570f996 (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters:
These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

terrykong · 2026-01-08T08:26:10Z

Hey @RolaoDenthu, I'm trying to get the examples in this PR to run and am running into some issues. I'll submit a PR against your branch tomorrow with some fixes.

PrinsYin and others added 30 commits December 6, 2025 21:12

sglang support:initial commit

d9cf489

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang:manually set cuda visible to let localran=0 to manage gpus of …

3eace5f

…a server Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang: add sglang setup in grpo.py, add find available port to set u…

6fbbbb7

…p servers Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang: add shutdown

242612c

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang server: fix gpu allocation when tp =1

a3d8ad6

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

generate only first request

88971e3

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

fix : choose the correct gpu using base gpu id

db8b07b

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

asyncio to roolout all saples

dd0e54f

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

fix new event loop for rollout

21c54e3

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

added mem_fraction

5e24fab

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

modified build_sampling_paras and stop token handling

50189a9

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

temp: prevent server overlaod with semaphore

ec35b6b

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang: refactor, move async loop position

f099caa

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang: fix total length in generate

a03eba8

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang: env setup

e08cfd6

sglang: add 1B example Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

from tensor:

ccc66f6

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

sglang refit: fix sglang import

2ce928b

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

fix: match fsdp ranks correctly with sglang

4aa1e74

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

flush cache before update begins

9098077

Signed-off-by: Ryan <[email protected]> Signed-off-by: Zhuoran Yin <[email protected]>

Fix SGLang compatibility: add hasattr checks for vLLM-specific methods

9900a33

Signed-off-by: Zhuoran Yin <[email protected]>

sglang: modified config (increase mem_fration, enable wandb)

5cb78e3

Signed-off-by: Zhuoran Yin <[email protected]>

refactor(grpo): extract init logic for generation backends

03d9d0c

Signed-off-by: Zhuoran Yin <[email protected]>

refactor: generalize logger metrics for all generation backends

f1c26dd

Signed-off-by: Zhuoran Yin <[email protected]>

refactor sglang config loading to make it consistent with other backendw

255dcc6

Signed-off-by: Zhuoran Yin <[email protected]>

resolved ai comments

ee01f91

Signed-off-by: Zhuoran Yin <[email protected]>

changed print to using loging

e25e573

Signed-off-by: Zhuoran Yin <[email protected]>

Merge branch 'main' into sglang_server

e93699f

Update nemo_rl/models/generation/sglang/sglang_worker.py

85d6a92

Co-authored-by: Terry Kong <[email protected]> Signed-off-by: Night <[email protected]>

Merge branch 'main' into sglang_server

be1ae27

fix lints

c345a15

Signed-off-by: RolaoDenthu <[email protected]>

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 5, 2026

guyueh1 had a problem deploying to nemo-ci January 5, 2026 05:09 — with GitHub Actions Failure

RolaoDenthu added 2 commits January 6, 2026 17:31

add sglang init

02a7e1f

Signed-off-by: RolaoDenthu <[email protected]>

update uv.lock

5ec83e1

Signed-off-by: RolaoDenthu <[email protected]>

RolaoDenthu force-pushed the add-tests branch from 2a19c16 to 5ec83e1 Compare January 6, 2026 17:32

Add sglang/config.py to pyrefly

0513cbf

Signed-off-by: RolaoDenthu <[email protected]>

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 6, 2026

guyueh1 temporarily deployed to nemo-ci January 6, 2026 18:00 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 6, 2026 19:03 — with GitHub Actions Inactive

RolaoDenthu added 2 commits January 7, 2026 08:26

uv.lock updated

2338fdd

Signed-off-by: RolaoDenthu <[email protected]>

fix envir

b846b36

Signed-off-by: RolaoDenthu <[email protected]>

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 7, 2026

Merge branch 'main' into add-tests

570f996

guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 8, 2026

guyueh1 temporarily deployed to nemo-ci January 8, 2026 00:02 — with GitHub Actions Inactive

guyueh1 temporarily deployed to nemo-ci January 8, 2026 00:36 — with GitHub Actions Inactive

add sglang-only marker filtering

ae4bc44

Signed-off-by: RolaoDenthu <[email protected]>
