Skip to content

Conversation

@RolaoDenthu
Copy link

@RolaoDenthu RolaoDenthu commented Dec 21, 2025

What does this PR do ?

Add comprehensive test coverage for SGLang generation backend, including functional tests, unit tests, and nightly tests.

  • Functional Test (tests/functional/grpo_sglang.sh): Quick validation of SGLang-based GRPO training
  • Unit Tests (tests/unit/models/generation/test_sglang_generation.py): unit tests covering:
    • Basic configuration validation
    • Policy generation and tensor parallelism
    • Worker seed behavior for RLHF diversity
    • HTTP server direct API access
    • Weight updates with DTensor policy (colocated mode)
    • Prefix cache reset after weight updates
  • Nightly Test (tests/test_suites/llm/grpo-qwen3-0.6b-1n8g-sglang.sh): End-to-end convergence test for SGLang backend

Usage

  • You can potentially add a usage example below
# Run functional test
uv add coverage
bash tests/functional/grpo_sglang.sh

# Run unit tests
uv sync --extra sglang --group test
uv run python -m pytest tests/unit/models/generation/test_sglang_generation.py -v --sglang-only

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Summary by CodeRabbit

Release Notes

  • New Features

    • Distributed generation engine using SGLang backend with HTTP weight streaming and multi-GPU support.
  • Configuration

    • New YAML configuration templates for SGLang-based experiments with customizable generation parameters.
  • Tests

    • Comprehensive test coverage for SGLang generation, including tensor parallelism, batching, and dynamic weight updates.

✏️ Tip: You can customize this high-level summary in your review settings.

PrinsYin and others added 30 commits December 6, 2025 21:12
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
sglang: add 1B example
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
- Convert SGLangConfig from regular class to TypedDict inheriting GenerationConfig
- Align structure with VllmConfig pattern for consistency
- Mark all fields as NotRequired for backward compatibility
- Add sglang_kwargs field for additional ServerArgs parameters
- Add type casting in grpo.py for type safety

This maintains backward compatibility while aligning with the existing
generation config structure pattern.

Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: Zhuoran Yin <[email protected]>
Signed-off-by: RolaoDenthu <[email protected]>
@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 5, 2026
@guyueh1
Copy link
Contributor

guyueh1 commented Jan 5, 2026

quote lint check

File nemo_rl/models/generation/sglang/init.py has zero errors but is not in pyrefly.toml in the 'project-includes' list. Please add it to this whitelist.

@guyueh1
Copy link
Contributor

guyueh1 commented Jan 5, 2026

quote build-container job

The lockfile at uv.lock needs to be updated, but --locked was provided. To update the lockfile, run uv lock.
You need to try locally build docker first to make sure it works (check docs/docker.md for instructions). During the process you will find you need to update the uv.lock file, add that change to the PR

Signed-off-by: RolaoDenthu <[email protected]>
Signed-off-by: RolaoDenthu <[email protected]>
@RolaoDenthu
Copy link
Author

quote build-container job

The lockfile at uv.lock needs to be updated, but --locked was provided. To update the lockfile, run uv lock.
You need to try locally build docker first to make sure it works (check docs/docker.md for instructions). During the process you will find you need to update the uv.lock file, add that change to the PR

I have updated the uv.lock and pyrefly.toml.

@guyueh1
Copy link
Contributor

guyueh1 commented Jan 6, 2026

@RolaoDenthu lint seems to be complaining a new file not added to the pyrefly.toml
Please check the test log, it has the shell command, please run it locally to make sure it passes, or use the pre-commit hook (as here )

@github-actions
Copy link

github-actions bot commented Jan 6, 2026

⚠️ File Consistency Check

Check based on commit: 0513cbf (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters:
These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

  • Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
  • If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

  • Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 6, 2026
Signed-off-by: RolaoDenthu <[email protected]>
Signed-off-by: RolaoDenthu <[email protected]>
@RolaoDenthu
Copy link
Author

@guyueh1 Hi I’ve fixed the environment and it should now be able to run with the current uv lock. Could you please restart the test?

@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 7, 2026
@guyueh1 guyueh1 added CI:L2 Run doctests, unit tests, functional tests, and convergence tests and removed CI:L2 Run doctests, unit tests, functional tests, and convergence tests labels Jan 8, 2026
@github-actions
Copy link

github-actions bot commented Jan 8, 2026

⚠️ File Consistency Check

Check based on commit: 570f996 (PR #1674 from add-tests)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters:
These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

  • Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
  • If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

  • Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

@terrykong
Copy link
Contributor

hey @RolaoDenthu . i'm trying to get the examples in this PR to run and running into some issues. I'll submit a PR against your branch tomorrow with some fixes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:L2 Run doctests, unit tests, functional tests, and convergence tests community-request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants