
Conversation


@fy1214 (Contributor) commented Jan 27, 2026

WIP: support NVFP4 in the Slime RL training pipeline.

@zianglih commented Feb 9, 2026

If using nvfp4 for training (fprop + wgrad + dgrad), it's better to stick to the original TE nvfp4 recipe (https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#NVFP4-training-recipe): 2D weight quantization (which better preserves the chain rule), Random Hadamard transforms, and stochastic rounding.
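For reference, a minimal sketch of opting into that recipe in Transformer Engine. The recipe class name `NVFP4BlockScaling` and its default settings are assumptions based on the linked primer, so treat this as illustrative rather than the exact API; verify against your TE version.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import NVFP4BlockScaling  # assumed name, per the primer

# Per the primer, this recipe pairs 2D weight quantization with Random
# Hadamard transforms and stochastic rounding on the backward pass.
recipe = NVFP4BlockScaling()

layer = te.Linear(1024, 1024, params_dtype=torch.bfloat16).cuda()
inp = torch.randn(16, 1024, device="cuda", dtype=torch.bfloat16)

# Same autocast entry point as the fp8 recipes; the recipe object
# selects the nvfp4 quantization scheme.
with te.fp8_autocast(enabled=True, fp8_recipe=recipe):
    out = layer(inp)
out.sum().backward()
```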

Additionally, I have a TE PR (NVIDIA/TransformerEngine#2644) that allows using nvfp4 for fprop only, while keeping wgrad and dgrad in bf16. In that case, since the backward pass no longer runs in nvfp4, 2D weight quantization isn't needed to preserve the chain rule, and 1D weight quantization gives lower quantization error for free. And since the backward is not in nvfp4, Random Hadamard transforms and stochastic rounding can also be disabled.
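To illustrate the fprop-only idea without guessing at the PR's actual API, here is a conceptual plain-PyTorch sketch. `fake_nvfp4` and `FP4FpropLinear` are hypothetical helpers, not TE code: the forward GEMM sees quantized operands, while the backward reuses the saved bf16 tensors, which is why 1D weight scaling suffices and RHT/stochastic rounding can be dropped.

```python
import torch

def fake_nvfp4(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical placeholder quantizer: real nvfp4 packs e2m1 values with
    # per-block scales; here we round-trip through a coarse grid so the
    # forward pass sees quantization error.
    scale = x.abs().amax().clamp(min=1e-8) / 6.0  # 6.0 = largest |e2m1| value
    return (x / scale).round().clamp(-6, 6) * scale

class FP4FpropLinear(torch.autograd.Function):
    """Forward GEMM on quantized operands; backward entirely in bf16."""

    @staticmethod
    def forward(ctx, inp, weight):
        ctx.save_for_backward(inp, weight)  # save the *bf16* tensors for bwd
        return fake_nvfp4(inp) @ fake_nvfp4(weight).t()

    @staticmethod
    def backward(ctx, grad_out):
        inp, weight = ctx.saved_tensors
        # dgrad and wgrad run on the original bf16 tensors, so no RHT or
        # stochastic rounding is needed to keep gradients well-behaved.
        return grad_out @ weight, grad_out.t() @ inp

# Usage: inp [M, K] @ weight [N, K]^T -> out [M, N]
inp = torch.randn(16, 1024, dtype=torch.bfloat16, requires_grad=True)
weight = torch.randn(1024, 1024, dtype=torch.bfloat16, requires_grad=True)
out = FP4FpropLinear.apply(inp, weight)
out.sum().backward()
```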

