
Remove no exp usage from logical rule Part II#3603

Open
NuojCheng wants to merge 1 commit into main from chengnuojin-no-exp2

Conversation


@NuojCheng NuojCheng commented Apr 8, 2026

Description

Follow-up to PR #3578.

This PR deprecates

  • activation_kv_batch_no_exp
  • activation_q_length_no_exp
  • activation_attn_length_no_exp

from logical names.

After this change

  • activation_kv_batch always includes the "expert" physical axis
  • activation_q_length and activation_attn_length do not include "expert"

Other logical names containing "_no_exp" will be deprecated in a follow-up PR.
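The change can be sketched with a minimal, hypothetical rule table. The axis tuples below are illustrative only (the real MaxText logical-to-physical rules live in the sharding config and differ); the point is that the "_no_exp" logical names disappear and the base names carry the intended physical axes directly:

```python
# Hypothetical sketch of the logical -> physical axis mapping after this PR.
# Axis tuples are illustrative, not the actual MaxText rules.
LOGICAL_AXIS_RULES = {
    # "expert" is now always part of the kv-batch sharding
    "activation_kv_batch": ("data", "expert"),
    # sequence-length axes no longer include "expert"
    "activation_q_length": ("sequence",),
    "activation_attn_length": ("sequence",),
}

def resolve(logical_name: str) -> tuple:
    """Return the physical mesh axes for a logical axis name.

    Rejects the deprecated "_no_exp" variants so callers migrate
    to the base names.
    """
    if logical_name.endswith("_no_exp"):
        base = logical_name[: -len("_no_exp")]
        raise ValueError(f"{logical_name} is deprecated; use {base}")
    return LOGICAL_AXIS_RULES[logical_name]
```

With this table, `resolve("activation_kv_batch")` yields a tuple containing "expert", while the length axes do not, matching the bullets above.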

Tests

CI tests covering attention.py.

Inference test:

NEW_MODEL_DESIGN=1 python src/maxtext/inference/vllm_decode.py src/maxtext/configs/post_train/rl.yml model_name=qwen3-30b-a3b tokenizer_path=Qwen/Qwen3-30B-A3B ici_tensor_parallelism=4 ici_expert_parallelism=1 enable_dp_attention=false hbm_utilization_vllm=0.3 load_parameters_path=gs://parambole-qwen3-moe-verification/unscanned/qwen3-30b-a3b-thinking-2507/14_08_2025/0/items vllm_hf_overrides='{architectures: ["MaxTextForCausalLM"]}' prompt="Suggest some famous landmarks in London." 2>&1 | tee qwen3_moe_vllm_0.log

Output: https://paste.googleplex.com/6211135167660032

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@NuojCheng NuojCheng added the draft Draft PR label Apr 8, 2026

codecov bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/maxtext/layers/attention_op.py 75.00% 1 Missing ⚠️


