📝 Walkthrough

Extracted GPU-selection helpers and changed two-node disaggregated GPU selection; added boolean-aware vLLM CLI flag serialization; implemented vLLM parallelism constraint checks validating prefill/decode roles; updated tests for flag serialization and constraint scenarios.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
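The boolean-aware CLI flag serialization mentioned in the walkthrough could look like the following sketch. The function name and exact rules are assumptions for illustration, not the actual implementation: `True` emits a bare flag, `False`/`None` emits nothing, and any other value emits the flag followed by its stringified value.

```python
def serialize_flag(name: str, value: object) -> list[str]:
    """Hypothetical boolean-aware vLLM CLI flag serializer.

    True       -> bare flag (e.g. --enable-expert-parallel)
    False/None -> omitted entirely
    other      -> flag followed by the stringified value
    """
    flag = "--" + name.replace("_", "-")
    if value is None or value is False:
        return []
    if value is True:
        return [flag]
    return [flag, str(value)]
```

This keeps boolean options out of the command line unless explicitly enabled, which matches the usual vLLM CLI convention of store-true flags.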
🚥 Pre-merge checks: ✅ 2 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/workloads/vllm/vllm.py`:
- Around lines 96-129: `_validate_vllm_parallelism_constraints` uses a nested `getattr` to fetch `expert_parallel` with a fallback to `enable_expert_parallel`, which adds unnecessary complexity. Assign `ep_enabled` by reading the canonical field directly (e.g., `ep_enabled = cast(bool, getattr(args, "enable_expert_parallel", False))`) and remove the nested `getattr`. If `expert_parallel` is intended as a supported alias, document it and update the tests to cover the alias rather than keeping the nested lookup in validation logic.
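A minimal sketch of the suggested direct read; the `Args` dataclass here is a hypothetical stand-in for the real args model, used only to show the behavior:

```python
from dataclasses import dataclass
from typing import cast


@dataclass
class Args:
    # Hypothetical stand-in for the real vLLM args model.
    enable_expert_parallel: bool = False


def read_ep_enabled(args: object) -> bool:
    # Read the canonical field directly; no nested getattr fallback to an alias.
    return cast(bool, getattr(args, "enable_expert_parallel", False))
```

Objects without the field fall through to the `False` default, so the single `getattr` covers both the set and the missing cases.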
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 517f1db1-3c09-4041-80af-b82be6b2c21e
📒 Files selected for processing (3)
- src/cloudai/workloads/common/llm_serving.py
- src/cloudai/workloads/vllm/vllm.py
- tests/workloads/vllm/test_workload.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/workloads/vllm/vllm.py`:
- Around lines 98-104: the code applies `cast(...)` to `args` fields (tp, pp, dp), but if an attribute exists and is `None`, the arithmetic in `constraint1 = (tp * pp * dp) <= gpu_count` raises a `TypeError`. Normalize and validate these parallelism values before using them: check `args.tensor_parallel_size`, `args.pipeline_parallel_size`, and `args.data_parallel_size` for `None`, substitute the intended default (or raise/return a clear error), and convert each to `int` prior to computing `constraint1` so the multiplication never receives `None`. Keep the `ep_enabled` and `all2all_backend` handling unchanged.
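One way to sketch the suggested normalization. The field names follow the comment; the default of 1 for missing or `None` values is an assumption — the real code may prefer to raise a clear error instead:

```python
def parallelism_fits(args: object, gpu_count: int) -> bool:
    """Coerce tp/pp/dp to ints (defaulting None/missing to 1) before the
    product check, so the multiplication never receives None."""
    sizes = []
    for name in (
        "tensor_parallel_size",
        "pipeline_parallel_size",
        "data_parallel_size",
    ):
        raw = getattr(args, name, None)
        sizes.append(1 if raw is None else int(raw))
    tp, pp, dp = sizes
    return (tp * pp * dp) <= gpu_count
```

With values normalized up front, the constraint check itself stays a plain integer comparison and needs no further guards.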
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8ab3f09f-d909-4fa1-8646-c7859874fbb9
📒 Files selected for processing (1)
- src/cloudai/workloads/vllm/vllm.py
Summary
For vLLM:
Test Plan
Additional Notes