@calvin197

What this PR does / why we need it:

Overview

This change adds a new MigStrategy variable to the NodeBootstrappingConfiguration (NBC) to support configurable MIG (Multi-Instance GPU) strategies for the nvidia-device-plugin.

Previously, when GPUInstanceProfile was set, the MIG strategy was hardcoded to single. This PR enables the Resource Provider (RP) to explicitly pass the desired MIG strategy via NBC, allowing users to control how MIG devices are exposed to Kubernetes.

Supported MIG Strategies

  • None
    MIG is not enabled.

  • Single (default)
    All MIG devices are exposed as generic resources:
    nvidia.com/gpu

  • Mixed
    MIG devices are exposed with specific resource types, such as:
    nvidia.com/mig-1g.5gb
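
As a rough illustration of the list above, the sketch below maps a strategy and a MIG profile to the resource name a workload would request. The function name `mig_resource_name` and its arguments are hypothetical and do not come from this PR; only the resource names `nvidia.com/gpu` and `nvidia.com/mig-1g.5gb` are taken from the description. `None` is left out because MIG is not enabled in that case.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): print the Kubernetes resource
# name that a MIG device would be exposed under for a given strategy.
#   $1: MIG strategy as named in the PR description ("Single" or "Mixed")
#   $2: MIG profile, e.g. "1g.5gb"
mig_resource_name() {
    strategy="$1"
    profile="$2"
    case "$strategy" in
        # Single: every MIG device appears as the generic GPU resource.
        Single) echo "nvidia.com/gpu" ;;
        # Mixed: the resource name carries the MIG profile.
        Mixed)  echo "nvidia.com/mig-${profile}" ;;
        # Anything else (including None) has no MIG resource in this sketch.
        *)      return 1 ;;
    esac
}

mig_resource_name Single 1g.5gb   # prints nvidia.com/gpu
mig_resource_name Mixed 1g.5gb    # prints nvidia.com/mig-1g.5gb
```

With `Mixed`, a pod spec would need to request the profile-specific resource rather than `nvidia.com/gpu`, which is why the PR keeps `Single` as the default.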

Changes

  • Add mig_strategy field to the GPU config proto (scriptless NBC path)
  • Add MigStrategy field to NodeBootstrappingConfiguration (legacy NBC path)
  • Add NVIDIA_MIG_STRATEGY environment variable to CSE
  • Update startNvidiaManagedExpServices() to configure the nvidia-device-plugin using the new MIG strategy variable

Backward Compatibility

  • The default strategy remains Single, so nodes that do not set MigStrategy behave as before
  • Mixed is only used when explicitly specified via NBC

Which issue(s) this PR fixes:

Fixes #

ai prompt for claude
should not enable managed GPU in the original test
Reapply "Update work-prompt.md"

This reverts commit 28f0bd0.
  - Add mig_strategy field to gpu_config.proto with field number 8
  - Add NVIDIA_MIG_STRATEGY environment variable mapping in parser.go
  - Add MigStrategy field to NodeBootstrappingConfiguration in types.go
  - Add GetMigStrategy template function in baker.go
  - Add NVIDIA_MIG_STRATEGY variable in cse_cmd.sh
  - Add comprehensive tests for Mixed, Single, and None strategy values
…_MIG_STRATEGY environment variable instead of hardcoded 'single' strategy.

  - Add conditional logic to map RP values to device plugin flags:
    * "Mixed" -> --mig-strategy mixed
    * "Single"/"None"/empty -> --mig-strategy single (default)
  - Add comprehensive comments explaining MIG strategy behavior
  - Maintain backward compatibility with safe defaults
  - Only use "mixed" when explicitly specified for safety
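
The mapping described in this commit message could look roughly like the sketch below. This is an assumption-laden illustration, not the code from the PR: the helper name `mig_strategy_flag` is hypothetical, the matching is assumed to be case-sensitive, and unrecognized values are assumed to fall back to `single` along with `Single`, `None`, and empty. Only the `NVIDIA_MIG_STRATEGY` variable and the `--mig-strategy` values are taken from the text above.

```shell
#!/bin/sh
# Hypothetical helper (name and structure are illustrative, not from the PR):
# translate the RP-provided strategy into the device plugin flag.
#   $1: value of NVIDIA_MIG_STRATEGY (may be empty)
mig_strategy_flag() {
    case "$1" in
        # "mixed" is used only when the RP asks for it explicitly.
        Mixed) echo "--mig-strategy mixed" ;;
        # "Single", "None", empty, and (by assumption) anything else
        # keep the previous hardcoded behavior.
        *)     echo "--mig-strategy single" ;;
    esac
}

# Treat an unset variable the same as an empty one.
NVIDIA_MIG_STRATEGY="${NVIDIA_MIG_STRATEGY:-}"
echo "device plugin args: $(mig_strategy_flag "$NVIDIA_MIG_STRATEGY")"
```

Placing the safe default in the catch-all branch means a missing or misspelled value can never switch a node to `mixed` by accident.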
This reverts commit e9433c9.
@ganeshkumarashok changed the title from "feat: add MigStrategy NBC variable for managed GPU experience" to "feat: add MigStrategy and EnableManagedGPU NBC variables for managed GPU experience" on Jan 26, 2026
@lilypan26 merged commit 53b9806 into main on Jan 27, 2026
36 of 42 checks passed
@lilypan26 deleted the calvinshum/managed-gpu/mig-mode branch on January 27, 2026 03:32