[Feature]: Add support for OpenShift/OCP guidance in DCM partitioning doc #428

@leo8a

Suggestion Description

DCM partitioning doc lacks OpenShift (OCP) procedure and examples

Doc/page: Applying Partition Profiles (https://instinct.docs.amd.com/projects/gpu-operator/en/latest/dcm/applying-partition-profiles.html)

Problems:

  • The procedure reads like vanilla Kubernetes and omits OpenShift-specific details needed to succeed on OCP.
  • Examples reference kube-amd-gpu, but on OpenShift the operator is commonly deployed in openshift-amd-gpu.
  • Taint/toleration guidance is incomplete for OpenShift: a NoExecute taint can evict critical pods.
    • The tolerations guidance the doc provides (or implies) suits simple Kubernetes setups, but it does not cover OpenShift, where essential cluster DaemonSets/Deployments may run on GPU nodes. Applying a NoExecute taint can evict those components unless they explicitly tolerate it.
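
To illustrate the eviction concern, here is a minimal sketch of the standard Kubernetes toleration-matching rules, used to list which pods on a node would not tolerate a NoExecute taint. The taint `amd-dcm=up:NoExecute` and all pod names are assumptions for illustration, not taken from the doc; on a real cluster the pod list would come from the API rather than a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


def tolerates(toleration: dict, taint: Taint) -> bool:
    """Return True if a single toleration matches the taint.

    Mirrors the Kubernetes matching rules; tolerationSeconds (delayed
    eviction) is deliberately ignored in this sketch.
    """
    effect = toleration.get("effect", "")
    if effect and effect != taint.effect:
        return False  # a set effect must match; an empty effect matches any
    key = toleration.get("key", "")
    if key and key != taint.key:
        return False  # a set key must match; an empty key matches any
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.value
    return False


def pods_at_risk(pods: list[dict], taint: Taint) -> list[str]:
    """Names of pods with no toleration matching the taint."""
    return [
        pod["name"]
        for pod in pods
        if not any(tolerates(t, taint) for t in pod.get("tolerations", []))
    ]


# Assumed taint, for illustration only.
dcm_taint = Taint(key="amd-dcm", value="up", effect="NoExecute")

# Hypothetical pods running on a GPU node.
pods = [
    {"name": "tolerate-everything-daemonset",
     "tolerations": [{"operator": "Exists"}]},
    {"name": "dcm-aware-pod",
     "tolerations": [{"key": "amd-dcm", "operator": "Equal",
                      "value": "up", "effect": "NoExecute"}]},
    {"name": "noschedule-only-pod",
     "tolerations": [{"key": "amd-dcm", "operator": "Exists",
                      "effect": "NoSchedule"}]},
    {"name": "no-tolerations-pod"},
]

print(pods_at_risk(pods, dcm_taint))
# ['noschedule-only-pod', 'no-tolerations-pod']
```

The third pod shows the case that is easy to miss: a toleration scoped to NoSchedule does not protect a running pod from a NoExecute taint.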

Impact: OCP users who follow the doc get stuck, apply changes in the wrong namespace, or make wrong assumptions about cluster components.

Operating System

No response

GPU

No response

ROCm Component

No response

Labels

documentation (Improvements or additions to documentation)