
Conversation

@SinaChavoshi (Member)

This PR introduces new example blueprints to facilitate the deployment of GKE clusters optimized for the Inference Gateway on A3 Mega and A3 Ultra machine types.

These changes replicate the existing pattern from the gke-a3-highgpu-inference-gateway.yaml blueprint.

Key Changes:

  • New Blueprint for A3 Mega: Added examples/gke-a3-megagpu/gke-a3-megagpu-inference-gateway.yaml and a corresponding deployment file. This blueprint configures a GKE cluster with the necessary REGIONAL_MANAGED_PROXY subnet and enables the enable_inference_gateway flag (both sketched below).
  • New Blueprint for A3 Ultra: Added examples/gke-a3-ultragpu/gke-a3-ultragpu-inference-gateway.yaml and a corresponding deployment file, following the same pattern as the A3 Mega blueprint.
  • Documentation: Updated examples/README.md to include descriptions and links for the new gke-a3-megagpu-inference-gateway.yaml and gke-a3-ultragpu-inference-gateway.yaml blueprints.

These additions make it easier for users to get started with Inference Gateway on the latest A3 Mega and A3 Ultra hardware.
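
The inference-gateway-specific configuration in both new blueprints boils down to two pieces, sketched here for orientation (a minimal illustration assembled from this PR's diffs; subnet names and IP ranges differ between the Mega and Ultra files):

    # In the VPC module: a dedicated proxy-only subnet for the gateway load balancer
    subnetworks:
    - subnet_name: $(vars.deployment_name)-proxy-subnet
      subnet_region: $(vars.region)
      subnet_ip: 10.129.0.0/24
      purpose: REGIONAL_MANAGED_PROXY
      role: ACTIVE

    # In the gke-cluster module: enable the Inference Gateway integration
    enable_inference_gateway: true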

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch (see the example commands after this list)
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines
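
For the pre-commit and test steps, a typical local workflow is `pre-commit install` once, then `pre-commit run --all-files` and `make tests` from the repository root; defer to the contribution guidelines if the repository's targets differ.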

@SinaChavoshi SinaChavoshi requested review from a team and samskillman as code owners December 16, 2025 00:24
@gemini-code-assist (Contributor)

Summary of Changes

Hello @SinaChavoshi, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the GKE Inference Gateway capabilities within the toolkit by introducing dedicated blueprints for A3 Mega and A3 Ultra GPU machine types. These new examples streamline the deployment of GKE clusters pre-configured for high-performance inference workloads, including necessary networking components and Inference Gateway enablement, making it easier for users to leverage the latest A3 hardware.

Highlights

  • A3 Mega Inference Gateway Blueprint: A new example blueprint and deployment file have been added for deploying GKE clusters optimized for Inference Gateway on A3 Mega machine types (H100 GPUs).
  • A3 Ultra Inference Gateway Blueprint: A new example blueprint and deployment file have been added for deploying GKE clusters optimized for Inference Gateway on A3 Ultra machine types (H200 GPUs).
  • Documentation Updates: The examples/README.md has been updated to include descriptions and links for these new A3 Mega and A3 Ultra Inference Gateway blueprints.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.
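
For example, commenting /gemini review on this pull request requests a fresh code review of its current state.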

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
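
As an illustration only (the field name below is an assumption from memory; the linked documentation defines the actual schema), a repository-level configuration might look like:

    # .gemini/config.yaml — illustrative sketch, not a verified schema
    code_review:
      comment_severity_threshold: MEDIUM  # assumed option: surface only medium-or-higher findings

A natural-language style guide could then live alongside it in .gemini/styleguide.md, e.g. "Follow PEP-8 for Python code."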

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces new example blueprints for GKE Inference Gateway on A3 Mega and A3 Ultra hardware, which is a great addition. The changes correctly update the documentation in examples/README.md. However, I've identified several inconsistencies between the new megagpu and ultragpu examples in both the main blueprint files and their corresponding deployment files. A significant point of feedback is that the gke-a3-ultragpu-inference-gateway.yaml blueprint is substantially more complex than its megagpu counterpart, including many modules related to training workloads that seem out of scope for an inference-focused example. Addressing these points by simplifying the ultragpu blueprint and aligning styles across all new files will significantly improve the consistency and usability of these new examples.

Comment on lines +153 to +475
  - id: training_bucket
    source: community/modules/file-system/cloud-storage-bucket
    settings:
      local_mount: /training-data
      name_prefix: training
      random_suffix: true
      force_destroy: false
      enable_hierarchical_namespace: true

  - id: checkpoint_bucket
    source: community/modules/file-system/cloud-storage-bucket
    settings:
      local_mount: /checkpoint-data
      name_prefix: checkpoint
      random_suffix: true
      force_destroy: false
      enable_hierarchical_namespace: true

  - id: a3-ultragpu-cluster
    source: modules/scheduler/gke-cluster
    use: [gke-a3-ultra-net-0, workload_service_account]
    settings:
      system_node_pool_machine_type: "e2-standard-16"
      system_node_pool_disk_size_gb: $(vars.system_node_pool_disk_size_gb)
      system_node_pool_taints: []
      enable_dcgm_monitoring: true
      enable_gcsfuse_csi: true
      enable_managed_lustre_csi: true  # Enable Managed Lustre for the cluster
      enable_private_endpoint: false  # Allows access from authorized public IPs
      configure_workload_identity_sa: true
      master_authorized_networks:
      - cidr_block: $(vars.authorized_cidr)  # Allows your machine to run the kubectl command. Required for multi network setup.
        display_name: "kubectl-access-network"
      additional_networks:
        $(concat(
          [{
            network=gke-a3-ultra-net-1.network_name,
            subnetwork=gke-a3-ultra-net-1.subnetwork_name,
            subnetwork_project=vars.project_id,
            nic_type="GVNIC",
            queue_count=null,
            network_ip=null,
            stack_type=null,
            access_config=[{nat_ip=null, public_ptr_domain_name=null, network_tier=null}],
            ipv6_access_config=[],
            alias_ip_range=[]
          }],
          gke-a3-ultra-rdma-net.subnetwork_interfaces_gke
        ))
      # Cluster versions cannot be updated through the toolkit after creation
      # Please manage cluster version from the Google Cloud Console directly
      version_prefix: $(vars.version_prefix)
      release_channel: RAPID
      maintenance_exclusions:
      - name: no-minor-or-node-upgrades-indefinite
        start_time: "2024-12-01T00:00:00Z"
        end_time: "2025-12-22T00:00:00Z"
        exclusion_scope: NO_MINOR_OR_NODE_UPGRADES
      enable_inference_gateway: true
    outputs: [instructions]

  # # --- MANAGED LUSTRE ADDITIONS ---
  # # Private Service Access (PSA) requires the compute.networkAdmin role which is
  # # included in the Owner role, but not Editor.
  # # PSA is required for all Managed Lustre functionality.
  # # https://cloud.google.com/vpc/docs/configure-private-services-access#permissions
  # - id: private_service_access
  #   source: community/modules/network/private-service-access
  #   use: [gke-a3-ultra-net-0]
  #   settings:
  #     prefix_length: 24

  # # Firewall to allow Managed Lustre connection
  # - id: lustre_firewall_rule
  #   source: modules/network/firewall-rules
  #   use: [gke-a3-ultra-net-0]
  #   settings:
  #     ingress_rules:
  #     - name: $(vars.deployment_name)-allow-lustre-traffic
  #       description: Allow Managed Lustre traffic
  #       source_ranges:
  #       - $(private_service_access.cidr_range)
  #       allow:
  #       - protocol: tcp
  #         ports:
  #         - "988"

  # - id: managed-lustre
  #   source: modules/file-system/managed-lustre
  #   use: [gke-a3-ultra-net-0, private_service_access]
  #   settings:
  #     name: $(vars.lustre_instance_id)
  #     local_mount: /lustre
  #     remote_mount: lustrefs
  #     size_gib: $(vars.lustre_size_gib)
  #     per_unit_storage_throughput: $(vars.per_unit_storage_throughput)

  # - id: lustre-pv
  #   source: modules/file-system/gke-persistent-volume
  #   use: [managed-lustre, a3-ultragpu-cluster]
  #   settings:
  #     capacity_gib: $(vars.lustre_size_gib)

  - id: a3-ultragpu-pool
    source: modules/compute/gke-node-pool
    use: [a3-ultragpu-cluster, node_pool_service_account]
    settings:
      machine_type: a3-ultragpu-8g
      auto_upgrade: true
      zones: [$(vars.zone)]
      disk_size_gb: $(vars.a3ultra_node_pool_disk_size_gb)
      static_node_count: $(vars.static_node_count)
      guest_accelerator:
      - type: $(vars.accelerator_type)
        count: 8
      reservation_affinity:
        consume_reservation_type: SPECIFIC_RESERVATION
        specific_reservations:
        - name: $(vars.reservation)
      additional_networks:
        $(concat(
          [{
            network=gke-a3-ultra-net-1.network_name,
            subnetwork=gke-a3-ultra-net-1.subnetwork_name,
            subnetwork_project=vars.project_id,
            nic_type="GVNIC",
            queue_count=null,
            network_ip=null,
            stack_type=null,
            access_config=[{nat_ip=null, public_ptr_domain_name=null, network_tier=null}],
            ipv6_access_config=[],
            alias_ip_range=[]
          }],
          gke-a3-ultra-rdma-net.subnetwork_interfaces_gke
        ))
    outputs: [instructions]

  - id: workload-manager-install
    source: modules/management/kubectl-apply
    use: [a3-ultragpu-cluster]
    settings:
      apply_manifests:
      - source: $(vars.permissions_file_staged_path)
        enable: $(vars.enable_periodic_health_checks)
        template_vars:
          project_id: $(vars.project_id)
          deployment_name: $(vars.deployment_name)
      - source: $(vars.chs_pvc_rendered_path)
        enable: $(vars.enable_periodic_health_checks)
        template_vars:
          pvc_name: $(vars.chs_pvc_claim_name)
          access_mode: ReadWriteOnce
          capacity: 1Gi
          storage_class_name: standard-rwo
      - source: $(vars.chs_cronjob_rendered_path)
        enable: $(vars.enable_periodic_health_checks)
        template_vars:
          project_id: $(vars.project_id)
          deployment_name: $(vars.deployment_name)
          region: $(vars.region)
          machine_type: a3-ultragpu-8g
          gcs_bucket: $(vars.chs_output_bucket_name)
          gcs_pvc: $(vars.chs_pvc_claim_name)
          cronjob_schedule: $(vars.health_check_schedule)
      kueue:
        install: true
        config_path: $(vars.kueue_configuration_path)
        config_template_vars:
          num_gpus: $(a3-ultragpu-pool.static_gpu_count)
          accelerator_type: $(vars.accelerator_type)
      jobset:
        install: true
      gib:
        install: true  # NCCL gIB plugin via DaemonSet initContainer
        path: $(vars.gib_installer_path)
        template_vars:
          version: v1.1.0
          accelerator_count: 8

  - id: job-template
    source: modules/compute/gke-job-template
    use: [a3-ultragpu-pool]
    settings:
      image: nvidia/cuda:11.0.3-runtime-ubuntu20.04
      command:
      - nvidia-smi
      node_count: 2
      name: run-nvidia-smi
      k8s_service_account_name: workload-identity-k8s-sa
    outputs: [instructions]

  # Create a remote mount of training_bucket using
  # mount options optimized for reading training data.
  # Based on source of truth: https://github.com/GoogleCloudPlatform/gcsfuse/blob/d1373b665b7f60e98856d2181f1193396ef16427/samples/gke-csi-yaml/gpu/training-pv.yaml#L15
  # Some of the options might be available only on the latest GKE version; please check that the cluster version meets the required version: https://cloud.google.com/kubernetes-engine/docs/how-to/cloud-storage-fuse-csi-driver-perf
  - id: gcs-training
    source: modules/file-system/pre-existing-network-storage
    settings:
      remote_mount: $(training_bucket.gcs_bucket_name)
      local_mount: /training-data
      fs_type: gcsfuse
      mount_options: >-
        implicit-dirs,
        metadata-cache:ttl-secs:-1,
        metadata-cache:stat-cache-max-size-mb:-1,
        metadata-cache:type-cache-max-size-mb:-1,
        file-cache:max-size-mb:-1,
        file-cache:cache-file-for-range-read:true

  # Create a remote mount of checkpoint_bucket using mount
  # options optimized for writing and reading checkpoint data.
  # Based on source of truth: https://github.com/GoogleCloudPlatform/gcsfuse/blob/d1373b665b7f60e98856d2181f1193396ef16427/samples/gke-csi-yaml/gpu/checkpointing-pv.yaml#L15
  # Some of the options might be available only on the latest GKE version; please check that the cluster version meets the required version: https://cloud.google.com/kubernetes-engine/docs/how-to/cloud-storage-fuse-csi-driver-perf
  - id: gcs-checkpointing
    source: modules/file-system/pre-existing-network-storage
    settings:
      remote_mount: $(checkpoint_bucket.gcs_bucket_name)
      local_mount: /checkpoint-data
      fs_type: gcsfuse
      mount_options: >-
        implicit-dirs,
        metadata-cache:ttl-secs:-1,
        metadata-cache:stat-cache-max-size-mb:-1,
        metadata-cache:type-cache-max-size-mb:-1,
        file-cache:max-size-mb:-1,
        file-cache:cache-file-for-range-read:true,
        file-cache:enable-parallel-downloads:true,
        rename-dir-limit=200000

  # Persistent Volume for training data
  - id: training-pv
    source: modules/file-system/gke-persistent-volume
    use: [gcs-training, a3-ultragpu-cluster]
    settings:
      gcs_bucket_name: $(training_bucket.gcs_bucket_name)
      capacity_gib: 1000000

  # Persistent Volume for checkpoint data
  - id: checkpointing-pv
    source: modules/file-system/gke-persistent-volume
    use: [gcs-checkpointing, a3-ultragpu-cluster]
    settings:
      gcs_bucket_name: $(checkpoint_bucket.gcs_bucket_name)
      capacity_gib: 1000000

  # This is an example job that will install and run an `fio`
  # benchmark against the training and checkpointing buckets.
  - id: fio-bench-job-template
    source: modules/compute/gke-job-template
    use: [checkpointing-pv, training-pv, a3-ultragpu-pool]
    settings:
      security_context:  # to make sure the job has enough access to install the fio packages
      - key: runAsUser
        value: 0
      - key: runAsGroup
        value: 100
      - key: fsGroup
        value: 100
      # By adding an ephemeral volume, this will ensure that the job adds:
      #   nodeSelector:
      #     cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
      # which is the best practice for using local-ssd for ephemeral storage.
      ephemeral_volumes:
      - type: local-ssd
        mount_path: /scratch-data
        size_gb: 1000  # Use 1 out of 12 TB for local scratch

      k8s_service_account_name: workload-identity-k8s-sa
      image: ubuntu:latest

      command:
      - bash
      - -c
      - |
        set -eux
        export DEBIAN_FRONTEND=noninteractive

        # Install fio
        apt update -y && apt install -y fio

        # Use a tag to create a unique path for tests
        TAG=`date +%s`

        # Verify mountpoints
        df -h
        mountpoint /scratch-data
        mountpoint /checkpoint-data
        mountpoint /training-data

        # Create temporary directory for fio benchmarks
        mkdir -p /{scratch,training,checkpoint}-data/fio-benchmarks-${TAG}

        # The following will take roughly 10 minutes to complete

        # Perform scratch data write performance test
        fio --ioengine=libaio --filesize=10G --ramp_time=2s --runtime=1m \
          --numjobs=32 --create_serialize=0 --direct=1 --verify=0 \
          --randrepeat=0 --group_reporting --directory=/scratch-data/fio-benchmarks-${TAG} \
          --name=scratch --blocksize=100m --iodepth=64 --readwrite=write

        # Perform training data reading performance test
        fio --ioengine=libaio --filesize=1G --ramp_time=2s --runtime=1m \
          --numjobs=32 --create_serialize=0 --direct=1 --verify=0 \
          --randrepeat=0 --group_reporting --directory=/training-data/fio-benchmarks-${TAG} \
          --name=training --blocksize=1m --iodepth=64 --readwrite=randread

        # Perform checkpoint data writing performance test
        fio --ioengine=libaio --filesize=10G --ramp_time=2s --runtime=1m \
          --numjobs=32 --create_serialize=0 --direct=1 --verify=0 \
          --randrepeat=0 --group_reporting --directory=/checkpoint-data/fio-benchmarks-${TAG} \
          --name=checkpoint --blocksize=100m --iodepth=64 --readwrite=write

        # Perform checkpoint data reading performance test
        fio --ioengine=libaio --filesize=10G --ramp_time=2s --runtime=1m \
          --numjobs=32 --create_serialize=0 --direct=1 --verify=0 \
          --randrepeat=0 --group_reporting --directory=/checkpoint-data/fio-benchmarks-${TAG} \
          --name=checkpoint --blocksize=100m --iodepth=64 --readwrite=read

        # Clean up temporary directories for fio benchmarks
        rm -rf /{scratch,training,checkpoint}-data/fio-benchmarks-${TAG}

    outputs: [instructions]

Severity: high

This blueprint is significantly more complex than the other inference gateway examples (megagpu and highgpu). It includes numerous modules related to training workloads, such as training_bucket, checkpoint_bucket, fio-bench-job-template, and various persistent volume configurations. The purpose of an "inference-gateway" example should be to provide a minimal, focused configuration for deploying inference services. This complexity can be confusing for users and deviates from the stated goal of replicating the existing pattern. Please simplify this blueprint by removing the modules that are not essential for setting up the GKE Inference Gateway, aligning it with the structure of gke-a3-megagpu-inference-gateway.yaml.
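
For reference, a trimmed module list in that spirit might look like the sketch below (module ids are taken from the files in this PR; which modules are strictly required is ultimately the authors' call):

    deployment_groups:
    - group: primary
      modules:
      - id: gke-a3-ultra-net-0        # VPC, including the REGIONAL_MANAGED_PROXY proxy subnet
      - id: gke-a3-ultra-net-1        # additional GVNIC network
      - id: gke-a3-ultra-rdma-net     # RDMA networks for the GPU NICs
      - id: workload_service_account
      - id: node_pool_service_account
      - id: a3-ultragpu-cluster       # gke-cluster with enable_inference_gateway: true
      - id: a3-ultragpu-pool          # a3-ultragpu-8g node pool
      - id: workload-manager-install  # Kueue / JobSet / NCCL gIB plugin install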

Comment on lines +73 to +78
        subnet_ip: 10.128.0.0/20
      - subnet_name: $(vars.deployment_name)-proxy-subnet
        subnet_region: $(vars.region)
        subnet_ip: "10.129.0.0/24"
        purpose: "REGIONAL_MANAGED_PROXY"
        role: "ACTIVE"

Severity: medium

There's an inconsistency in how subnet_ip values are defined. On line 73, the IP is unquoted and has a trailing space, while on line 76, it's quoted. For consistency and correctness, it's best to use a single style. I recommend removing the quotes and the trailing space, as they are not required for these string values.

        subnet_ip: 10.128.0.0/20
      - subnet_name: $(vars.deployment_name)-proxy-subnet
        subnet_region: $(vars.region)
        subnet_ip: 10.129.0.0/24
        purpose: "REGIONAL_MANAGED_PROXY"
        role: "ACTIVE"

Comment on lines +23 to +28
  project_id: PROJECT_ID
  region: COMPUTE_REGION
  zone: COMPUTE_ZONE
  # Cidr block containing the IP of the machine calling terraform.
  # The following line must be updated for this example to work.
  authorized_cidr: IP_ADDRESS/SUFFIX

Severity: medium

The placeholder values and comments in this deployment file are inconsistent with the newly added gke-a3-megagpu-inference-gateway-deployment.yaml and other examples in the repository. For instance, this file uses COMPUTE_REGION and IP_ADDRESS/SUFFIX, whereas the megagpu counterpart provides a default region with a comment and uses <IP_ADDRESS>/<SUFFIX>. Aligning these makes the examples more consistent and easier for users to understand and modify.

  project_id: PROJECT_ID

  # The GCP Region used for this deployment.
  region: us-central1

  # The GCP Zone used for this deployment.
  zone: us-central1-c

  # Cidr block containing the IP of the machine calling terraform.
  # The following line must be updated for this example to work.
  authorized_cidr: <IP_ADDRESS>/<SUFFIX>

Comment on lines +18 to +32
  project_id: # add this
  deployment_name: # add this
  region: # add this
  zone: # add this
  # Cidr block containing the IP of the machine calling terraform.
  # The following line must be updated for this example to work.
  authorized_cidr: # add this
  # The name of the compute engine reservation in the form of
  # <reservation-name>
  # To target a BLOCK_NAME, the name of the extended reservation
  # can be inputted as <reservation-name>/reservationBlocks/<reservation-block-name>
  reservation: # add this
  kueue_configuration_path: $(ghpc_stage("./kueue-configuration.yaml.tftpl"))
  gib_installer_path: $(ghpc_stage("./nccl-installer.yaml.tftpl"))
  static_node_count: # add this

Severity: medium

The placeholder comments here (e.g., # add this) and lack of default values are inconsistent with the style used in the gke-a3-megagpu-inference-gateway.yaml blueprint (e.g., ## Set GCP Project ID Here ## and default values provided). To maintain consistency across the example blueprints, please use the same placeholder style and provide sensible defaults where applicable.

  # The following variables should be over-written in the deployment.yaml file.
  # Your GCP Project ID
  project_id: ## Set GCP Project ID Here ##

  # This should be unique across all of your Cluster
  # Toolkit Deployments.
  deployment_name: gke-a3-ultra-ig

  # The GCP Region used for this deployment.
  region: us-central1

  # The GCP Zone used for this deployment.
  zone: us-central1-c

  # Cidr block containing the IP of the machine calling terraform.
  # The following line must be updated for this example to work.
  authorized_cidr:

  # The name of the compute engine reservation in the form of
  # <reservation-name>
  # To target a BLOCK_NAME, the name of the extended reservation
  # can be inputted as <reservation-name>/reservationBlocks/<reservation-block-name>
  reservation:
  kueue_configuration_path: $(ghpc_stage("./kueue-configuration.yaml.tftpl"))
  gib_installer_path: $(ghpc_stage("./nccl-installer.yaml.tftpl"))
  static_node_count: 2

Comment on lines +74 to +79
        subnet_ip: 192.168.0.0/18
      - subnet_name: $(vars.deployment_name)-proxy-subnet
        subnet_region: $(vars.region)
        subnet_ip: "192.168.192.0/24"
        purpose: "REGIONAL_MANAGED_PROXY"
        role: "ACTIVE"

Severity: medium

There's an inconsistency in how subnet_ip values are quoted. On line 74, the IP is unquoted, while on line 77, it's quoted. For consistency within this file and across other YAML blueprints, it's best to use a single style. I recommend removing the quotes for consistency, as they are not required for these string values.

        subnet_ip: 192.168.0.0/18
      - subnet_name: $(vars.deployment_name)-proxy-subnet
        subnet_region: $(vars.region)
        subnet_ip: 192.168.192.0/24
        purpose: "REGIONAL_MANAGED_PROXY"
        role: "ACTIVE"
