Add support for OD-CDS concurrent with SoTW CDS #42551

@ravi-dd

Currently, a SoTW CDS update deletes every cluster that was delivered via OD-CDS. This removal is unnecessary, puts extra load on management servers (removed clusters have to be requested again), and has caused hard-to-debug cluster_not_found responses after SoTW updates.

There have been similar concerns and solutions for related SoTW CDS interactions:

Example

Consider the following setup with both static clusters & ODCDS:

# bootstrap file referencing static clusters
dynamic_resources:
  cds_config:
    path_config_source:
      path: /envoy/configs/cds.yaml
    resource_api_version: V3
  lds_config:
    path_config_source:
      path: /envoy/configs/lds.yaml
    resource_api_version: V3

# cds.yaml with static clusters A, B, C
resources:
  - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
    name: A
    ...

# lds.yaml route with OnDemand CDS for clusters D,E,F
...
  - match: ...
    route:
      weighted_clusters:
        clusters:
          - name: D
            weight: 100
            typed_per_filter_config:
              envoy.filters.http.on_demand:
                "@type": type.googleapis.com/envoy.extensions.filters.http.on_demand.v3.PerRouteConfig
                odcds: ...
      timeout: 45s

  1. Envoy starts and loads clusters A, B, C from cds.yaml.
  2. Over time, Envoy discovers clusters D, E, F via ODCDS.
  3. A SoTW CDS update occurs. This can happen by replacing the cds.yaml file.
  4. The SoTW CDS update removes the ODCDS clusters D, E, F, even though cds.yaml never contained them.
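
The sequence above can be modeled with a toy cluster registry. This is a minimal sketch, not Envoy code: `ToyClusterManager`, `add_on_demand`, and `apply_sotw_update` are hypothetical names, and the removal rule (drop every dynamically added cluster that is missing from the snapshot) is assumed from the behavior described in this issue.

```python
class ToyClusterManager:
    """Hypothetical stand-in for the cluster manager; not Envoy's API."""

    def __init__(self):
        self.clusters = set()  # names of all dynamically added clusters

    def add_on_demand(self, name):
        # An ODCDS response delivers a single cluster.
        self.clusters.add(name)

    def apply_sotw_update(self, snapshot):
        # Assumed current behavior: the snapshot is treated as the complete
        # state of the world, so anything missing from it is removed,
        # regardless of which API delivered it.
        removed = self.clusters - set(snapshot)
        self.clusters = set(snapshot)
        return removed


cm = ToyClusterManager()
cm.apply_sotw_update(["A", "B", "C"])            # step 1: load cds.yaml
for name in ("D", "E", "F"):                     # step 2: ODCDS discovery
    cm.add_on_demand(name)
removed = cm.apply_sotw_update(["A", "B", "C"])  # step 3: SoTW CDS update
print(sorted(removed))                           # step 4: ['D', 'E', 'F']
```

Note that the second snapshot is identical to the first, yet D, E, F are still dropped, because the update has no notion of which source delivered a cluster.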

Proposal

I have a few independent, high-level ideas to make this work:

  1. Extend "xds-federation: adding support for OD-CDS over xDS-TP concurrent with CDS SotW" (#41117) so that SoTW CDS can remove only its own clusters.
  2. Have ODCDS add clusters the way DFP does, piping avoid_cds_removal through to the cds_api_helper.
  3. Add explicit owner tracking in the Cluster Manager. SoTW CDS, ODCDS, DFP, etc. each use a unique identifier when interacting with the Cluster Manager, and the Cluster Manager rejects any request that targets a cluster with a mismatched owner.
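
To make option 3 more concrete, here is a rough sketch of owner tracking. Everything in it is hypothetical (the `OwnedClusterManager` class, the `OwnerMismatch` exception, and the owner strings `"sotw-cds"` and `"odcds"`); it only illustrates the idea of rejecting requests from a mismatched owner and does not reflect the actual Cluster Manager interface.

```python
class OwnerMismatch(Exception):
    """Raised when a source touches a cluster owned by another source."""


class OwnedClusterManager:
    """Hypothetical cluster registry that records which source owns a cluster."""

    def __init__(self):
        self._owners = {}  # cluster name -> owner identifier

    def add_or_update(self, owner, name):
        # The first source to add a cluster becomes its owner.
        current = self._owners.setdefault(name, owner)
        if current != owner:
            raise OwnerMismatch(f"{name} is owned by {current}, not {owner}")

    def remove(self, owner, name):
        current = self._owners.get(name)
        if current is None:
            return False  # nothing to remove
        if current != owner:
            raise OwnerMismatch(f"{name} is owned by {current}, not {owner}")
        del self._owners[name]
        return True

    def owner_of(self, name):
        return self._owners.get(name)


cm = OwnedClusterManager()
cm.add_or_update("sotw-cds", "A")
cm.add_or_update("odcds", "D")

try:
    cm.remove("sotw-cds", "D")  # SoTW CDS tries to remove an ODCDS cluster
    rejected = False
except OwnerMismatch:
    rejected = True

print(rejected, cm.owner_of("D"))  # True odcds
```

Whether a mismatch should raise, be logged, or be silently ignored is a design choice this sketch does not settle.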

I'm in favor of option 1 since it's a straightforward change and implements the intended behavior of SoTW CDS.
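
The behavior option 1 aims for can be sketched as follows: SoTW CDS remembers the names it delivered and diffs each new snapshot only against that set. This is an illustration under assumptions, not the design of #41117; `ScopedSotwCds` and its fields are hypothetical names.

```python
class ScopedSotwCds:
    """Hypothetical SoTW CDS handler that removes only clusters it added."""

    def __init__(self, all_clusters):
        self.all_clusters = all_clusters  # set shared with other sources
        self.owned = set()                # names delivered by SoTW CDS

    def apply(self, snapshot):
        snapshot = set(snapshot)
        # Only clusters previously delivered by SoTW CDS are removal candidates.
        removed = self.owned - snapshot
        self.all_clusters.difference_update(removed)
        self.all_clusters.update(snapshot)
        # Name collisions with other sources are not handled in this sketch.
        self.owned = snapshot
        return removed


clusters = set()
cds = ScopedSotwCds(clusters)
cds.apply(["A", "B", "C"])         # initial cds.yaml
clusters.update({"D", "E", "F"})   # clusters delivered later via ODCDS
removed = cds.apply(["A", "B"])    # SoTW update that drops C

print(sorted(removed))   # ['C']
print(sorted(clusters))  # ['A', 'B', 'D', 'E', 'F']
```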

Metadata

Labels

area/xds, enhancement