diff --git a/docs/backup-resource-options.md b/docs/backup-resource-options.md index 7d7c8fdb..4d277346 100644 --- a/docs/backup-resource-options.md +++ b/docs/backup-resource-options.md @@ -39,6 +39,14 @@ Specifies the name of the `pgBackRest` repository where to save a backup. It mus | ----------- | ---------- | | :material-code-string: string | `repo1` | +### `method` + +Specifies the method to use for the backup. When undefined, the Operator uses `pgBackRest` by default. The supported value is `VolumeSnapshot`. See [PVC snapshot support](backups-pvc-snapshots.md) to learn more. + +| Value type | Example | +| ----------- | ---------- | +| :material-code-string: string | `VolumeSnapshot` | + ### `options` You can customize the backup by specifying different [command line options supported by pgBackRest :octicons-external-link-16:](https://pgbackrest.org/configuration.html). diff --git a/docs/backups-pvc-setup.md b/docs/backups-pvc-setup.md new file mode 100644 index 00000000..11091a15 --- /dev/null +++ b/docs/backups-pvc-setup.md @@ -0,0 +1,368 @@ +# Configure and use PVC snapshots

This document provides step-by-step instructions for configuring and using Persistent Volume Claim (PVC) snapshots with Percona Operator for PostgreSQL on Kubernetes.

For a high-level explanation of PVC snapshots, refer to the [PVC snapshot support](backups-pvc-snapshots.md#overview) chapter.

## Prerequisites

To use PVC snapshots, ensure the following prerequisites are met:

1. Your Kubernetes cluster must have a CSI driver that supports Volume Snapshots.
   For example, Google Kubernetes Engine (GKE) with `pd.csi.storage.gke.io`, or Amazon EKS with `ebs.csi.aws.com`.

2. Your Kubernetes cluster must have VolumeSnapshot CRDs installed. Most managed Kubernetes providers include these by default. Verify by running:

    ```bash
    kubectl get crd volumesnapshots.snapshot.storage.k8s.io
    ```

3. 
At least one `VolumeSnapshotClass` must exist and be compatible with the storage class used by your PostgreSQL data volumes. Check it with:

    ```bash
    kubectl get volumesnapshotclasses
    ```

    If you don't have one, refer to the [Add a VolumeSnapshotClass](#add-a-volumesnapshotclass) section.

4. You must enable the `VolumeSnapshots` feature gate for the **Percona Operator for PostgreSQL** deployment. Refer to the [Enable the feature gate](#enable-the-feature-gate) section for details.

## Before you start

1. Check the [prerequisites](#prerequisites) and [limitations](backups-pvc-snapshots.md#limitations).
2. Clone the Operator repository to be able to edit manifests:

    ```bash
    git clone -b v{{release}} https://github.com/percona/percona-postgresql-operator
    ```

3. Export the namespace where you run your cluster as an environment variable:

    ```bash
    export NAMESPACE=
    ```

## Configuration

### Enable the feature gate

If you have the Operator Deployment up and running, you can edit the `deploy/operator.yaml` manifest. If you deploy the Operator from scratch, edit the `deploy/bundle.yaml` manifest.

1. Edit the `deploy/operator.yaml` or `deploy/bundle.yaml` and set the `PGO_FEATURE_GATES` environment variable for the Operator Deployment to `"VolumeSnapshots=true"`:

    ```yaml
    spec:
      containers:
      - name: percona-postgresql-operator
        env:
        - name: PGO_FEATURE_GATES
          value: "VolumeSnapshots=true"
    ```

2. Apply the configuration:

    ```bash
    kubectl apply -f deploy/operator.yaml -n $NAMESPACE
    ```

    or

    ```bash
    kubectl apply --server-side -f deploy/bundle.yaml -n $NAMESPACE
    ```

### Add a VolumeSnapshotClass

If your Kubernetes cluster doesn't have a `VolumeSnapshotClass` that matches your CSI driver, create one.

1. 
Create a VolumeSnapshotClass configuration file with the following contents:

    ```yaml title="volume-snapshot-class.yaml"
    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshotClass
    metadata:
      name: gke-snapshot-class
    driver: pd.csi.storage.gke.io
    deletionPolicy: Delete
    ```

2. Create the VolumeSnapshotClass resource:

    ```bash
    kubectl apply -f volume-snapshot-class.yaml
    ```

### Configure PVC snapshots in your cluster

You must reference the `VolumeSnapshotClass` in your cluster Custom Resource.

1. Check the name of the `VolumeSnapshotClass` that works with your storage. You can list available classes with:

    ```bash
    kubectl get volumesnapshotclasses
    ```

2. Edit the `deploy/cr.yaml` Custom Resource and add the `volumeSnapshots` subsection under `backups`. Specify the name of the `VolumeSnapshotClass` in the `className` key:

    ```yaml
    spec:
      backups:
        volumeSnapshots:
          className: 
    ```

3. Apply the configuration to update the cluster:

    ```bash
    kubectl apply -f deploy/cr.yaml -n $NAMESPACE
    ```

Once configured, snapshots are created automatically when you [make a manual on-demand backup](#make-an-on-demand-backup-from-a-pvc-snapshot) or when [a scheduled backup runs](#make-a-scheduled-snapshot-based-backup).

## Use PVC snapshots

With PVC snapshots configured, you can use them for backups and restores.

### Make an on-demand backup from a PVC snapshot

1. Configure the `PerconaPGBackup` object. Edit the `deploy/backup.yaml` manifest and specify the following keys:

    * `pgCluster` - the name of your cluster. Check it with the `kubectl get pg -n $NAMESPACE` command.

    * `method` - the backup method. Specify `volumeSnapshot`.

    Here's the example configuration:

    ```yaml
    apiVersion: pgv2.percona.com/v2
    kind: PerconaPGBackup
    metadata:
      name: my-snapshot-backup
    spec:
      pgCluster: cluster1
      method: volumeSnapshot
    ```

2. 
Apply the configuration to start a backup:

    ```bash
    kubectl apply -f deploy/backup.yaml -n $NAMESPACE
    ```

3. Check the backup status:

    ```bash
    kubectl get pg-backup my-snapshot-backup -n $NAMESPACE
    ```

    ??? example "Sample output"

        ```text
        NAME                 CLUSTER    REPO    DESTINATION   STATUS      TYPE             COMPLETED   AGE
        my-snapshot-backup   cluster1   repo1                 Succeeded   volumeSnapshot   3m38s       3m53s
        ```

### Make a scheduled snapshot-based backup

1. Configure the backup schedule in your cluster Custom Resource. Edit the `deploy/cr.yaml` manifest. In the `volumeSnapshots` subsection under `backups`, set the `schedule` key to a schedule in the Cron format for the snapshots to be made automatically. Your updated configuration should look like this:

    ```yaml
    apiVersion: pgv2.percona.com/v2
    kind: PerconaPGCluster
    metadata:
      name: my-cluster
    spec:
      backups:
        volumeSnapshots:
          className: my-snapshot-class
          mode: offline
          schedule: "0 3 * * *" # Every day at 3:00 AM
    ```

2. Apply the configuration to update the cluster:

    ```bash
    kubectl apply -f deploy/cr.yaml -n $NAMESPACE
    ```

### In-place restore from a PVC snapshot

An in-place restore is a restore to the same cluster using the `PerconaPGRestore` custom resource. You can make a full in-place restore or a point-in-time restore.

When you create the `PerconaPGRestore` object, the Operator performs the following steps:

1. Suspends all instances in the cluster.
2. Deletes all existing PVCs in the cluster. This removes all existing data, WAL, and tablespaces.
3. Creates new PVCs with the snapshot serving as the data source. This restores the data, WAL, and tablespaces from that snapshot.
4. Spins up a job to configure the restored PVCs to be used by the cluster.
5. Resumes all instances in the cluster. The cluster starts with the data from the snapshot.

!!! important

    An in-place restore overwrites the current data and is destructive. 
Any data written after the backup was made is lost. Therefore, consider restoring to a new cluster instead. This way you can evaluate the data before switching to the new cluster without risking the data in the existing one.

Follow the steps below to make a full in-place restore from a PVC snapshot.

1. Configure the `PerconaPGRestore` object. Edit the `deploy/restore.yaml` manifest and specify the following keys:

    * `pgCluster` - the name of your cluster. Check it with the `kubectl get pg -n $NAMESPACE` command.

    * `volumeSnapshotBackupName` - the name of the PVC snapshot backup. Check it with the `kubectl get pg-backup -n $NAMESPACE` command.

    Here's the example configuration:

    ```yaml
    apiVersion: pgv2.percona.com/v2
    kind: PerconaPGRestore
    metadata:
      name: restore1
    spec:
      pgCluster: cluster1
      volumeSnapshotBackupName: my-snapshot-backup
    ```

2. Apply the configuration to start a restore:

    ```bash
    kubectl apply -f deploy/restore.yaml -n $NAMESPACE
    ```

3. Check the restore status:

    ```bash
    kubectl get pg-restore restore1 -n $NAMESPACE
    ```

    ??? example "Sample output"

        ```text
        NAME       CLUSTER    STATUS      COMPLETED              AGE
        restore1   cluster1   Succeeded   2026-02-16T11:00:00Z   2m20s
        ```

### In-place restore with point-in-time recovery

You can make a point-in-time restore from a PVC snapshot and replay WAL files from a WAL archive made with pgBackRest. For this scenario, your cluster must meet the following requirements:

1. It must have a `pgBackRest` configuration, including the backup storage and at least one repository. See the [Configure backup storage](backups-storage.md) section for configuration steps.
2. The repository must contain at least one WAL archive.

The workflow for a point-in-time restore is similar to [a full in-place restore](#in-place-restore-from-a-pvc-snapshot). 
After the Operator restores the data from the snapshot, it replays the WAL files from the WAL archive to bring the cluster to the target time.

!!! important

    An in-place restore overwrites the current data and is destructive. Any data written after the backup was made is lost. Therefore, consider restoring to a new cluster instead. This way you can evaluate the data before switching to the new cluster without risking the data in the existing one.

Follow the steps below to make a point-in-time restore from a PVC snapshot.

1. Check the repo name and the target time for the restore.

    * List the backups:

        ```bash
        kubectl get pg-backup -n $NAMESPACE
        ```

    * For a `pgBackRest` backup, run the following command to get the target time:

        ```bash
        kubectl get pg-backup <backup-name> -n $NAMESPACE -o jsonpath='{.status.latestRestorableTime}'
        ```

2. Configure the `PerconaPGRestore` object. Edit the `deploy/restore.yaml` manifest and specify the following keys:

    * `pgCluster` - the name of your cluster. Check it with the `kubectl get pg -n $NAMESPACE` command.

    * `volumeSnapshotBackupName` - the name of the PVC snapshot backup.

    * `repoName` - the name of the pgBackRest repository that contains the WAL archives.

    * `options` - the restore options. Specify the following:

        * `--type=time` - set to `time` to make a point-in-time restore.
        * `--target` - set the target time for the restore.

    Here's the example configuration:

    ```yaml
    apiVersion: pgv2.percona.com/v2
    kind: PerconaPGRestore
    metadata:
      name: pitr-restore
    spec:
      pgCluster: cluster1
      volumeSnapshotBackupName: my-snapshot-backup
      repoName: repo1
      options:
      - --type=time
      - --target="2026-02-16T11:00:00Z"
    ```

3. Apply the configuration to start a restore:

    ```bash
    kubectl apply -f deploy/restore.yaml -n $NAMESPACE
    ```

4. 
Check the restore status:

    ```bash
    kubectl get pg-restore pitr-restore -n $NAMESPACE
    ```

### Create a new cluster from a PVC snapshot

You can create a new cluster from a PVC snapshot. This is useful when you want to restore the data to a new cluster without overwriting the data in the existing one.

To create a new cluster from a PVC snapshot, configure the `PerconaPGCluster` object and specify the existing PVC snapshot as the `dataSource`. You also need to configure the `instances` and `backups` sections to set up the new cluster.

For more information about the `dataSource` options, see the [Understand the `dataSource` options](backups-clone.md#understand-the-datasource-options) section. Also check the [Custom Resource reference](operator.md#datasource-subsection) for all available options.

Follow the steps below to create a new cluster from a PVC snapshot.

1. Create the namespace where the new cluster will be deployed and export it as an environment variable:

    ```bash
    kubectl create namespace
    export NEW_NAMESPACE=
    ```

2. Configure the `PerconaPGCluster` object. Edit the `deploy/cr.yaml` manifest and specify the following keys:

    * `dataSource` - the reference to the PVC snapshot. Check the snapshot name with the `kubectl get pg-backup my-snapshot-backup -o jsonpath='{.status.snapshot.dataVolumeSnapshotRef}'` command on the **source** cluster.

    * `instances` - the instances configuration for the new cluster.

    * `backups` - the backups configuration for the new cluster.

    Here's the example configuration:

    ```yaml
    apiVersion: pgv2.percona.com/v2
    kind: PerconaPGCluster
    metadata:
      name: new-cluster
    spec:
      instances:
      - name: instance1
        replicas: 3
        dataVolumeClaimSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          dataSource:
            apiGroup: snapshot.storage.k8s.io
            kind: VolumeSnapshot
            name: 
    ```

3. 
Apply the configuration to create the new cluster:

    ```bash
    kubectl apply -f deploy/cr.yaml -n $NEW_NAMESPACE
    ```

The new cluster will be provisioned shortly, using the volume snapshot of the source cluster as the data source.

diff --git a/docs/backups-pvc-snapshots.md b/docs/backups-pvc-snapshots.md new file mode 100644 index 00000000..13792f26 --- /dev/null +++ b/docs/backups-pvc-snapshots.md @@ -0,0 +1,88 @@ +# PVC snapshot support

!!! note ""

    This feature is in the tech preview stage. The API and behavior may change in future releases.

This document provides an overview of PVC snapshots. If you are familiar with the concept and want to try it out, jump to the [Configure and use PVC snapshots](backups-pvc-setup.md) tutorial.

By reading this document you will learn the following:

* [What is a PVC snapshot](#overview)
* [How it works](#workflow)
* [Why you need it](#benefits)
* [Requirements](#requirements)
* [Current limitations](#limitations)

## Overview

A PVC snapshot is a point-in-time copy of a [Persistent Volume Claim :octicons-link-external-16:](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) created by the storage provider. It captures the volume contents at a specific moment without copying data block by block.

PVC snapshots are much faster than streaming data to cloud storage or a backup volume. This is especially beneficial for large datasets. The Operator uses the [Kubernetes VolumeSnapshot API :octicons-link-external-16:](https://kubernetes.io/docs/concepts/storage/volume-snapshots/) to create PVC snapshots at the storage level. When used with pgBackRest WAL archiving, PVC snapshots ensure data consistency and provide support for point-in-time recovery.


## Workflow

The Operator currently supports only cold backups (the `offline` mode).
A cold backup, also known as an offline backup, is a physical base backup taken while the PostgreSQL instance is shut down. 
This ensures consistency: the snapshot captures the entire database exactly as it exists at the moment of shutdown.

During cold backups, the Operator:

1. Selects a **replica** instance as the snapshot target.
2. Issues a PostgreSQL `CHECKPOINT` on that replica (if checkpoint is enabled).
3. **Suspends** the replica StatefulSet (scales it to zero).
4. Creates Kubernetes `VolumeSnapshot` objects for the data PVC, WAL PVC (if separate), and any tablespace PVCs.
5. Waits for all snapshots to become `ReadyToUse`.
6. **Resumes** the replica StatefulSet.

This approach ensures a crash-consistent snapshot while minimizing the impact on the primary. Only a replica is taken offline, so the cluster continues serving read/write traffic on the primary during the snapshot.

## Why use PVC snapshots { #benefits }

PVC snapshots can speed up backups and restores, which is especially beneficial for large data sets. With this feature, you get:

* **Much faster backups** – Snapshot creation typically takes seconds to minutes, regardless of database size. Traditional full backups scale with data volume and can take hours for large datasets.
* **Much faster restores** – Restoring from a snapshot is significantly faster than restoring from cloud storage. Both in-place restores and restores to a new cluster are supported.
* **Lower resource usage** – Snapshots avoid the CPU and network overhead of streaming data to remote storage.
* **PITR support** – When used with pgBackRest, snapshots integrate with point-in-time recovery for flexible restore targets.

## Requirements

Before enabling PVC snapshots, ensure the following:

1. Your Kubernetes cluster must have a CSI driver that supports the VolumeSnapshot API. For example, `pd.csi.storage.gke.io` on GKE or `ebs.csi.aws.com` on EKS.

2. Your Kubernetes cluster must have the VolumeSnapshot CRDs installed. 
Verify that they are installed with this command:

    ```bash
    kubectl get crd | grep snapshot.storage.k8s.io
    ```

    !!! example "Expected output"

        `volumesnapshotclasses.snapshot.storage.k8s.io`
        `volumesnapshotcontents.snapshot.storage.k8s.io`
        `volumesnapshots.snapshot.storage.k8s.io`

3. At least one `VolumeSnapshotClass` must exist and be compatible with the storage class used by your PostgreSQL data volumes. Check it with:

    ```bash
    kubectl get volumesnapshotclasses
    ```

    See how to add it in the [Add a VolumeSnapshotClass](backups-pvc-setup.md#add-a-volumesnapshotclass) section.

4. You must explicitly enable the `VolumeSnapshots` feature gate in the Operator Deployment. See [Enable the feature gate](backups-pvc-setup.md#enable-the-feature-gate).

## Limitations

* **Currently only offline mode** – Only offline snapshots are supported; the Operator must stop a replica pod to take a consistent snapshot of the database.
* **At least one replica required** – Your cluster must have at least one replica pod besides the primary. The Operator takes the snapshot from a replica; clusters without replicas cannot use this feature.
* **CSI driver support required** – Your Kubernetes cluster's storage provisioner must support the Volume Snapshot API.
* **One snapshot backup at a time** – Concurrent snapshot backups on the same cluster are not supported.

[Configure PVC snapshots](backups-pvc-setup.md){.md-button}

diff --git a/docs/backups.md b/docs/backups.md index 59fe4a4d..45c4fafd 100644 --- a/docs/backups.md +++ b/docs/backups.md @@ -11,10 +11,16 @@ file. The Operator makes them automatically according to the schedule. ## What you need to know -### Backup repositories +### Backup methods + +By default, the Operator uses the open source [pgBackRest :octicons-link-external-16:](https://pgbackrest.org/) backup
+and restore utility to make backups. 
+
+Starting with version 2.9.0, the Operator also supports [PVC snapshots](backups-pvc-snapshots.md): storage-level copies of your data volumes. PVC snapshots offer much faster backups and restores. When used with `pgBackRest` WAL archiving, they maintain data consistency and enable point-in-time recovery of your database.

-To make backups, the Operator uses the open source [pgBackRest :octicons-link-external-16:](https://pgbackrest.org/) backup
-and restore utility.

+This feature is in the tech preview stage. You must explicitly enable it for the Operator deployment.
+
+### Backup repositories

When the Operator creates a new PostgreSQL cluster, it also creates a special *pgBackRest repository* to facilitate the usage of the pgBackRest features. You can notice an additional `repo-host` Pod after the cluster diff --git a/docs/env-var-operator.md b/docs/env-var-operator.md index ff458b6b..115e0197 100644 --- a/docs/env-var-operator.md +++ b/docs/env-var-operator.md @@ -71,6 +71,9 @@ Following feature gates are present as of Operator version 2.8.1: 1. `AutoGrowVolumes=true` - Enables automatic PVC resize when the storage usage reaches a threshold. The Operator can trigger volume expansion for database data volumes. To learn more, refer to the [Scale your cluster](scaling.md#enable-automatic-storage-resize) chapter. +2. `VolumeSnapshots=true` - Enables [PVC snapshot support](backups-pvc-snapshots.md) for backups and restores. When enabled and configured in the cluster Custom Resource, the Operator creates volume snapshots in coordination with pgBackRest backups, enabling much faster restores for large datasets. 
+
+
**Example configuration:**

```yaml
spec:
  containers:
  - name: percona-postgresql-operator
    env:
    - name: PGO_FEATURE_GATES
      value: "AutoGrowVolumes=true"
```

+To enable multiple features, list them separated by a comma:
+
+```yaml
+- name: PGO_FEATURE_GATES
+  value: "AutoGrowVolumes=true,VolumeSnapshots=true"
+```
+
### `LOG_STRUCTURED`

Controls whether the Operator outputs logs in a structured JSON format instead of the plain text. Structured logging is useful for log aggregation tools.

diff --git a/docs/operator.md b/docs/operator.md index 97520de0..344a16b2 100755 --- a/docs/operator.md +++ b/docs/operator.md @@ -563,6 +563,30 @@ The [Kubernetes labels :octicons-link-external-16:](https://kubernetes.io/docs/c | ---------- | ------- | | :material-label-outline: label | `test-label: value` |

+### `dataSource.apiGroup`
+
+The API group of the data source resource; for PVC snapshots it is `snapshot.storage.k8s.io`. It is required for bootstrapping a new cluster from a PVC snapshot.
+
+| Value type | Example |
+| ---------- | ------- |
+| :material-code-string: string | `snapshot.storage.k8s.io` |
+
+### `dataSource.kind`
+
+Specifies the kind of resource that serves as the data source.
+
+| Value type | Example |
+| ---------- | ------- |
+| :material-code-string: string | `VolumeSnapshot` |
+
+### `dataSource.name`
+
+Specifies the name of the PVC snapshot that will be used as the data source for the restore.
+
+| Value type | Example |
+| ---------- | ------- |
+| :material-code-string: string | `my-snapshot-backup-data` |
+
### `image`

The PostgreSQL Docker image to use.

@@ -1077,6 +1101,30 @@ Enables or disables [tracking the latest restorable time](backups-restore-inplac | ---------- | ------- | | :material-toggle-switch-outline: boolean | `true` |

+### `backups.volumeSnapshots.className`
+
+Name of the [VolumeSnapshotClass :octicons-link-external-16:](https://kubernetes.io/docs/concepts/storage/volume-snapshots/) to use when creating [PVC snapshots](backups-pvc-snapshots.md). When set, the Operator creates a volume snapshot in coordination with each backup. 
Snapshots enable much faster restores when provisioning new clusters. Requires the `VolumeSnapshots=true` feature gate.

| Value type | Example |
| ---------- | ------- |
| :material-code-string: string | `csi-gce-pd-snapshot-class` |

+### `backups.volumeSnapshots.mode`
+
+Specifies the type of PVC snapshot-based backups. Currently, only `offline` (cold) backups are supported.
+
| Value type | Example |
| ---------- | ------- |
| :material-code-string: string | `offline` |

+### `backups.volumeSnapshots.schedule`
+
+Specifies the schedule in Cron format to run PVC snapshot-based backups automatically.
+
| Value type | Example |
| ---------- | ------- |
| :material-code-string: string | `"0 3 * * *"` |

### `backups.pgbackrest.metadata.labels`

Set [labels :octicons-link-external-16:](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) for pgBackRest Pods.

diff --git a/docs/restore-options.md b/docs/restore-options.md index f6b108ff..c199aab7 100644 --- a/docs/restore-options.md +++ b/docs/restore-options.md @@ -39,6 +39,14 @@ Specifies the name of one of the 4 pgBackRest repositories, already configured i | ----------- | ---------- | | :material-code-string: string | `repo1` |

+### `volumeSnapshotBackupName`
+
+Specifies the name of a PVC snapshot-based backup to restore from. See [Configure and use PVC snapshots](backups-pvc-setup.md) to learn more.
+
+| Value type | Example |
+| ----------- | ---------- |
+| :material-code-string: string | `backup1` |
+
### `options`

Specify the [command line options supported by `pgBackRest` :octicons-external-link-16:](https://pgbackrest.org/configuration.html). For example, to make a point-in-time restore or to restore from a specific backup. 
diff --git a/mkdocs-base.yml b/mkdocs-base.yml index 47786d95..d6954b74 100644 --- a/mkdocs-base.yml +++ b/mkdocs-base.yml @@ -215,6 +215,9 @@ nav: - Restore options: backups-restore.md - To the same cluster (in-place restore): backups-restore-inplace.md - To a new cluster (cluster clone): backups-clone.md
+          - PVC snapshots:
+            - "About PVC snapshots": backups-pvc-snapshots.md
+            - "Configure PVC snapshots": backups-pvc-setup.md
          - "Backup encryption": backup-encryption.md - "Speed up backups": async-archiving.md - "Backup retention": backup-retention.md