72 changes: 0 additions & 72 deletions storage/docs/containers-storage-composefs.md

This file was deleted.

31 changes: 31 additions & 0 deletions storage/docs/containers-storage-driver-btrfs.md
@@ -0,0 +1,31 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-btrfs - The btrfs storage driver

## DESCRIPTION

The btrfs driver uses native btrfs copy-on-write via subvolumes and snapshots.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.

Requires a btrfs filesystem. Layers are stored as subvolumes under `btrfs/subvolumes/`. New empty layers are created as subvolumes; child layers are created as btrfs snapshots, providing true CoW semantics. Quotas are supported via btrfs qgroups. Set `btrfs.min_space` to enable quota enforcement.

Reference: `drivers/btrfs/btrfs.go`

## RUNTIME

Like VFS, there is no mount involved. Btrfs subvolumes are accessible as regular directories, so `Get()` returns the subvolume path directly. If a quota was configured, the qgroup limit is applied at this point. `Put()` is a no-op.

## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fbtrfs

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.
86 changes: 86 additions & 0 deletions storage/docs/containers-storage-driver-overlay.md
@@ -0,0 +1,86 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-overlay - The overlay storage driver

## DESCRIPTION

The overlay driver uses Linux OverlayFS for copy-on-write semantics. This is the default and recommended driver for most use cases. See [containers-storage.conf.5.md](containers-storage.conf.5.md) for configuration options.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.
The description below is intended to aid debugging and recovery, but changing content directly is not supported.

The top-level overlay directory holds layers keyed by a [chain ID](https://github.com/opencontainers/image-spec/blob/main/config.md#layer-chainid) which identifies the precise sequence of parent layers leading to this one. A layer with the same DiffID can have multiple physical objects in this directory if it was created in different contexts (e.g. with or without zstd:chunked).
Review comment (Member), suggested change:
The top-level overlay directory holds layers keyed by a [chain ID](https://github.com/opencontainers/image-spec/blob/main/config.md#layer-chainid), which identifies the precise sequence of parent layers leading to this one. A layer with the same DiffID can have multiple physical objects in this directory if it was created in different contexts (e.g., with or without zstd:chunked).


Each layer has at least a `diff` directory and a `link` file. If there are lower layers, the layer also has a `lower` file, a `merged` directory, and a `work` directory. The `diff` directory holds the upper layer of the overlay and captures any changes to the layer. The `lower` file lists all lower layers, separated by `:` and ordered from uppermost to lowermost. The overlay itself is mounted on the `merged` directory, and the `work` directory is scratch space that overlayfs requires alongside the upper directory.

The `link` file for each layer contains a unique string for the layer. Under the `l/` directory at the root there is a symbolic link named with that string, pointing to the layer's `diff` directory. These symbolic links are used to reference lower layers in the `lower` file and at mount time; they shorten each layer reference without requiring changes to the layer identifier or root directory. Mounts are always performed relative to the root and through the symbolic links, so that the list of lower directories fits in the single page available for the mount syscall's options.
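The relationship between `link`, `l/`, and `diff` can be sketched in Go as follows. This is a minimal illustration and not the driver's code: the function names, the hex-encoded random name, its length, and the `layer1` directory are assumptions made for the example.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// newLinkID returns a short random name for the l/ directory.
// Hex encoding and the length are assumptions made for this sketch.
func newLinkID() (string, error) {
	b := make([]byte, 13)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// createLayerLink creates <root>/<layer>/diff, writes the link name to
// <root>/<layer>/link, and adds the relative symlink <root>/l/<name>.
func createLayerLink(root, layer string) (string, error) {
	name, err := newLinkID()
	if err != nil {
		return "", err
	}
	if err := os.MkdirAll(filepath.Join(root, layer, "diff"), 0o755); err != nil {
		return "", err
	}
	if err := os.MkdirAll(filepath.Join(root, "l"), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(root, layer, "link"), []byte(name), 0o644); err != nil {
		return "", err
	}
	// The target is relative, so it does not embed the storage root.
	target := filepath.Join("..", layer, "diff")
	if err := os.Symlink(target, filepath.Join(root, "l", name)); err != nil {
		return "", err
	}
	return name, nil
}

// runDemo builds the layout in a temporary directory and checks that the
// short link resolves to the layer's diff directory.
func runDemo() (string, error) {
	root, err := os.MkdirTemp("", "overlay-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	name, err := createLayerLink(root, "layer1")
	if err != nil {
		return "", err
	}
	info, err := os.Stat(filepath.Join(root, "l", name)) // Stat follows the symlink
	if err != nil {
		return "", err
	}
	if !info.IsDir() {
		return "", fmt.Errorf("link %s does not resolve to a directory", name)
	}
	return filepath.Join("l", name), nil
}

func main() {
	ref, err := runDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("lower entry:", ref)
}
```

The returned `l/<name>` string is the kind of short, root-relative reference that would appear in another layer's `lower` file.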

A hard upper limit of 500 lower layers is enforced.
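For debugging, the `lower` file can be read back with a few lines of Go. The sketch below assumes only what is stated above (entries separated by `:`, uppermost first, at most 500 of them); `parseLower` and the link names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// maxLowerLayers is the limit of 500 lower layers stated in this document.
const maxLowerLayers = 500

// parseLower splits the content of a layer's `lower` file into its entries,
// ordered from uppermost to lowermost.
func parseLower(content string) ([]string, error) {
	content = strings.TrimSpace(content)
	if content == "" {
		return nil, nil
	}
	entries := strings.Split(content, ":")
	if len(entries) > maxLowerLayers {
		return nil, fmt.Errorf("too many lower layers: %d > %d", len(entries), maxLowerLayers)
	}
	return entries, nil
}

func main() {
	// Hypothetical link names; real ones are generated by the driver.
	entries, err := parseLower("l/AAAA:l/BBBB:l/CCCC\n")
	if err != nil {
		panic(err)
	}
	for i, e := range entries {
		fmt.Printf("%d: %s\n", i, e)
	}
}
```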

The `overlay-layers/` directory alongside the per-layer directories contains metadata managed by the storage library. Each layer has a `${layerid}.tar-split.gz` file preserving the original tar stream structure (without file content) so that the original archive can be reconstructed exactly from the unpacked `diff/`. The directory also contains `layers.json` with global layer metadata and `layers.lock` for concurrency control.

The `overlay-containers/` directory holds running container state: `containers.json` for metadata and `containers.lock` for concurrency control.

Reference: `drivers/overlay/overlay.go`

## RUNTIME

When a container needs its filesystem, the driver performs a `mount(2)` with type `overlay`, passing the layer's `diff` directory as the upperdir and all parent layers' `diff` directories as lowerdirs. The kernel's overlayfs merges these at access time — no data is copied, and layers remain independent on disk. Writes go to the upperdir via copy-up. The mount is placed at the layer's `merged` directory, and the `work` directory is used internally by overlayfs for atomic operations like rename.
Review comment (Member), suggested change:
When a container needs its filesystem, the driver performs a `mount(2)` with type `overlay`, passing the layer's `diff` directory as the upperdir and all parent layers' `diff` directories as lowerdirs. The kernel's overlayfs merges these at access time — no data is copied, and layers remain independent on disk. Writes go to the upperdir via copy-up. The mount is placed in the layer's `merged` directory, and the `work` directory is used internally by overlayfs for atomic operations such as rename.


If a mount program is configured (e.g. `fuse-overlayfs` for rootless operation), it is invoked instead of the `mount(2)` syscall. When the mount option string exceeds the kernel's page size limit, the driver forks a child process that `chdir`s into the storage root and uses relative paths to shorten the options.
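The effect of relative paths on the option string can be illustrated as follows. `lowerdir`, `upperdir`, and `workdir` are the standard overlayfs mount options; the helper names, the storage root, and the exact size comparison are assumptions for this sketch, and no mount is actually performed.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// overlayOptions builds the option string for an overlay mount.
// lowers must be ordered from uppermost to lowermost.
func overlayOptions(lowers []string, upper, work string) string {
	return fmt.Sprintf("lowerdir=%s,upperdir=%s,workdir=%s",
		strings.Join(lowers, ":"), upper, work)
}

// fitsInPage reports whether opts fits in one page, leaving room for a
// terminating NUL. The exact bound used by the driver is not shown here.
func fitsInPage(opts string, pageSize int) bool {
	return len(opts) < pageSize
}

func main() {
	root := "/var/lib/containers/storage/overlay" // hypothetical storage root

	// Absolute paths repeat the root for every directory.
	abs := overlayOptions(
		[]string{root + "/l/AAAA", root + "/l/BBBB"},
		root+"/layer/diff", root+"/layer/work")

	// After chdir into the root, the same mount can use relative paths.
	rel := overlayOptions([]string{"l/AAAA", "l/BBBB"}, "layer/diff", "layer/work")

	page := os.Getpagesize()
	fmt.Printf("absolute: %d bytes, fits=%v\n", len(abs), fitsInPage(abs, page))
	fmt.Printf("relative: %d bytes, fits=%v\n", len(rel), fitsInPage(rel, page))
}
```

With hundreds of lower layers, the per-entry saving from dropping the root prefix decides whether the string fits.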

On `Put()`, the overlayfs mount is unmounted.

### zstd:chunked

`zstd:chunked` is a variant of the `application/vnd.oci.image.layer.v1.tar+zstd` media type that uses zstd skippable frames to include a table of contents with SHA-256 digests and offsets of individual file chunks. This allows fetching only content not already present via HTTP range requests.
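As background, a zstd skippable frame is a 4-byte little-endian magic number in the range `0x184D2A50` to `0x184D2A5F`, a 4-byte little-endian length, and that many bytes of user data, which ordinary decoders skip. The Go sketch below writes and reads such a frame. It shows only the generic container; the layout of the zstd:chunked table of contents inside it is not reproduced, and the function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	skippableMagicMin = 0x184D2A50
	skippableMagicMax = 0x184D2A5F
	headerSize        = 8 // magic + length
)

// writeSkippableFrame wraps payload in a skippable frame.
// Payloads larger than 4 GiB are not handled in this sketch.
func writeSkippableFrame(payload []byte) []byte {
	buf := make([]byte, headerSize+len(payload))
	binary.LittleEndian.PutUint32(buf[0:4], skippableMagicMin)
	binary.LittleEndian.PutUint32(buf[4:8], uint32(len(payload)))
	copy(buf[headerSize:], payload)
	return buf
}

// readSkippableFrame returns the payload of the frame at the start of data
// and the bytes that follow it.
func readSkippableFrame(data []byte) (payload, rest []byte, err error) {
	if len(data) < headerSize {
		return nil, nil, errors.New("short frame header")
	}
	magic := binary.LittleEndian.Uint32(data[0:4])
	if magic < skippableMagicMin || magic > skippableMagicMax {
		return nil, nil, errors.New("not a skippable frame")
	}
	size := binary.LittleEndian.Uint32(data[4:8])
	if uint64(size) > uint64(len(data)-headerSize) {
		return nil, nil, errors.New("truncated frame")
	}
	end := headerSize + int(size)
	return data[headerSize:end], data[end:], nil
}

func main() {
	frame := writeSkippableFrame([]byte("table of contents"))
	payload, rest, err := readSkippableFrame(frame)
	if err != nil {
		panic(err)
	}
	fmt.Printf("payload=%q trailing=%d bytes\n", payload, len(rest))
}
```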

Note: The zstd:chunked format is not standardized, though it is an eventual goal to do so.

Each layer has an associated big data key `chunked-manifest-cache` containing index metadata in a binary format suitable for mmap(). When pulling, existing layers are scanned for files with matching digests. Matching files are hardlinked if `use_hardlinks = "true"`, otherwise reflinked (or copied if reflinks are unsupported).
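The reuse step can be sketched in Go as below. This is a simplified illustration under stated assumptions: it indexes files with an in-memory map instead of the mmap-able `chunked-manifest-cache`, it skips the reflink attempt and goes straight to a plain copy, and all function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// fileDigest returns the hex SHA-256 digest of a file's contents.
func fileDigest(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// indexLayer maps digest to path for every regular file under dir.
func indexLayer(dir string) (map[string]string, error) {
	index := make(map[string]string)
	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil || !d.Type().IsRegular() {
			return err
		}
		digest, err := fileDigest(path)
		if err != nil {
			return err
		}
		index[digest] = path
		return nil
	})
	return index, err
}

// reuseFile creates dst from an indexed file with the given digest.
// It reports false when no existing file matches.
func reuseFile(index map[string]string, digest, dst string, useHardlinks bool) (bool, error) {
	src, ok := index[digest]
	if !ok {
		return false, nil
	}
	if useHardlinks {
		return true, os.Link(src, dst)
	}
	// A reflink would be attempted here first; this sketch only copies.
	data, err := os.ReadFile(src)
	if err != nil {
		return true, err
	}
	return true, os.WriteFile(dst, data, 0o644)
}

// runDemo indexes one existing file and reuses it for a new path.
func runDemo() error {
	root, err := os.MkdirTemp("", "chunked-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)
	existing := filepath.Join(root, "existing")
	if err := os.Mkdir(existing, 0o755); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(existing, "a"), []byte("hello"), 0o644); err != nil {
		return err
	}
	index, err := indexLayer(existing)
	if err != nil {
		return err
	}
	want := sha256.Sum256([]byte("hello"))
	dst := filepath.Join(root, "b")
	found, err := reuseFile(index, hex.EncodeToString(want[:]), dst, true)
	if err != nil {
		return err
	}
	data, err := os.ReadFile(dst)
	if err != nil {
		return err
	}
	if !found || string(data) != "hello" {
		return fmt.Errorf("reuse failed: found=%v data=%q", found, data)
	}
	return nil
}

func main() {
	if err := runDemo(); err != nil {
		panic(err)
	}
	fmt.Println("file reused by digest")
}
```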

Configuration (support is enabled by default in the code):

```
[storage.options.pull_options]
enable_partial_images = "true"
```

Configuration values must be string booleans (quoted), not native TOML booleans.
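One way to picture the requirement: if the options table is decoded into string values and each value is then parsed as a boolean, a quoted `"true"` works while a native TOML boolean never reaches the parser as a string. The Go sketch below assumes that model; the `pullOptions` type and `boolOption` helper are hypothetical and not the library's API.

```go
package main

import (
	"fmt"
	"strconv"
)

// pullOptions stands in for a decoded [storage.options.pull_options] table
// whose values are strings.
type pullOptions map[string]string

// boolOption looks up key and parses it as a boolean, returning def when
// the key is absent.
func boolOption(opts pullOptions, key string, def bool) (bool, error) {
	raw, ok := opts[key]
	if !ok {
		return def, nil
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("option %s: %w", key, err)
	}
	return v, nil
}

func main() {
	opts := pullOptions{"enable_partial_images": "false"}

	// The default of true reflects the statement above that support is
	// enabled by default.
	enabled, err := boolOption(opts, "enable_partial_images", true)
	if err != nil {
		panic(err)
	}
	fmt.Println("enable_partial_images:", enabled)
}
```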

Reference: `pkg/chunked/internal/compression.go`

### composefs

composefs provides an immutable filesystem layer with optional integrity verification.

Configuration:

```
[storage.options.overlay]
use_composefs = "true"
```

Configuration values must be string booleans (quoted), not native TOML booleans.

composefs requires zstd:chunked images. For non-zstd:chunked images, set `convert_images = "true"` in `[storage.options.pull_options]` to enable dynamic conversion during pulls.

With composefs enabled, the `diff/` directory becomes an object hash directory where each filename is the SHA-256 digest of the file's contents. Each layer has a `composefs-data/composefs.blob` file containing the composefs superblock with all metadata.
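A content-addressed directory of this kind can be sketched as follows. The example assumes a flat layout with the hex SHA-256 of the contents as the filename; any fan-out into subdirectories, the digest variant actually used, and the composefs blob itself are not shown, and the function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// objectName returns the hex SHA-256 digest of content.
func objectName(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// storeObject writes content under its digest name. If an object with that
// name already exists it is left untouched and created is false.
func storeObject(dir string, content []byte) (path string, created bool, err error) {
	path = filepath.Join(dir, objectName(content))
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
	if errors.Is(err, fs.ErrExist) {
		return path, false, nil
	}
	if err != nil {
		return "", false, err
	}
	if _, err := f.Write(content); err != nil {
		f.Close()
		return "", false, err
	}
	return path, true, f.Close()
}

func main() {
	dir, err := os.MkdirTemp("", "objects-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	for i := 0; i < 2; i++ {
		path, created, err := storeObject(dir, []byte("same bytes"))
		if err != nil {
			panic(err)
		}
		fmt.Println(filepath.Base(path), "created:", created)
	}
}
```

Storing the same bytes twice yields one object, which is what makes identical files shareable across layers.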

Existing layers are scanned for matching objects and reused via hardlink or reflink. An attempt is made to enable fsverity on backing files, but this is best-effort only; there is currently no support for enforced integrity verification.

Layers with or without composefs format can be mixed in the same overlay stack. Layers with a composefs blob are mounted and included in the final overlayfs stack, while layers without composefs format are reused as-is.

## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Foverlay

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.
31 changes: 31 additions & 0 deletions storage/docs/containers-storage-driver-vfs.md
@@ -0,0 +1,31 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-vfs - The VFS storage driver

## DESCRIPTION

The VFS driver copies directories to create layers. No kernel overlay filesystem support is required.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.

Layers are stored under `vfs/dir/`. When creating a layer from a parent, the entire parent directory is copied. The copy uses reflinks (FICLONE) if supported by the filesystem, falling back to regular copying otherwise. The VFS driver works on any filesystem but is storage-inefficient without reflink support.
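The copy strategy can be sketched in Go as below: try `FICLONE` on each regular file and fall back to a byte copy when the ioctl fails. This is a Linux-only illustration using the standard library; it ignores ownership, extended attributes, and special files, and the function names are hypothetical rather than those in `drivers/copy`.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"syscall"
)

// ficlone is the Linux FICLONE ioctl request, _IOW(0x94, 9, int).
const ficlone = 0x40049409

// copyFile clones src into dst with FICLONE when the filesystem supports it
// and otherwise copies the bytes. It reports whether the reflink succeeded.
func copyFile(src, dst string, perm fs.FileMode) (bool, error) {
	in, err := os.Open(src)
	if err != nil {
		return false, err
	}
	defer in.Close()
	out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, perm)
	if err != nil {
		return false, err
	}
	defer out.Close()
	if _, _, errno := syscall.Syscall(syscall.SYS_IOCTL, out.Fd(), ficlone, in.Fd()); errno == 0 {
		return true, nil
	}
	_, err = io.Copy(out, in)
	return false, err
}

// copyDir recreates the tree rooted at src under dst.
func copyDir(src, dst string) error {
	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		target := filepath.Join(dst, rel)
		switch {
		case d.IsDir():
			return os.MkdirAll(target, info.Mode().Perm())
		case d.Type()&fs.ModeSymlink != 0:
			link, err := os.Readlink(path)
			if err != nil {
				return err
			}
			return os.Symlink(link, target)
		case d.Type().IsRegular():
			_, err := copyFile(path, target, info.Mode().Perm())
			return err
		default:
			return nil // devices, sockets, and FIFOs are skipped in this sketch
		}
	})
}

// runDemo copies a small parent tree into a child layer directory.
func runDemo() error {
	root, err := os.MkdirTemp("", "vfs-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)
	parent, child := filepath.Join(root, "parent"), filepath.Join(root, "child")
	if err := os.MkdirAll(filepath.Join(parent, "etc"), 0o755); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(parent, "etc", "hostname"), []byte("layer\n"), 0o644); err != nil {
		return err
	}
	if err := copyDir(parent, child); err != nil {
		return err
	}
	data, err := os.ReadFile(filepath.Join(child, "etc", "hostname"))
	if err != nil {
		return err
	}
	if string(data) != "layer\n" {
		return fmt.Errorf("unexpected content %q", data)
	}
	return nil
}

func main() {
	if err := runDemo(); err != nil {
		panic(err)
	}
	fmt.Println("child layer populated from parent")
}
```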

Reference: `drivers/vfs/driver.go`, `drivers/copy/copy_linux.go`

## RUNTIME

There is no mount involved. When a container needs its filesystem, `Get()` simply returns the layer's directory path. All layer merging happened at create time when the parent was copied, so the directory is already a complete filesystem tree. `Put()` is a no-op since there is nothing to unmount.
Review comment (Member), suggested change:
There is no mount involved. When a container needs its filesystem, `Get()` simply returns the layer's directory path. All layer merging occurs at create time, when the parent is copied, so the directory is already a complete filesystem tree. `Put()` is a no-op since there is nothing to unmount.


## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fvfs

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.
33 changes: 33 additions & 0 deletions storage/docs/containers-storage-driver-zfs.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# containers-storage 1 "February 2026"

## NAME
containers-storage-driver-zfs - The ZFS storage driver

## DESCRIPTION

The ZFS driver uses ZFS datasets and clones for copy-on-write semantics.

## IMPLEMENTATION

The on-disk file layout is an internal implementation detail and may change between versions. The only stable interface is the Go library API.

Requires `/dev/zfs` and the `zfs` command. Configure the parent dataset via the `zfs.fsname` option.

Layers are stored as datasets under `zfs.fsname` (e.g., `tank/containers/storage/$id`). Mountpoints are at `zfs/graph/`. All datasets use `mountpoint=legacy` so containers-storage controls mounts directly. New root layers are created with `zfs create`. Child layers are created by snapshotting the parent dataset and cloning the snapshot; the snapshot is marked for deferred deletion after cloning.
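The sequence of operations can be written out as the equivalent `zfs` command lines. The Go sketch below only builds the argument lists and runs nothing; the snapshot naming scheme and the function name are assumptions, and the driver may issue these operations through a library rather than literally as shown.

```go
package main

import (
	"fmt"
	"strings"
)

// layerCommands returns the zfs invocations that would create layer id under
// fsName. An empty parent means a new root layer.
func layerCommands(fsName, id, parent string) [][]string {
	dataset := fsName + "/" + id
	if parent == "" {
		return [][]string{
			{"zfs", "create", "-o", "mountpoint=legacy", dataset},
		}
	}
	// Hypothetical snapshot name: the parent dataset, tagged with the child ID.
	snapshot := fsName + "/" + parent + "@" + id
	return [][]string{
		{"zfs", "snapshot", snapshot},
		{"zfs", "clone", "-o", "mountpoint=legacy", snapshot, dataset},
		// -d marks the snapshot for deferred destruction, since the clone
		// still depends on it.
		{"zfs", "destroy", "-d", snapshot},
	}
}

func main() {
	for _, cmd := range layerCommands("tank/containers/storage", "child", "parent") {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```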

Reference: `drivers/zfs/zfs.go`

## RUNTIME

When a container needs its filesystem, the driver performs `mount(2)` with type `zfs` to mount the dataset at a path under `zfs/graph/`. Because all datasets use `mountpoint=legacy`, ZFS does not auto-mount them — the driver controls when and where each dataset is mounted. A reference counter tracks multiple users of the same mountpoint. On `Put()`, the last reference triggers an unmount.
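The reference-counting behaviour can be sketched in Go as follows. The `refMounter` type is hypothetical and takes the mount and unmount operations as function values, so the example runs without ZFS; it is not the driver's actual counter.

```go
package main

import (
	"fmt"
	"sync"
)

// refMounter mounts a path on first use and unmounts it when the last
// user releases it.
type refMounter struct {
	mu      sync.Mutex
	refs    map[string]int
	mount   func(path string) error
	unmount func(path string) error
}

func newRefMounter(mount, unmount func(string) error) *refMounter {
	return &refMounter{refs: make(map[string]int), mount: mount, unmount: unmount}
}

// Get takes a reference on path, mounting it if this is the first one.
func (m *refMounter) Get(path string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.refs[path] == 0 {
		if err := m.mount(path); err != nil {
			return err
		}
	}
	m.refs[path]++
	return nil
}

// Put drops a reference on path, unmounting it if this was the last one.
func (m *refMounter) Put(path string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	switch m.refs[path] {
	case 0:
		return fmt.Errorf("%s is not mounted", path)
	case 1:
		delete(m.refs, path)
		return m.unmount(path)
	default:
		m.refs[path]--
		return nil
	}
}

func main() {
	m := newRefMounter(
		func(p string) error { fmt.Println("mount", p); return nil },
		func(p string) error { fmt.Println("unmount", p); return nil },
	)
	path := "zfs/graph/example" // hypothetical mountpoint
	for i := 0; i < 2; i++ {
		if err := m.Get(path); err != nil {
			panic(err)
		}
	}
	for i := 0; i < 2; i++ {
		if err := m.Put(path); err != nil {
			panic(err)
		}
	}
}
```

Two `Get()` calls followed by two `Put()` calls result in exactly one mount and one unmount.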

## BUGS

https://github.com/containers/storage/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fzfs

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.
58 changes: 0 additions & 58 deletions storage/docs/containers-storage-zstd-chunked.md

This file was deleted.

5 changes: 5 additions & 0 deletions storage/drivers/btrfs/btrfs.go
@@ -1,5 +1,10 @@
//go:build linux && cgo

// Package btrfs implements the btrfs storage driver for container images.
// It uses native btrfs copy-on-write via subvolumes and snapshots, storing
// layers as subvolumes under a 'subvolumes/' directory. Child layers are
// created as snapshots for true copy-on-write semantics. Storage quotas
// are supported via btrfs qgroups.
package btrfs

/*
4 changes: 4 additions & 0 deletions storage/drivers/overlay/overlay.go
@@ -1,5 +1,9 @@
//go:build linux

// Package overlay implements the overlay storage driver for container images.
// It uses Linux OverlayFS to provide copy-on-write semantics, allowing efficient
// storage sharing between image layers. When enabled, composefs can be used as
// an optional mode for enhanced integrity verification of container images.
package overlay

import (
5 changes: 5 additions & 0 deletions storage/drivers/vfs/driver.go
@@ -1,3 +1,8 @@
// Package vfs implements the VFS storage driver for container images.
// It copies directories to create layers, attempting reflinks (FICLONE) first
// for efficient copy-on-write on supporting filesystems, then falling back to
// copy_file_range and regular copying. This provides maximum filesystem
// compatibility while achieving storage efficiency on reflink-capable filesystems.
package vfs

import (
5 changes: 5 additions & 0 deletions storage/drivers/zfs/zfs.go
@@ -1,5 +1,10 @@
//go:build linux || freebsd

// Package zfs implements the ZFS storage driver for container images.
// It uses ZFS datasets and clones for copy-on-write semantics. Each layer
// is stored as a dataset under the parent specified by zfs.fsname, with
// child layers created by snapshotting and cloning. Datasets use
// mountpoint=legacy so containers-storage controls mount operations.
package zfs

import (