Skip to content

Conversation

@giuseppe
Copy link

When pulling container images from local containers-storage, detect overlay filesystem layers and import them directly to OSTree instead of processing tar streams. This allows to use reflinks on file systems that support them.

Assisted-by: Claude Opus 4.5

This is just a PoC at the moment, I've not fully reviewed the generated code yet.

@cgwalters This takes a different approach that we discussed, but I believe it is easier to maintain in the containers library, as it is much easier to add a new API to the image/storage library to expose low level objects (i.e. layer diff when available, file objects by their checksum...) that will replace the manual access to the overlay diff that is implemented now.

Another benefit, IMO, is that it moves us away from tarballs which ultimately I'd like to see it happening in the containers library as well.

I've manually tested the change and that reflinks are created to the files in the containers storage.

@github-actions github-actions bot added area/install Issues related to `bootc install` area/ostree Issues related to ostree labels Jan 29, 2026
@bootc-bot bootc-bot bot requested a review from jeckersb January 29, 2026 10:50
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a proof-of-concept for directly importing container layers from an overlay filesystem, aiming to improve performance by leveraging reflinks. This involves adding a new import path in ostree-ext that bypasses tar stream processing, using ostree::Repo::write_dfd_to_mtree and plumbing Storage context through bootc. However, the implementation contains a path traversal vulnerability in the way layer directory paths are constructed from layers.json. The use of open_ambient_dir and std::fs::read_dir in the filesystem import logic is insecure and should be replaced with more robust cap_std operations to prevent potential host filesystem access and TOCTOU issues. Additionally, the new filesystem-based import path does not set an ostree ref for the imported layer commit, which breaks layer caching and negates performance benefits for subsequent operations.

}

// Construct the layer diff path: $STORAGE/overlay/$LAYER_ID/diff
let layer_diff_path = storage_root.join(format!("overlay/{}/diff", layer_id));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The layer_id obtained from layers.json is used to construct layer_diff_path using storage_root.join(). Since layer_id is not validated, it can contain path traversal sequences like .., allowing an attacker who can influence the contents of layers.json to point layer_diff_path to arbitrary directories on the host. This is a path traversal vulnerability that could lead to unauthorized access to host files during the container import process.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're effectively trusting the source storage so this is not a critical problem at all, but containers/composefs-rs#218 comprehensively uses cap_std Dir for this reason.

Comment on lines 1285 to 1329
async fn import_layer_from_filesystem(
repo: &ostree::Repo,
layer_path: &Utf8Path,
target_ref: &str,
options: &crate::tar::WriteTarOptions,
) -> Result<crate::tar::WriteTarResult> {
tracing::info!(
"import_layer_from_filesystem: layer_path={}, target_ref={}",
layer_path,
target_ref
);

let config = crate::filesystem::FilesystemFilterConfig::from_tar_options(options);
let repo = repo.clone();
let layer_path = layer_path.to_owned();

// Run the synchronous import in a blocking task
// Note: The caller is responsible for setting the ref within a transaction
let result = crate::tokio_util::spawn_blocking_flatten(move || {
crate::filesystem::import_filesystem_to_ostree(&repo, &layer_path, &config)
})
.await?;

tracing::info!(
"Successfully imported layer from filesystem: commit={}",
result.commit
);

Ok(result)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The target_ref parameter is unused in this function. This leads to a bug where layers imported from the filesystem are not cached in ostree via a ref, causing them to be re-imported on every pull.

The else branch for tar stream import correctly uses crate::tar::write_tar which sets the ref. The new filesystem import path should do the same to ensure layer caching works as expected.

To fix this, you should pass target_ref down to import_filesystem_to_ostree and have it set the ref after creating the commit. This will also involve removing the now-incorrect comment about the caller being responsible for setting the ref.

Here are the suggested changes:

1. In crates/ostree-ext/src/container/store.rs:

--- a/crates/ostree-ext/src/container/store.rs
+++ b/crates/ostree-ext/src/container/store.rs
@@ -1297,11 +1297,10 @@
         let config = crate::filesystem::FilesystemFilterConfig::from_tar_options(options);
         let repo = repo.clone();
         let layer_path = layer_path.to_owned();
+        let target_ref = target_ref.to_owned();
 
         // Run the synchronous import in a blocking task
-        // Note: The caller is responsible for setting the ref within a transaction
         let result = crate::tokio_util::spawn_blocking_flatten(move || {
-            crate::filesystem::import_filesystem_to_ostree(&repo, &layer_path, &config)
+            crate::filesystem::import_filesystem_to_ostree(&repo, &layer_path, &target_ref, &config)
         })
         .await?;
 

2. In crates/ostree-ext/src/filesystem/filter.rs:

--- a/crates/ostree-ext/src/filesystem/filter.rs
+++ b/crates/ostree-ext/src/filesystem/filter.rs
@@ -72,6 +72,7 @@
 pub fn import_filesystem_to_ostree(
     repo: &ostree::Repo,
     src_path: &Utf8Path,
+    target_ref: &str,
     config: &FilesystemFilterConfig,
 ) -> Result<crate::tar::WriteTarResult> {
     let cancellable = gio::Cancellable::NONE;
@@ -188,6 +189,8 @@
         .write_commit(None, None, None, Some(&metadata), &root, cancellable)
         .context("Writing commit")?;
 
+    repo.set_ref_immediate(None, target_ref, Some(&commit), cancellable)?;
+
     Ok(crate::tar::WriteTarResult {
         commit: commit.to_string(),
         filtered: filtered_stats,

Comment on lines 80 to 155
let src_dir = cap_std_ext::cap_std::fs::Dir::open_ambient_dir(
src_path.as_std_path(),
cap_std_ext::cap_std::ambient_authority(),
)
.with_context(|| format!("Opening source directory: {}", src_path))?;
let src_dfd = src_dir.as_raw_fd();

// Create the root MutableTree
let root_mtree = ostree::MutableTree::new();
let mut filtered_stats = BTreeMap::new();

// Create a commit modifier for the import
// Note: We don't use CONSUME flag since we don't want to modify the source overlay
let modifier = ostree::RepoCommitModifier::new(ostree::RepoCommitModifierFlags::empty(), None);

// Set up SELinux labeling from base commit (always enabled when base is provided)
if let Some(ref base) = config.base {
modifier.set_sepolicy_from_commit(repo, base, cancellable)?;
}

// Create default dirmeta for directories we create
let dirmeta_checksum = create_default_dirmeta(repo)?;

// Set the root tree's metadata checksum - required before write_mtree
root_mtree.set_metadata_checksum(&dirmeta_checksum);

// Read toplevel entries and process each according to its classification
let entries =
std::fs::read_dir(src_path).with_context(|| format!("Reading directory: {}", src_path))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The function import_filesystem_to_ostree uses open_ambient_dir to open the source directory and std::fs::read_dir to iterate over its entries. open_ambient_dir follows symlinks and allows opening arbitrary absolute paths. Combined with the potential path traversal in layer_diff_path, this allows importing arbitrary host directories into the OSTree repository. Furthermore, using std::fs::read_dir instead of src_dir.entries() introduces a TOCTOU (Time-of-Check to Time-of-Use) risk and is inconsistent with the use of the cap_std library for secure filesystem access.

@giuseppe giuseppe force-pushed the copy-directly-from-storage-overlay branch from b4aa56a to bacbf43 Compare January 29, 2026 11:54
@cgwalters
Copy link
Collaborator

Awesome, thanks for kickstarting this! So I have a giant amount of code generated in https://github.com/cgwalters/cstor-rs that is a lot more fleshed-out "read (and maybe write to) containers-storage" - but it also needs some cleanup, review, testing etc.

Backing up a second...here's what I'm thinking re #20

Basically let's try to go to a world where the flow is:

  • Pull into containers-storage: using podman/skopeo (probably podman since the bollard apis are probably just fine for this, but we could make it pluggable too)
  • Reflink/hardlink from there into composefs-rs
  • If ostree is enabled, reflink/hardlink from composefs-rs into ostree

The thing that is really appealing to me about this is we're centralizing more on composefs.

@giuseppe giuseppe force-pushed the copy-directly-from-storage-overlay branch 2 times, most recently from 44859a5 to 8324e0f Compare January 29, 2026 14:21
@giuseppe
Copy link
Author

  • Reflink/hardlink from there into composefs-rs

I have missed this one. Where is the composefs-rs store?

@cgwalters
Copy link
Collaborator

I have missed this one. Where is the composefs-rs store?

It's initialized by default only in the composefs path, but...it would make sense probably to do so by default now even with ostree. It's in /composefs.

@giuseppe
Copy link
Author

I guess I've hit ostreedev/ostree#2712 as it fails to import the whiteout and it seems there is no way to filter it without changing the source directory

When pulling container images from local containers-storage, detect
overlay filesystem layers and import them directly to OSTree instead
of processing tar streams. This allows to use reflinks on file systems
that support them.

Assisted-by: Claude Opus 4.5
Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe giuseppe force-pushed the copy-directly-from-storage-overlay branch from 8324e0f to 4a5cad5 Compare January 29, 2026 16:20
@cgwalters
Copy link
Collaborator

Reflink/hardlink from there into composefs-rs

➡️ containers/composefs-rs#218


let file = match std::fs::File::open(&layers_json_path) {
Ok(f) => f,
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +240 to +241
let matches = layer.diff_digest.as_deref() == Some(digest)
|| layer.compressed_diff_digest.as_deref() == Some(digest);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this is interesting, I need to understand the distinction here, will look.

let root_mtree = ostree::MutableTree::new();
let mut filtered_stats = BTreeMap::new();

// Create a commit modifier - no SELinux here, it's applied during merge
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should set the selinux labeling policy here. Which is messy! Because we need to mount (or compute/access) the merged root for just /etc/selinux.

But not doing so causes effective double disk space usage temporarily, it's very ugly.

Again the SELinux labeling thing is one of the most awesome parts about composefs - we just stick all that in the metadata.

let name_str = name.to_string_lossy();
let file_type = entry.file_type().context("Getting file type")?;

// Only process directories at toplevel
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know what's funny:

$ podman run --rm -ti quay.io/fedora/fedora cat /.profile
kiwi_align='1048576'
kiwi_boot_timeout=''
kiwi_bootkernel=''
kiwi_bootloader='grub2'

Yes, we build the Fedora base image with this "kiwi" tool that likes to dump its metadata as a file into /...soooooo ugly

)
.with_context(|| format!("Importing file: {}", file_path.display()))?;
} else if is_overlay_whiteout(&metadata) {
// Convert overlay whiteout to OCI format (.wh.<filename>)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm but we're not handling .wh..wh..opq here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/install Issues related to `bootc install` area/ostree Issues related to ostree

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants