-
Notifications
You must be signed in to change notification settings - Fork 170
install/bootupd: chroot to deployment #1816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a chroot into the deployment before calling bootupd when --src-imgref is used. This is a sensible approach to ensure that the binaries and configuration files from the target image are used for bootloader installation. The implementation correctly sets up bind mounts for necessary pseudo-filesystems. However, I've identified a potential resource leak where these bind mounts would not be cleaned up if the bootupd command fails. My review includes a suggestion to refactor the cleanup logic to ensure it runs regardless of the command's success or failure, preventing leaked mounts.
2e335d8 to
3b92a48
Compare
|
/gemini review |
0b51f0e to
7d79124
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a chroot into the deployment before calling bootupd when a source image reference is provided. This is a good change to ensure the correct binaries and configuration from the target image are used.
My review has identified two issues with the resource management of the bind mounts:
- A high-severity issue where mounts can be leaked if an error occurs during the setup of the bind mounts.
- A medium-severity issue in the cleanup logic where only the last error from multiple unmount failures would be reported.
I've provided detailed comments and a suggestion for one of the issues. Addressing these will make the implementation more robust.
crates/lib/src/bootloader.rs
Outdated
| let src_root_arg = if let Some(p) = abs_deployment_path.as_deref() { | ||
| vec!["--src-root", p.as_str()] | ||
| let abs_deployment_path = deployment_path.map(|deploy| rootfs.join(deploy)); | ||
| // When not running inside the target container (through `--src-imgref`) we chroot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's other threads were we talked about offering a bootc install mount as a general ability to mount a deployment outside of booting it; were we to do that it would make a lot of sense for this code to use it.
In ostree we resisted doing that for a long time but eventually did just internally for selinux, see https://github.com/ostreedev/ostree/blob/c6f0b5b2bc26b22fbceee0dc28a0f31349c28d41/src/libostree/ostree-sysroot-deploy.c#L3308
On that topic, it'd be a lot cleaner even here to use a more proper containerization than just setting up the mounts. It's a bit tricky though because we actually do need to e.g. pass through all of /dev and /sys here (i.e. --privileged in docker/podman terms) in order to update the ESP if desired.
I haven't looked at which of bwrap/{runc,crun}/nspawn/podman would make the most sense for this use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On that topic, it'd be a lot cleaner even here to use a more proper containerization than just setting up the mounts. It's a bit tricky though because we actually do need to e.g. pass through all of
/devand/syshere (i.e.--privilegedin docker/podman terms) in order to update the ESP if desired.I haven't looked at which of bwrap/{runc,crun}/nspawn/podman would make the most sense for this use case.
I am not sure of what you mean with this comment. Do you want to block this change until there are more proper containerization helpers in bootc, or are you just making a note that this should be revisited later on ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a live chat about this and agreed to merge as is and file a tracker followup issue for improving the mount setup.
crates/lib/src/bootloader.rs
Outdated
| .run_inherited_with_cmd_context() | ||
| .run_inherited_with_cmd_context(); | ||
|
|
||
| // Clean up the mounts after ourselves |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could entirely avoid the need to clean up by using the new mount API to get file descriptors instead, and then use https://docs.rs/cap-std-ext/latest/cap_std_ext/cmdext/trait.CapStdExtCommandExt.html#tymethod.cwd_dir with chroot . or so
|
OK there's some legit failures here like |
cgwalters
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marking as requested changes due to failing CI
023a4c6 to
d636cb1
Compare
cgwalters
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marking as requested changes due to failing CI
|
Note that it's only Fedora variants that are failing; should reproduce locally via e.g. |
Yes, i can reproduce that locally ! Ok so I dug deeper and I think I figured out something : I am not sure what happens because removing the Without the systemd wrapper : In the second case the chroot works. Another thing to note, and i cannot figure this out yet : the error finding the |
|
Thinking more about this : in both cases the mounting and formating of the block device ( |
|
Ok so after doing a bunch of testing around I will try to add the full path but that requires a rebuild of bootupd as well, so that's for tomorrow. edit: ok I did ask bootc to run |
e225788 to
7a14a4b
Compare
f7891ca to
9e2fbc4
Compare
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a chroot mechanism when installing the bootloader with bootupd via --src-imgref. This ensures that the binaries and configuration from the target image are used. The changes look mostly correct, but I've found a few issues.
There is a critical bug in how the rootfs_mount path is determined, which will break the non-chroot installation path. I've also found a high-severity issue where an unmount operation is performed unconditionally while the corresponding mount is conditional, which could lead to errors.
Additionally, I've left some comments regarding leftover commented-out code, a WIP comment, and a suggestion to make the code more robust by avoiding .unwrap().
Once these issues are addressed, the changes should be good to go.
crates/lib/src/bootloader.rs
Outdated
| tracing::debug!("bind mounting {}", dest.display()); | ||
| rustix::mount::mount_bind_recursive(src, dest)?; | ||
| } | ||
| // WIP : let's try to bind-mount /target/boot into the deployment as well rather than bind-mounting the whole thing?? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0866667 to
93aaa72
Compare
035ccd8 to
4344dc3
Compare
The install-outside-container and install-unified-flag tests were failing because they pulled centos-bootc:stream10 which has a newer bootupd with EFI files at /usr/lib/efi/, while the running Fedora image has the old layout at /usr/lib/bootupd/updates/EFI/. Use the booted image instead to ensure bootupd versions match. The original code is commented out for easy reversion once PR bootc-dev#1816 lands to properly handle cross-version installs. Assisted-by: OpenCode (Claude Sonnet 4) Signed-off-by: Colin Walters <[email protected]>
4344dc3 to
6f94015
Compare
When `--src-imgref` is passed, the deployed systemd does not match the running environnement. In this case, let's run bootupd from inside the deployment. This makes sure we are using the binaries shipped in the image (and relevant config files such as grub fragements). We use bwrap to set up the chroot for a easier handling of the API filesystems. We could do that in all cases but i kept it behind the `--src-imgref` option since when using the target container as the buildroot it will have no impact, and we expect this scenario to be the most common. In CoreOS we have a specific test that checks if the bootloader was installed with the `grub2-install` of the image. Fixes bootc-dev#1559 Also see bootc-dev#1455 Assisted-by: OpenCode (Opus 4.5) Signed-off-by: jbtrystram <[email protected]>
6f94015 to
38d8381
Compare
|
Alright so I finally figured out a working solution ! For posterity i'll leave a small writeup : I figured out why the CI was failing in my previous manual The chroot was technically working, but leaving the host in a broken state, requiring a reboot. I tried systemd-nspawn to handle all the API filesystems better than me manually, which seemed to work just fine in my initial testing, until i figured out the base bootc images don't ship the I then tried
The |
crates/utils/src/bwrap.rs
Outdated
| #[derive(Debug, Default)] | ||
| pub struct BwrapCmd<'a> { | ||
| /// The target directory to use as root for the container | ||
| chroot_path: &'a str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine as is but I would like to add an option that takes a Dir instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in the last push
crates/lib/src/bootloader.rs
Outdated
| for partition in &device.partitions { | ||
| cmd = cmd.bind_device(&partition.node); | ||
| } | ||
| // // TODO : is it needed ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the general case - maybe, and especially when talking about cross-distro.
Especially when run outside of a container in the general case we could get all sorts of stuff leaking into our environment - like LD_LIBRARY_PATH or LD_PRELOAD and that could easily break because now the environment is different from the target root.
But in practice, we can just let it be inherited for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is a leftover from the manual chroot, but I noticed that bwrap does set this up for us so I commented it out.
You're right about other things that could leak in though
| } | ||
|
|
||
| // Add device bind mounts | ||
| for device in self.devices { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would actually probably work most reliably in general is for us to pass the block device as a file descriptor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC that would require bootupd to accept the device as a FD rather than a path. so we can't change that here for now, correct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, or we can pass /proc/ns/fd/{rawFd} I guess ?
|
There was one failure in fedora-43/ostree which might have been related to this; I restarted it. Clearly one thing that would help again is for us to lock the blockdev and pass it down as a fd and not a path through the stack. |
I think that's because of the commented out $PATH. I'm testing locally to verify. |
Confirmed. I'll push a fix that also address some of the review comments :) |
Head branch was pushed to by a user without write access
62964bb to
caa758b
Compare
When
--src-imgrefis passed, the deployed systemd does not match the running environnement. In this case, let's chroot into the deployment before calling bootupd. This makes sure we are using the binaries shipped in the image (and relevant config files such as grub fragements).We could do that in all cases but i kept it behind the
--src-imgrefoption since when using the target container as the buildroot it will have no impact, and we expect this scenario to be the most common.In CoreOS we have a specific test that checks if the bootloader was installed with the
grub2-installof the image.Fixes #1559 Also see #1455