
procfs: workarounds for AppArmor design flaw #296

@cyphar


Okay, I've figured it out. This is really dumb. tl;dr: This is really an AppArmor bug (or even a design flaw if you prefer).

For context, the file we are trying to write to is /proc/sys/net/ipv4/ip_unprivileged_port_start. @stgraber figured out that the problematic AppArmor rules are the ones that block writes to most /sys files. How is it possible that one affects the other?

Well, the problem is that runc now uses a detached mount of procfs to operate on (this avoids mount race attacks). Because detached mounts have not been attached anywhere in the filesystem tree, d_path (the kernel's facility for generating pathnames for dentries) just generates a name that looks like /foo if you try to open a file foo inside the detached procfs mount. AFAICS this is what AppArmor uses to determine what file you are trying to write to (because AppArmor is path-based, and d_path is the only way to get pathnames from dentries).
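To make the naming behaviour concrete, here is a toy model in Go. It is not kernel code: the `mount` type and the `apparentPath` function are invented for illustration, and the only assumption encoded is the one described above, that a detached mount contributes no prefix to the generated pathname.

```go
package main

import (
	"fmt"
	"path"
)

// mount is a hypothetical stand-in for a kernel mount, used only for this sketch.
type mount struct {
	attachedAt string // mount point in the filesystem tree; "" means detached
}

// apparentPath models the pathname a path-based check would see for a file
// at rel inside m. A detached mount has no mount point to prepend, so the
// name is rooted at "/".
func apparentPath(m mount, rel string) string {
	if m.attachedAt == "" {
		return path.Join("/", rel)
	}
	return path.Join(m.attachedAt, rel)
}

func main() {
	rel := "sys/net/ipv4/ip_unprivileged_port_start"
	fmt.Println(apparentPath(mount{attachedAt: "/proc"}, rel)) // /proc/sys/net/ipv4/ip_unprivileged_port_start
	fmt.Println(apparentPath(mount{}, rel))                    // /sys/net/ipv4/ip_unprivileged_port_start
}
```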

This means that when we try to write to /proc/sys/net/ipv4/ip_unprivileged_port_start, AppArmor sees it as a write to /sys/net/ipv4/ip_unprivileged_port_start, which is forbidden by the /sys denial rules. I have attached a program that shows this behaviour using a detached tmpfs mount. It's trivial to trigger:

c1:~ # ./aa-bug &
fd: /proc/2061/fd/5
[1] 2061
c1:~ # mkdir /proc/2061/fd/5/sys
c1:~ # mkdir /proc/2061/fd/5/sys/foo
mkdir: cannot create directory ‘/proc/2061/fd/5/sys/foo’: Permission denied

aa-bug.go.txt
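For readers without the attachment, the following is a rough sketch of how such a reproducer could look. It is not the attached aa-bug program: the structure, the helper names (`detachedTmpfs`, `procFdPath`), and the choice to run both mkdir calls in-process instead of from a shell are all assumptions. It uses raw syscall numbers for the new mount API (fsopen, fsconfig, fsmount), which are the same on x86-64 and arm64, and it needs CAP_SYS_ADMIN to create the mount.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

// Syscall numbers and flag values for the new mount API (Linux 5.2+).
const (
	sysFsopen         = 430
	sysFsconfig       = 431
	sysFsmount        = 432
	fsopenCloexec     = 0x1
	fsmountCloexec    = 0x1
	fsconfigCmdCreate = 6
)

// procFdPath returns the /proc path through which a process's fd can be reached.
func procFdPath(pid, fd int) string {
	return fmt.Sprintf("/proc/%d/fd/%d", pid, fd)
}

// detachedTmpfs creates a tmpfs that is never attached to the filesystem
// tree and returns a file descriptor referring to its root.
func detachedTmpfs() (int, error) {
	name, err := syscall.BytePtrFromString("tmpfs")
	if err != nil {
		return -1, err
	}
	ctx, _, errno := syscall.Syscall(sysFsopen, uintptr(unsafe.Pointer(name)), fsopenCloexec, 0)
	if errno != 0 {
		return -1, fmt.Errorf("fsopen: %w", errno)
	}
	defer syscall.Close(int(ctx))
	if _, _, errno := syscall.Syscall6(sysFsconfig, ctx, fsconfigCmdCreate, 0, 0, 0, 0); errno != 0 {
		return -1, fmt.Errorf("fsconfig: %w", errno)
	}
	mnt, _, errno := syscall.Syscall(sysFsmount, ctx, fsmountCloexec, 0)
	if errno != 0 {
		return -1, fmt.Errorf("fsmount: %w", errno)
	}
	return int(mnt), nil
}

func main() {
	fd, err := detachedTmpfs()
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not create detached mount:", err)
		return
	}
	root := procFdPath(os.Getpid(), fd)
	fmt.Println("fd:", root)
	// Mirror the two mkdir calls from the transcript above.
	fmt.Println("mkdir sys:    ", os.Mkdir(root+"/sys", 0o755))
	fmt.Println("mkdir sys/foo:", os.Mkdir(root+"/sys/foo", 0o755))
}
```

Under a profile that carries the /sys deny rule, the second mkdir would be expected to fail with a permission error, matching the transcript. Without such a profile both calls should succeed.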

There is a trivial workaround for this particular sysctl:

-  deny /sys/[^fdck]*{,/**} wklx,
+  deny /sys/[^fdckn]*{,/**} wklx,

(In /etc/apparmor.d/abstractions/lxc/container-base.)
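The effect of the one-character change can be checked with a small Go sketch. The regular expressions below are a hand translation of the two AppArmor globs (assuming `*` matches any run of non-slash characters and `**` matches across slashes); they approximate the rule for illustration and are not produced by the AppArmor parser.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// deny /sys/[^fdck]*{,/**} wklx,
	oldRule = regexp.MustCompile(`^/sys/[^fdck][^/]*(/.*)?$`)
	// deny /sys/[^fdckn]*{,/**} wklx,
	newRule = regexp.MustCompile(`^/sys/[^fdckn][^/]*(/.*)?$`)
)

func main() {
	paths := []string{
		"/sys/net/ipv4/ip_unprivileged_port_start", // apparent path of the sysctl on a detached procfs
		"/sys/fs/cgroup",                           // real sysfs path, starts with an excluded letter
		"/sys/power/state",                         // real sysfs path, still denied
	}
	for _, p := range paths {
		fmt.Printf("%-42s old=%-5v new=%v\n", p, oldRule.MatchString(p), newRule.MatchString(p))
	}
}
```

Only the first path changes from denied to allowed. The same exemption applies to every top-level /sys entry beginning with `n`, which is part of why this only works for this particular sysctl.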

But this doesn't help in the general case for all sysctls. @stgraber has just submitted lxc/incus#2624, which removes these rules entirely. I think AppArmor should not match paths inside detached mounts this way, because it's incredibly broken (literally any detached mount could match against a rule by accident), but this is unfortunately how AppArmor's design works.

From runc's side, we could in theory use this to our advantage -- if we created a tmpfs with a subpath like .go-away-apparmor and then attached our procfs mount to that path, we might be able to subvert AppArmor, since the apparent path would presumably start with /.go-away-apparmor rather than /sys. However, this has a risk of causing lifetime issues that would require a rework of how we do lookups -- the tmpfs must not be closed after we attach to it, because closing it would lazy-unmount the procfs...

Originally posted by @cyphar in #12484

Labels: api/procfs (Related to the procfs API.), bug (Something isn't working), upstream/linux (Dependent on upstream kernel work.)
