Conversation
7de191e to
9589736
Compare
pkg/api/cached.go
Outdated
There was a problem hiding this comment.
We should consider what value we want for the redirect_dir option. When I was working with overlayfs for making the gadget CI bits that reset the filesystem faster, I had to set it to on, but that's not necessarily true here. https://docs.kernel.org/filesystems/overlayfs.html#redirect-dir
See https://github.com/gadget-inc/gadget/pull/11070/files#diff-37490d715785dc29d4323daef03e84518b414f5fdca4d258c9f415c936080d4f if you are curious
There was a problem hiding this comment.
TIL.
We can set it, I'm not sure how often we'll change the cache but it'll be nice to have the option.
What do you mean by
CI bits that reset the filesystem faster
As in, overlay was faster or something else?
|
@airhorns https://github.com/gadget-inc/gadget/pull/14578 CI is passing with the overlay running. I think this PR is ready for review. |
pkg/api/cached.go
Outdated
There was a problem hiding this comment.
We gonna try to avoid this extra subfolder if we can after talking IRL
| @@ -0,0 +1,31 @@ | |||
| name: Test Integration | |||
There was a problem hiding this comment.
Added this because overlay won't run on mac's so bumping the csi-tests out to be more integration sanity tests.
| return totalSize, err | ||
| } | ||
|
|
||
| func execCommand(cmdName string, args ...string) error { |
There was a problem hiding this comment.
In github actions we're not running as root, in our deployment we run as root in a privileged container so this is how we can call mount/umount during our tests. If there's a more elegant way to do this let me know.
| if err := os.MkdirAll(targetPath, targetPermissions); err != nil { | ||
| return nil, fmt.Errorf("failed to create target directory %s: %s", targetPath, err) | ||
| workdir := path.Join(volumePath, WORK_DIR) | ||
| err = os.MkdirAll(workdir, 0777) |
There was a problem hiding this comment.
This did not seem to set the dir to 0777 so I had to add the explicit chmod below.
There was a problem hiding this comment.
Terrifying -- maybe because its a mkdirall, the new perms only apply if it is creating new dirs, but it leaves existing ones alone?
|
@airhorns I made the changes, I still have some clean up to do but I think this is ready for a quick second pass. |
airhorns
left a comment
There was a problem hiding this comment.
LGTM, or definitely good enough to canary
| if err := os.MkdirAll(targetPath, targetPermissions); err != nil { | ||
| return nil, fmt.Errorf("failed to create target directory %s: %s", targetPath, err) | ||
| workdir := path.Join(volumePath, WORK_DIR) | ||
| err = os.MkdirAll(workdir, 0777) |
There was a problem hiding this comment.
Terrifying -- maybe because its a mkdirall, the new perms only apply if it is creating new dirs, but it leaves existing ones alone?
| return totalSize, err | ||
| } | ||
|
|
||
| func execCommand(cmdName string, args ...string) error { |
| "overlay", | ||
| "-n", | ||
| "--options", | ||
| fmt.Sprintf("redirect_dir=on,lowerdir=%s,upperdir=%s,workdir=%s", c.StagingPath, upperdir, workdir), |
There was a problem hiding this comment.
Did redirect_dir end up being necessary?
There was a problem hiding this comment.
Not entirely, it seems to be useful if there are changes to the lower dir that files don't disappear which seems good in general.
Fix up mount permissions Make the cache dir not writeable by the sandbox (or non-DL user) Remove wonky integration test
Fixup overlay command and rename things to be less gadget centric
Use sudo Fixup tests Use sudo if env variable set Test tweaks
7cf5d2e to
f366531
Compare
Use an overlay for dateilager's cached service. Instead of giving each sandbox pod its own hard linked copy of the
dl_cachewe will mount an overlayfs of thecached'sdl_cacheIn the pod that's mounting to overlay we can see that we can hardlink between the 2 directories and that the pod can write to the cache and corrupt it.

however on the
cachedpod the contents are still fineand additionally if we mount a second pod, it too sees the clean copy.

The pods are configured to use their ephemeral space so any writes to
dl_cachewill be cleaned up when the pod is killed.The
cachedpod stores the "golden cache" at/var/lib/kubelet/dateilager_cacheFor a simple example pod lets look at
A pod that is mounting a volume managed by the dateilager-csi will get a new ephemeral dir (like
emptyDir) that is on the host/var/lib/kubelet/pods/88064d87-6870-4da5-8ba1-d9ac62b3f3ff/volumes/kubernetes.io~csi/gadget/mountthis is thetargetPaththat comes in on thecsi.NodePublishVolumeRequestThe directory structure that we create inside of the target dir is like this
gadgetis the target mount point for the overlay.gadget_writeableis theupperdirandworkis theworkdirThe readonly
lowerdiris/var/lib/kubelet/dateilager_cachewhich containsdl_cache.