examples/analysis/timings.md (45 additions & 38 deletions)
@@ -1,12 +1,15 @@
# What did this change break?

Hopefully nothing? :D

# What is this change doing?

My goal is for taint tracking to work exactly as before, while cleaning up the ftrace/cflog/events side of the house. This unifies the `--cflog` and `--ftrace` options and simplifies how we write to the Functions, Events, Control Flow Log, and String Table sections overall, so we no longer add duplicate instrumentation to software or write duplicate data to the TDAG and/or separate files (i.e., functionid.json).

Everything that I could build got run on example inputs to make sure it worked as expected. As part of these changes we don't write to functionid.json anymore. Instead we use the space in the TDAG that we were already allocating but not filling in, since it's a humongous region we don't use all of anyway. TDAG size is fixed, but our usage of it is now slightly more efficient. A future goal could be to only mmap the space we need so the file size can be smaller.

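
The "only keep the space we need" idea above can be illustrated with a small, generic sketch. This is not PolyTracker's TDAG writer: `write_trimmed`, its parameters, and the file name are hypothetical, and the trim step is just one possible way to shrink a fixed-size mapped file once the used length is known.

```python
import mmap
import os
import tempfile


def write_trimmed(path: str, payload: bytes, reserved: int) -> int:
    """Reserve `reserved` bytes, write `payload` through mmap, then trim.

    Hypothetical helper for illustration only. Returns the final
    on-disk size in bytes.
    """
    if reserved <= 0 or len(payload) > reserved:
        raise ValueError("reserved must be positive and fit the payload")
    with open(path, "w+b") as f:
        f.truncate(reserved)  # fixed-size region, analogous to a fixed-size TDAG
        with mmap.mmap(f.fileno(), reserved) as region:
            region[: len(payload)] = payload
            region.flush()
        f.truncate(len(payload))  # assumed trim step: keep only what was used
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = os.path.join(tmp, "example.bin")
        print(write_trimmed(out, b"labels", reserved=1 << 20))
```

The mapping is closed before the second `truncate`, so the file is never resized while it is still mapped.
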
# Instrumentation Time and Resulting Bitcode Sizes
These experiments reproduce the measurements from the
but on different hardware. For uniformity, experiments were all conducted in an Ubuntu 24.04 cloud VM with
@@ -20,57 +23,61 @@ I'm comparing the before-and-after of the TDAG condensation changes on `kaoudis/
All the current example Dockerfiles on `master` that work right now are included here for completeness (we/I need to clean up the others a bit; they're a bit bitrotted). The following measurements aren't terribly scientific: they come from one run of each Dockerfile, whereas for the paper I averaged ten runs apiece.

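
For anyone who wants to repeat these measurements with averaging, as was done for the paper, a minimal harness might look like the sketch below. `time_command` is a hypothetical helper, not part of PolyTracker, and the command timed in the demo is a stand-in; for a real Docker build you would probably also want to avoid layer caching (for example with `docker build --no-cache`) so that each run does the full work.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=10):
    """Run `cmd` `runs` times and return (mean, stdev) wall-clock seconds.

    Hypothetical helper for illustration; `cmd` is an argv-style list.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    # stdev needs at least two samples
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), spread


if __name__ == "__main__":
    # Stand-in command; substitute the build or instrumentation step to measure.
    mean, spread = time_command([sys.executable, "-c", "pass"], runs=3)
    print(f"mean={mean:.3f}s stdev={spread:.3f}s")
```
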
## Bitcode sizes
The "in" .bc file is the whole-program .bc file that gets the first layer of instrumentation applied to it. The CFlog .bc is the "in" .bc with CFlog instrumentation, pre-optimization (if optimization occurs in the PolyTracker build). The final .bc file is the instrumented .bc file ending in `.instrumented.bc` that we lower to an executable. Bitcode size may have changed because the instrumentation we apply changed: I removed the separate function name recording / events pass-level code, and moved function name recording to the TDAG into the cflog pass. I also removed the separate `--ftrace` and `--taint` options: we do `--taint` by default, and `--ftrace` is part of `--cflog` now.

Also note that some Dockerfiles did not compile on the `master` branch with the `--cflog` option prior to these changes. I'm not sure why, but because of this I did not record cflog-inclusive .bc sizes for them on `master`.

As measured by `ls -lb` in the container, and normalized into MiB:

| Dockerfile | In .bc size | Final .bc BEFORE (taint, ftrace, events) | Final .bc BEFORE (cflog, taint, ftrace, events) | CFlog-_only_ .bc |Final .bc AFTER (cflog, taint) | Final .bc AFTER (taint only) |
TDAG size is fixed because of how we write TDAGs right now; it didn't change.
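
The byte-to-MiB normalization used for the bitcode sizes above can be sketched as follows. The helper names are hypothetical, and rounding to one decimal place is an assumption about presentation, not something the measurements depend on.

```python
from pathlib import Path

MIB = 1024 * 1024  # one mebibyte in bytes


def bytes_to_mib(n_bytes: int, digits: int = 1) -> float:
    """Convert a byte count (as reported by `ls -l`) to MiB."""
    return round(n_bytes / MIB, digits)


def file_size_mib(path, digits: int = 1) -> float:
    """Read a file's size from the filesystem and convert it to MiB."""
    return bytes_to_mib(Path(path).stat().st_size, digits)


if __name__ == "__main__":
    print(bytes_to_mib(1_572_864))
```
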
## Total instrumentation time

"Instrumentation time" here refers to the time Docker takes to run `polytracker instrument-targets` (or the equivalent steps), which covers placing both the cflog and taint label instrumentation as well as creating the executable.

Also note that some Dockerfiles did not compile on the `master` branch with the `--cflog` option prior to these changes. I'm not sure why, but because of this I did not record cflog-inclusive instrumentation time for them on `master`.

As measured by Docker:

| Dockerfile | Instrumentation time (taint, ftrace, events) BEFORE | Instrumentation time (cflog, taint, ftrace, events) BEFORE | Instrumentation time (cflog, taint) AFTER | Instrumentation time (taint only) AFTER |
| -- | -- | -- | -- | -- |
| Dockerfile-acropalypse.demo | 26.7\* s | | 30.3\* s | 27.3\* s |
| Dockerfile-daedalus-pdf.demo | 34.2 s | 39.1 s | 37.5 s | 35.2 s |
| Dockerfile-ffmpeg.demo | 150.7 s | | 156.5 s | 158.3 s |
| Dockerfile-file.demo | 12.1 s | | 12.4 s | 12.6 s |
| Dockerfile-libjpeg.demo | 22.7 s | | 21.2 s | 23.6 s |
| Dockerfile-mupdf.demo | 152.4 s | | 129.2 s | 154.8 s |
| Dockerfile-nitro-nitf.demo | 30 s | 33.7 s | 33.8 s | 29.5 s |
| Dockerfile-openjpeg.demo | 45.3\* s | | 51.3\* s | 49.6\* s |
| Dockerfile-poppler.demo `pdftops` | 291.2 s | 279.1 s | 290 s | 305.9 s |
| Dockerfile-poppler.demo `pdftotext` | 255.5 s | 249 s | 255.3 s | 268.5 s |
| Dockerfile-qpdf.demo | 382.9 s | | 393.8 s | 391.9 s |
| Dockerfile-xpdf.demo `pdfinfo` | 154.5 s | 141.9 s | 143.3 s | 164.2 s |
| Dockerfile-xpdf.demo `pdftops` | 206.9 s | 189.9 s | 187.2 s | 217.2 s |
| Dockerfile-xpdf.demo `pdftotext` | 169.1 s | 157.1 s | 154.4 s | 184.3 s |
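
To compare the BEFORE and AFTER columns, one option is a simple relative-change calculation. The sketch below copies three rows from the table above (taint, ftrace, events BEFORE versus taint only AFTER); `percent_change` is a hypothetical helper. Since each number is from a single run, small differences are probably within noise.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100.0


# (BEFORE taint+ftrace+events, AFTER taint only), in seconds, from the table.
TIMINGS = {
    "Dockerfile-daedalus-pdf.demo": (34.2, 35.2),
    "Dockerfile-nitro-nitf.demo": (30.0, 29.5),
    "Dockerfile-mupdf.demo": (152.4, 154.8),
}

if __name__ == "__main__":
    for name, (before, after) in TIMINGS.items():
        print(f"{name}: {percent_change(before, after):+.1f}%")
```
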
# What's weird here

The sizes of the bitcode when instrumented with all our passes, both before AND after these changes, seem like they could indicate extra instrumentation (perhaps the labels pass is instrumenting the output of the cflog and/or functions pass?), though I haven't yet dug into whether this is truly happening. It doesn't _seem like_ this is hurting anything at the moment, but I would be curious whether others notice the same.