Commit ca0beb7

ebuild-writing/bundled-deps: new section
I've tried to faithfully port the wiki page [0] to the devmanual in this commit,
and intend to change the contents as required in followups, to allow easier
comparison and to retain provenance.

[0] https://wiki.gentoo.org/wiki/Why_not_bundle_dependencies

Closes: https://bugs.gentoo.org/300625
Signed-off-by: Sam James <[email protected]>
1 parent c6a617b commit ca0beb7

2 files changed

+393
-0
lines changed

Lines changed: 392 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,392 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<devbook self="ebuild-writing/bundled-deps/">
3+
<chapter>
4+
<title>Bundled dependencies</title>
5+
<body>
6+
7+
<p>
8+
The intent of this page is to collect information on dependency bundling
9+
and static linking as a reference to point upstream developers to, instead of
10+
explaining the same thing repeatedly by e-mail.
11+
</p>
12+
</body>
13+
14+
<section>
15+
<title>When is code bundled?</title>
16+
<body>
17+
18+
<p>
19+
Say you develop and distribute a piece of software: a game, a library, anything.
20+
Now, the code is considered bundled if any of the following conditions occur:
21+
</p>
22+
23+
<ul>
24+
<li>
25+
Statically linking against a system library
26+
</li>
27+
<li>
28+
Shipping and using your own copy of a library
29+
</li>
30+
<li>
31+
Including and (unconditionally) using snippets of code copied from
32+
a library
33+
</li>
34+
</ul>
35+
36+
<p>
37+
In other words, code bundling occurs whenever a program or library ends
38+
up containing code that does not belong to it.
39+
</p>
40+
41+
</body>
42+
</section>
43+
44+
<section>
45+
<title>Temptations</title>
46+
<body>
47+
48+
<p>
49+
There are reasons why bundling dependencies and using static linking occurs;
50+
there are certain benefits to it. So why is it tempting to do such a thing?
51+
</p>
52+
53+
</body>
54+
55+
<subsection>
56+
<title>Comforting non-Linux users</title>
57+
<body>
58+
59+
<p>
60+
Especially on Windows, shipping dependencies <e>can</e> be a favour to users,
61+
saving them from having to manually install dependencies or additional
62+
libraries. Without a package manager, there is no real solution to that on
63+
Windows anyway.
64+
</p>
65+
66+
<p>
67+
It is tempting, when bundling code on Windows, to bundle on GNU/Linux too.
68+
It feels consistent and fits together nicely in the mind of the software
69+
author.
70+
</p>
71+
72+
</body>
73+
</subsection>
74+
75+
<subsection>
76+
<title>Easing up adoption despite odd dependencies</title>
77+
<body>
78+
79+
<p>
80+
If a software package <e>P</e> has some dependency <e>D</e> that is not yet
81+
packaged for major distributions, <e>D</e> makes it harder for <e>P</e> to
82+
get in, as packaging <e>P</e> forces the new maintainer to package <e>D</e>
83+
themselves or to wait for someone else to package it.
84+
</p>
85+
86+
<p>
87+
Bundling <e>D</e> hides the dependency on <e>D</e> in a way: if the packager
88+
is not paying close attention, <e>P</e> may even get in with the
89+
bundled dependency. (It is, however, only a matter of time until someone
90+
notices the bundling.)
91+
</p>
92+
93+
</body>
94+
</subsection>
95+
96+
<subsection>
97+
<title>Private forks</title>
98+
<body>
99+
100+
<p>
101+
If <e>P</e> uses a library <e>D</e>, the developers of <e>P</e> may wish
102+
to make some changes to <e>D</e>, for example to add a new feature, modify
103+
the API, or change the default behavior. If the developers of <e>D</e>
104+
for whatever reason are opposed to these changes, the developers of
105+
<e>P</e> may want to fork <e>D</e>.
106+
</p>
107+
108+
<p>
109+
But publishing and properly maintaining a fork takes time and effort, so
110+
the developers of <e>P</e> could be tempted to take the easy road, bundle
111+
their patched version of <e>D</e> with <e>P</e>, and maybe occasionally
112+
update it for upstream <e>D</e> changes.
113+
</p>
114+
</body>
115+
</subsection>
116+
</section>
117+
118+
<section>
119+
<title>Problems</title>
120+
<body>
121+
122+
<p>
123+
So why is bundling dependencies and static linking bad after all?
124+
</p>
125+
</body>
126+
127+
<subsection>
128+
<title>Security implications</title>
129+
<body>
130+
131+
<p>
132+
Suppose you are a developer of <e>foo</e>, and <e>foo</e> uses
133+
<e>libbar</e>.
134+
</p>
135+
136+
<p>
137+
Now, a very important security flaw has been found in <e>libbar</e>
138+
(say, remote privilege escalation). The problem is serious enough that the
139+
developers of <e>libbar</e> release a fixed version right away, and distributions
140+
package it quickly to minimise the chance of a break-in on users'
141+
systems.
142+
</p>
143+
144+
<p>
145+
If a particular distribution has an efficient security upgrade system, the
146+
patched library can get there in less than 24 hours. But that would be of
147+
no use to <e>foo</e> users, who will still be using the earlier, vulnerable library.
148+
</p>
149+
150+
<p>
151+
Now, depending on how bad things are:
152+
</p>
153+
154+
<ul>
155+
<li>
156+
If <e>foo</e> statically linked against <e>libbar</e>, then the users would
157+
either have to rebuild <e>foo</e> themselves to make it use the fixed library
158+
or distribution developers would have to make a new package for <e>foo</e> and
159+
make sure it gets to user systems along with <e>libbar</e> (assuming they
160+
are aware that the package is statically linked)
161+
</li>
162+
<li>
163+
If <e>foo</e> bundled a local copy of <e>libbar</e>, then they would have to wait
164+
till you discover the vulnerability, update <e>libbar</e> sources, release
165+
the new version and distributions package the new version
166+
</li>
167+
</ul>
168+
169+
<p>
170+
In the meantime, users probably won't even know they are running a vulnerable
171+
application, simply because they won't know there's a vulnerable library
172+
statically linked into the executables.
173+
</p>
174+
175+
<p>
176+
Examples:
177+
</p>
178+
179+
<ul>
180+
<li>
181+
<uri link="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-3074">
182+
CVE-2016-3074</uri> had to be
183+
<uri link="https://bugs.php.net/bug.php?id=71912">fixed in PHP</uri>
184+
(where libgd is bundled) after it was
185+
<uri link="https://github.com/libgd/libgd/commit/2bb97f407c1145c850416a3bfbcc8cf124e68a19">
186+
fixed in libgd</uri> (upstream)
187+
</li>
188+
</ul>
189+
</body>
190+
</subsection>
191+
192+
<subsection>
193+
<title>Waste of hardware resources</title>
194+
<body>
195+
196+
<p>
197+
Say a media player bundles the libvorbis library. If libvorbis is also
198+
installed system-wide, this means that the two copies of libvorbis:
199+
</p>
200+
201+
<ol>
202+
<li>
203+
occupy twice as much space on disk
204+
</li>
205+
<li>
206+
occupy (up to) twice as much RAM (of the page cache)
207+
</li>
208+
</ol>
209+
</body>
210+
</subsection>
211+
212+
<subsection>
213+
<title>Waste of development time downstream</title>
214+
<body>
215+
216+
<p>
217+
Due to the
218+
<uri link="::ebuild-writing/bundled-deps/#Downstream consequences">
219+
consequences</uri> of bundled dependencies, many hours of downstream developer
220+
time are wasted that could have been put to more useful work.
221+
</p>
222+
</body>
223+
</subsection>
224+
225+
<subsection>
226+
<title>Potential for symbol collisions</title>
227+
<body>
228+
229+
<p>
230+
If a program <e>P</e> uses a system-installed library <e>A</e> and also uses
231+
another library <e>B</e> which bundles library <e>A</e>, there is a potential
232+
for symbol collisions.
233+
</p>
234+
235+
<p>
236+
This means that <e>P</e> might use an interface, such as <e>my_function()</e>
237+
and that the <e>my_function()</e> symbol would be present in both <e>A</e>
238+
and the version of <e>A</e> bundled inside of library <e>B</e>.
239+
</p>
240+
241+
<p>
242+
If the system-installed copy of <e>A</e> and the copy of <e>A</e> compiled
243+
into library <e>B</e> are from different releases of library <e>A</e>, then
244+
the interface <e>my_function()</e> might behave differently
245+
in each copy of <e>A</e>.
246+
</p>
247+
248+
<p>
249+
The program <e>P</e> was compiled against the system-installed copy of
250+
<e>A</e>. For various reasons, however, <e>P</e> may end up using the
251+
<e>my_function()</e> interface from the version of <e>A</e> bundled in
252+
library <e>B</e> instead of the interface in the system-installed copy.
253+
</p>
254+
255+
<p>
256+
This can potentially result in crashes or strange unpredictable behavior.
257+
</p>
258+
259+
<p>
260+
This sort of problem can be prevented if library <e>B</e> uses symbol
261+
visibility tricks when it links against library <e>A</e>, which would cause
262+
library <e>B</e> not to export library <e>A</e>'s interfaces.
263+
</p>
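
<p>
As a rough illustration of the collision and of the visibility fix, the toy
model below treats each library as a set of exported symbol names and resolves
a symbol to the first library in the search order that exports it. This is a
simplified sketch, not a description of how any real dynamic linker is
implemented; the search order and the <e>b_function</e> symbol are assumptions
made up for the example.
</p>

```python
"""Toy model of first-match symbol lookup across loaded libraries.

Illustration only: real dynamic linkers are far more involved.
The names P, A, B and my_function mirror the surrounding text;
b_function and the search order are hypothetical.
"""


def resolve(symbol, load_order, exports):
    """Return the name of the first library in load_order exporting symbol."""
    for library in load_order:
        if symbol in exports[library]:
            return library
    raise LookupError(f"undefined symbol: {symbol}")


# B bundles its own copy of A and also exports A's interface.
exports_leaky = {
    "B": {"b_function", "my_function"},
    "A": {"my_function"},
}

# Same setup, but B hides the bundled copy's symbols from its exports.
exports_hidden = {
    "B": {"b_function"},
    "A": {"my_function"},
}

# Assumption: P's dependencies happen to be searched in this order.
load_order = ["B", "A"]

leaky_provider = resolve("my_function", load_order, exports_leaky)
hidden_provider = resolve("my_function", load_order, exports_hidden)
print(leaky_provider)   # B: P silently gets the bundled copy of A
print(hidden_provider)  # A: P gets the system-installed copy
```

<p>
In the first case <e>P</e> ends up with the implementation bundled in
<e>B</e> even though it was built against the system copy; once <e>B</e> stops
exporting the bundled symbols, lookup falls through to <e>A</e> as intended.
</p>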
264+
265+
<p>
266+
Examples:
267+
</p>
268+
269+
<ul>
270+
<li>
271+
libmagic bundled with PHP (<uri link="https://bugs.gentoo.org/471682">Gentoo
272+
bug 471682</uri>, <uri link="https://bugs.php.net/bug.php?id=66095">
273+
PHP bug 66095</uri>)
274+
</li>
275+
</ul>
276+
</body>
277+
</subsection>
278+
</section>
279+
280+
<section>
281+
<title>Downstream consequences</title>
282+
<body>
283+
284+
<p>
285+
When a bundled dependency is discovered downstream, this has a number of
286+
bad consequences.
287+
</p>
288+
289+
</body>
290+
291+
<subsection>
292+
<title>Analysis</title>
293+
<body>
294+
295+
<p>
296+
So there is a copy of libvorbis bundled with that media player. Which
297+
version is it? Has it been modified?
298+
</p>
299+
</body>
300+
301+
<subsubsection>
302+
<title>Separating forks from copies</title>
303+
<body>
304+
305+
<p>
306+
Before the bundled dependency can be replaced by the system-installed
307+
one, we need to know if it has been modified: we have to know if it's a fork.
308+
</p>
309+
310+
<p>
311+
If it is a fork, it may or may not be possible to replace it without breaking something.
312+
</p>
313+
314+
<p>
315+
That's something to find out: more time wasted. If the code says which
316+
version it is, we at least know what to run <c>diff</c> against, but that
317+
is not always the case.
318+
</p>
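
<p>
When the version is known, the comparison can be scripted. The following is a
minimal sketch that assumes the bundled copy and a pristine release of the same
version have both been unpacked into directories; the function names and the
shape of the report are illustrative choices, not part of any existing tool.
</p>

```python
import filecmp
from pathlib import Path


def relative_files(root):
    """Return the set of file paths below root, relative to root."""
    root = Path(root)
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def compare_trees(bundled, pristine):
    """Classify files of a bundled copy against a pristine release tree."""
    bundled, pristine = Path(bundled), Path(pristine)
    ours, theirs = relative_files(bundled), relative_files(pristine)
    modified = sorted(
        rel.as_posix()
        for rel in ours.intersection(theirs)
        # shallow=False forces a byte-for-byte comparison of the contents.
        if not filecmp.cmp(bundled / rel, pristine / rel, shallow=False)
    )
    return {
        "modified": modified,
        "added": sorted(rel.as_posix() for rel in ours - theirs),
        "missing": sorted(rel.as_posix() for rel in theirs - ours),
    }
```

<p>
An empty <c>modified</c> and <c>added</c> list suggests an unmodified copy;
anything listed there is a candidate for closer inspection with <c>diff</c>.
Files that are merely <c>missing</c> are often documentation or tests that
were stripped when bundling, but that is an assumption to verify case by case.
</p>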
319+
</body>
320+
</subsubsection>
321+
322+
<subsubsection>
323+
<title>Determining versions</title>
324+
<body>
325+
326+
<p>
327+
If a bundled dependency doesn't state its version, we may have to find out
327+
ourselves. Mailing upstream could work; comparing against the contents of a
328+
number of release tarballs may work too. Lots of opportunities to waste time.
330+
</p>
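
<p>
The tarball comparison can be partly automated. The sketch below assumes a
handful of candidate releases have already been unpacked and ranks them by how
many files of the bundled copy they reproduce byte for byte; the version labels
and directory layout are hypothetical, and a high score is only a hint, not
proof of the exact version.
</p>

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's relative path to the SHA-256 digest of its content."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def rank_candidates(bundled, candidates):
    """Rank unpacked releases by how many bundled files they match exactly.

    candidates maps a version label to the directory of that unpacked release.
    Returns (label, identical_file_count) pairs, best match first.
    """
    ours = tree_digests(bundled)
    scores = []
    for label, directory in candidates.items():
        theirs = tree_digests(directory)
        identical = sum(
            1 for rel, digest in ours.items() if theirs.get(rel) == digest
        )
        scores.append((label, identical))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

<p>
The best-ranked release is then the natural baseline to run <c>diff</c>
against.
</p>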
331+
</body>
332+
</subsubsection>
333+
</subsection>
334+
335+
<subsection>
336+
<title>Patching</title>
337+
<body>
338+
339+
<p>
340+
Once it is clear that a bundled dependency can be ripped out, a patch is
341+
written, applied and tested (more waste of time). If upstream is willing to
342+
co-operate, the patch may be dropped later. If not, the patch will need
342+
porting to each new version downstream.
344+
</p>
345+
</body>
346+
</subsection>
347+
348+
<subsection>
349+
<title>What to do upstream</title>
350+
<body>
351+
352+
<ul>
353+
<li>
354+
<p>
355+
Remove bundled dependency:
356+
</p>
357+
<p>
358+
Ideally, remove the bundled dependency and allow compilation against
359+
dependency <e>D</e> from either a system-wide installation of it or a
360+
local one at any user-defined location.
361+
</p>
362+
<p>
363+
That gives flexibility to users on systems without <e>D</e> packaged and makes
364+
it easy to compile against the system copy downstream: cool!
365+
</p>
366+
</li>
367+
<li>
368+
<p>
369+
Keep bundled dependency: make usage <e>completely optional</e>:
370+
</p>
371+
<p>
372+
With a build-time option to disable use of the bundled dependency, it is
373+
possible to bypass it downstream without patching: nice!
374+
</p>
375+
<p>
376+
When keeping dependency <e>D</e> bundled, make sure to follow the upstream of
376+
<e>D</e> closely and update your copy to a recent version of <e>D</e> on every
377+
minor (and major) release, to at least reduce the damage done to people
378+
using your bundled version.
380+
</p>
381+
<p>
382+
Also: clearly document whether a bundled dependency is a fork or an unmodified
383+
copy and which version of the bundled software we are dealing with.
384+
</p>
385+
</li>
386+
</ul>
387+
</body>
388+
</subsection>
389+
390+
</section>
391+
</chapter>
392+
</devbook>

ebuild-writing/text.xml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,4 +31,5 @@ with some general notes and extended examples.
3131
<include href="misc-files/"/>
3232
<include href="user-submitted/"/>
3333
<include href="common-mistakes/"/>
34+
<include href="bundled-deps/"/>
3435
</devbook>
