HTTPCLIENT-1843 Plug Commons-Compress into HttpClient’s automatic#651
Conversation
|
@ok2c @garydgregory — could you please review this PR when you have a moment? I’ve closed the earlier one because I opted for a cleaner approach here. |
There was a problem hiding this comment.
@arturobernalg It is much better and can basically be merged as it, but it can be made better still.
CommonsCompressDecoderFactoryuses reflection. I does not have to. It is perfectly OK to have Commons Compress as a optional compile dependency. what is important that there is no direct dependency on Commons Compress inContentDecoderRegistry, What#commonsCompressPresentinContentDecoderRegistrydoes perfectly fine and sufficient.CommonsCompressDecoderFactoryimport Commons Compress classes directly.REGISTRYinContentDecoderRegistryshould be unmodifiable. It is only a matter of tweaking#buildmethod a little. This would make#snapshotunnecessary. Better yet you should probably useorg.apache.hc.core5.http.config.Registryinstead of a planMap
garydgregory
left a comment
There was a problem hiding this comment.
TY @arturobernalg !
I agree with @ok2c comments, especially regarding reflection. I've made a few low-level comments throughout.
| * Returns a <em>defensive copy</em> of the internal registry | ||
| * (key = canonical encoding token, value = {@link InputStreamFactory}). | ||
| */ | ||
| public static Map<String, InputStreamFactory> snapshot() { |
There was a problem hiding this comment.
snapshot() is a bad method name here IMO, rename to getRegistry().
Saying this is a defensive copy is misleading because InputStreamFactory makes no thread-safety guarantees. I would say this is a "shallow copy" which is technically correct and doesn't imply intent of attack or defense for call sites.
| false, | ||
| ContentDecoderRegistry.class.getClassLoader()); | ||
| return true; | ||
| } catch (final ClassNotFoundException e) { |
|
|
||
| final List<String> list = new ArrayList<>(4); | ||
| list.add("gzip"); | ||
| list.add("x-gzip"); |
There was a problem hiding this comment.
Maybe a later PR could gather all these magic strings ("x-gzip" and so on) into a class (existing or new).
|
|
||
| /* 2. Optional Commons-Compress decoders (reflection, safe) */ | ||
| if (commonsCompressPresent()) { | ||
| addCommons(m, "br"); |
There was a problem hiding this comment.
See my other comment about magic strings, for a later PR.
| Class.forName(cn, false, | ||
| CommonsCompressDecoderFactory.class.getClassLoader()); | ||
| return true; | ||
| } catch (final ClassNotFoundException e) { |
95f58c5 to
7ea87d5
Compare
|
Please @ok2c @garydgregory do another pass. |
garydgregory
left a comment
There was a problem hiding this comment.
@arturobernalg
Look good. I
have some
comment
scattered
also, why
not use longer
lines?
I'm not a fan
of the
newspaper format
;-)
| false, | ||
| ContentDecoderRegistry.class.getClassLoader()); | ||
| return true; | ||
| } catch (final ClassNotFoundException e) { |
| } | ||
| } | ||
|
|
||
| private enum Probe { |
There was a problem hiding this comment.
This feels a bit redundant with the new ContentCoding enum. What do you think about moving the "requires this type name" aspect (a.k.a. "probe class") of Probe to ContentCoding and turning this "Probe" into a Set<ContentCoding>?
|
|
||
| final String enc, probeClass; | ||
|
|
||
| Probe(final String enc, final String probeClass) { |
There was a problem hiding this comment.
probeClass is misleading to me because it's a name, not an actual class. I think "requiredTypeName" or "requiredClassName" would be a clearer name because "probe" doesn't really tell you anything (to me).
| this.probeClass = probeClass; | ||
| } | ||
|
|
||
| static String helperFor(final String enc) { |
There was a problem hiding this comment.
I would stay away from cryptic names like "enc", even in private code, if it's an "encoding", the call it that ;-)
|
|
||
| /** | ||
| * {@link InputStreamFactory} backed by | ||
| * <a href="https://commons.apache.org/proper/commons-compress/">Apache Commons Compress</a>. |
There was a problem hiding this comment.
If there's a non-breaking space between words 1 and 2, why not between words 2 and 3? I would just not bother since it might cause odd rendering on small screens like phones and tablets.
| * {@link InputStreamFactory} backed by | ||
| * <a href="https://commons.apache.org/proper/commons-compress/">Apache Commons Compress</a>. | ||
| * <p> | ||
| * The class is compiled against Commons Compress but lives behind an <i>optional</i> |
There was a problem hiding this comment.
I would not use <i> in Javadocs, that's up to the rederer to decide what style to apply, which is why Javadoc supports <em> for ephasis which is usually translated to italics and <strong> which is usually translated to bold.
| * </p> | ||
| * | ||
| * <h4>Run-time guards</h4> | ||
| * Some encodings (e.g. {@code br}, {@code zstd}, {@code xz/lzma}) |
There was a problem hiding this comment.
In general, I stay away from Latin abbreviations, I find it doesn't make the text clearer to read. No need for non-breaking space IMO.
| * <a href="https://commons.apache.org/proper/commons-compress/">Apache Commons Compress</a>. | ||
| * <p> | ||
| * The class is compiled against Commons Compress but lives behind an <i>optional</i> | ||
| * dependency. At run-time it will only be loaded when the library is present, |
There was a problem hiding this comment.
You don't need two spaces after a .; we're not using typewriters anymore and Javadoc is not rendered in mono-space ;-)
404fb79 to
5889705
Compare
content-decoding (optional) * New ContentDecoderRegistry discovers extra codecs (br, zstd, xz, lz4, …) via Commons-Compress when that jar is on the class-path; otherwise falls back to the built-ins (gzip, deflate) only. * No hard dependency added—projects that need the extra algorithms just add `commons-compress` (and helper jars like google-brotli, zstd-jni, xz-java) to their pom and HttpClient uses them automatically.
fff3ab4 to
f566906
Compare
commons-compress(and helper jars like google-brotli, zstd-jni, xz-java) to their pom and HttpClient uses them automatically.