HADOOP-19785. mvn site fails in JDK17#8182
Conversation
Hi @zhtttylz, you implemented a new custom Doclet to support JDK 17 in #8038. Unfortunately, it seems our custom doclet implementation is prohibited after https://bugs.openjdk.org/browse/JDK-8253736.
Now I feel it's becoming really hard to maintain Hadoop's custom Doclets, and therefore I would like to drop the custom implementation. The primary change is that we are going to build Hadoop JavaDoc with the standard doclet. @slfan1989 @cnauroth @zhtttylz What do you think?
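For context: the supported entry point for custom doclets since JDK 9 is the jdk.javadoc.doclet API, and extending StandardDoclet still compiles. What JDK-8253736 encapsulated is the internal javadoc machinery (DocEnvImpl and friends) that annotation-based element filtering had to hook into. A minimal sketch of the supported surface (class name invented for illustration):

```java
import jdk.javadoc.doclet.StandardDoclet;

// Hypothetical skeleton: the supported API lets you extend StandardDoclet,
// but filtering elements by @InterfaceAudience required the internal
// DocEnvImpl, which JDK-8253736 closed off.
public class HadoopExcludeDoclet extends StandardDoclet {
    @Override
    public String getName() {
        // Name reported to the javadoc tool when passed via -doclet.
        return "HadoopExcludeDoclet";
    }

    public static void main(String[] args) {
        System.out.println(new HadoopExcludeDoclet().getName()); // prints HadoopExcludeDoclet
    }
}
```

The skeleton compiles against the standard jdk.javadoc module; the missing piece is exactly the filtering hook the discussion is about.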
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${project.version}</version>
Added because Hadoop Dynamometer production classes depend on HDFS test jar
@aajisaka Thanks for the detailed analysis — after reading through it, I fully agree that the cost of maintaining this custom Doclet has become unreasonably high. With OpenJDK continuing to clean up internal APIs (the trend starting from JDK-8253736 is only getting stronger), future compatibility will only get worse, and the next LTS might break it completely.
Thank you @slfan1989. I'll update the patch to use the standard doclet.
Force-pushed f1e05a4 to 5111050
cnauroth left a comment
@aajisaka , thank you for bringing this up. I'm not sure why we had a successful site build in the pre-commits for #8038 . I guess it wasn't really triggering a full site build.
I tried building the site with the current patch, and it has a pretty drastic impact on the user experience. The current Hadoop 3.4.2 API docs show ~13,000 classes. Without the visibility annotation filtering, it goes up to ~126,000 classes. A lot of it is filled with stuff like auto-generated Protobuf classes for internal protocols, which aren't particularly helpful for either end users or Hadoop contributors.
It's really unfortunate that there doesn't seem to be any standard solution for this, and the JDK has closed the door on the only available option. The only other thing I can think of is to manually manage include/exclude rules in the maven-javadoc-plugin configuration, but that would be a huge maintenance burden on us.
I'm going to start an email thread on common-dev@ to make sure this is visible to people not following this PR.
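For a sense of what that manual maintenance would look like, here is a hedged sketch of maven-javadoc-plugin exclusion config; the package patterns are purely illustrative, not a real Hadoop list:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-javadoc-plugin</artifactId>
  <configuration>
    <!-- Illustrative patterns only; every generated/internal package
         would have to be enumerated and kept in sync by hand. -->
    <excludePackageNames>*.proto:*.protobuf:org.apache.hadoop.*.internal.*</excludePackageNames>
  </configuration>
</plugin>
```

Package-name patterns are colon-separated and wildcard-based, which is far coarser than the per-class @InterfaceAudience filtering the custom doclet provided.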
Late chime-in: the number sounds like a huge impact. Has anyone considered migrating to japicmp? It looks like a modern and popular solution for binary compatibility checks; I see that at least the Parquet project is using it. Specifically, the docs say it supports "exclusion based on annotations".
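If the annotation-based exclusion mentioned above holds up, a japicmp-maven-plugin configuration could look roughly like this (a sketch, assuming japicmp's convention that an exclude starting with "@" is an annotation filter; the annotation names are Hadoop's audience markers):

```xml
<plugin>
  <groupId>com.github.siom79.japicmp</groupId>
  <artifactId>japicmp-maven-plugin</artifactId>
  <configuration>
    <parameter>
      <!-- Skip classes carrying Hadoop's non-public audience annotations. -->
      <excludes>
        <exclude>@org.apache.hadoop.classification.InterfaceAudience$Private</exclude>
        <exclude>@org.apache.hadoop.classification.InterfaceAudience$LimitedPrivate</exclude>
      </excludes>
    </parameter>
  </configuration>
</plugin>
```

Note this would only cover the compatibility-check (jdiff) side, not javadoc generation.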
Thank you @pan3793. There are two discussion points: jdiff and javadoc. I think jdiff can be covered by japicmp, but javadoc cannot. Chris has already started the thread on common-dev@ for the doc side.
@aajisaka, thanks for the clarification. I re-read the conversation carefully and understand the situation now.
Specific to this issue, it looks like
Agreed, but given that the impact is huge, and assuming Hadoop will use Java 17 as its baseline for several years, it's still valuable if the problem can be solved by a few hundred lines of hacky code.
You are right. I successfully created HadoopDocEnvImpl extending DocEnvImpl, and also cleaned up the invocations.
...adoop-annotations/src/main/java/org/apache/hadoop/classification/tools/HadoopDocEnvImpl.java
cnauroth left a comment
@aajisaka and @zhtttylz , I tried building the site with the changes in this PR. I think something is going wrong with the handling of @Public + @Evolving. Public evolving classes are included in the JavaDocs for published releases today, but I think they're missing now. Example classes: hadoop-common CompositeService and YARN TimelineClient.
I don't know if this problem was introduced by changes in this PR or in #8038.
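The expected rule here is that a class annotated @Public should be documented regardless of its stability marker, so @Public + @Evolving must not be filtered out. The rule can be sketched with plain reflection; the annotations below are local stand-ins for Hadoop's InterfaceAudience/InterfaceStability markers (not the real classes), and the class names only echo the examples above:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.List;
import java.util.stream.Collectors;

public class VisibilityFilterDemo {
    // Stand-ins for org.apache.hadoop.classification annotations.
    @Retention(RetentionPolicy.RUNTIME) public @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) public @interface Private {}
    @Retention(RetentionPolicy.RUNTIME) public @interface Evolving {}

    @Public @Evolving public static class CompositeServiceLike {}  // should be documented
    @Private public static class InternalProtocolLike {}           // should be skipped

    public static boolean shouldDocument(Class<?> c) {
        // Audience is the gate; stability (@Evolving, @Stable, ...)
        // must not exclude a class on its own.
        return c.isAnnotationPresent(Public.class);
    }

    public static void main(String[] args) {
        List<String> documented =
                List.of(CompositeServiceLike.class, InternalProtocolLike.class)
                        .stream()
                        .filter(VisibilityFilterDemo::shouldDocument)
                        .map(Class::getSimpleName)
                        .collect(Collectors.toList());
        System.out.println(documented); // prints [CompositeServiceLike]
    }
}
```

A bug that keys the filter on the stability annotation instead of the audience annotation would reproduce exactly the missing-@Evolving symptom described above.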
Thank you @cnauroth for your review. Updated the default stability level to include @Evolving.
pan3793 left a comment
LGTM, though I guess it may need small tweaks to pass CI.
</activation>
<properties>
<javac.version>17</javac.version>
<maven.compiler.release>${javac.version}</maven.compiler.release>
I think we want to keep this, so that we can use a higher version of JDK to build Hadoop but produce Java 17 compatible bytecode. If this fails on some modules, can we just disable it on those specific modules?
BTW, the activation condition of the jdk17+ profile is always true; we can remove it and move the content to the top level.
I think we want to keep this, so that we can use a higher version of JDK to build Hadoop, but produce Java 17 compatible bytecode.
Agreed. Removed the jdk17+ profile and moved the release flag into the maven-compiler-plugin config, so that it can be overridden by the hadoop-annotations module.
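A hedged sketch of what that arrangement could look like in the poms (placement and comments are illustrative; note that in a real child pom the inherited release element also has to be cleared for the override to take effect, since Maven merges plugin configuration):

```xml
<!-- Parent pom: emit Java 17 bytecode even when building on a newer JDK. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <release>17</release>
  </configuration>
</plugin>

<!-- hadoop-annotations pom: use -source/-target instead of -release,
     because -release hides the JDK-internal APIs this module needs. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>17</source>
    <target>17</target>
  </configuration>
</plugin>
```

The trade-off is that -source/-target compiles against the current JDK's class library, so only the one module that genuinely needs internal APIs should opt out of -release.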
pom.xml (outdated)
<!-- maven plugin versions -->
<maven-deploy-plugin.version>2.8.1</maven-deploy-plugin.version>
<maven-site-plugin.version>3.9.1</maven-site-plugin.version>   <!-- removed -->
<maven-site-plugin.version>3.21.0</maven-site-plugin.version>  <!-- added -->
I've been testing dry run releases of 3.5.0 from trunk with this patch applied. It generates the JavaDocs correctly, but then the JavaDocs don't show up in the final release of the site tarball. This looks similar to symptoms described in #5319, where we worked around it by downgrading maven-site-plugin.
Here are some more details on this. When I build on branch-3.4 (without the plugin upgrade), the API docs land here:
target/staging/hadoop-project/api
On trunk, with this PR applied, they land here instead:
target/staging/apidocs
Then, the release script assumes it can find everything under the hadoop-project sub-directory:
https://github.com/apache/hadoop/blob/trunk/dev-support/bin/create-release#L624-L628
Thank you @cnauroth for the details.
Downgraded maven-site-plugin and maven-javadoc-plugin. maven-javadoc-plugin >= 3.10.0 cannot configure the output directory, per apache/maven-javadoc-plugin#1305.
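In pom terms, the pins could look roughly like this; the maven-site-plugin value matches the one discussed in the diff, while 3.6.3 is merely an illustrative pre-3.10.0 javadoc-plugin version, not necessarily the one the patch chose:

```xml
<properties>
  <!-- Newer site-plugin versions relocate staged API docs (see #5319). -->
  <maven-site-plugin.version>3.9.1</maven-site-plugin.version>
  <!-- Must stay below 3.10.0: later versions cannot set the javadoc
       output directory (apache/maven-javadoc-plugin#1305). -->
  <maven-javadoc-plugin.version>3.6.3</maven-javadoc-plugin.version>
</properties>
```

Pinning both in one place keeps the create-release script's target/staging/hadoop-project layout assumption intact.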
Force-pushed ce4ec5c to fa3894c
💔 -1 overall
This message was automatically generated.
Now the patch is ready for review.
Merged. Thank you @slfan1989, @pan3793, and @cnauroth!
Description of PR
Fix the following error while running mvn site with JDK 17. JIRA: HADOOP-19785
- Use the --source 17 and --target 17 flags instead of the --release 17 flag to access JDK internal APIs in the hadoop-annotations module
How was this patch tested?
Ran the following command and verified the above error has been fixed in JDK 17/21/25:
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?
AI Tooling
Claude Code with Claude Sonnet 4.5 was used.