
HADOOP-19785. mvn site fails in JDK17#8182

Merged
aajisaka merged 12 commits into apache:trunk from aajisaka:fix-mvn-site-2
Feb 3, 2026

Conversation

@aajisaka
Member

@aajisaka aajisaka commented Jan 15, 2026

Description of PR

Fix the following errors when running mvn site with JDK 17

JIRA: HADOOP-19785

Error:  /home/runner/work/hadoop/hadoop/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSWebApp.java:26: error: cannot find symbol
Error:  import com.codahale.metrics.JmxReporter;
Error:                             ^
Error:    symbol:   class JmxReporter
Error:    location: package com.codahale.metrics
Error:  /home/runner/work/hadoop/hadoop/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSWebApp.java:69: error: cannot find symbol
Error:    private JmxReporter jmxReporter;
Error:            ^
Error:    symbol:   class JmxReporter
Error:    location: class KMSWebApp
Error:  /home/runner/work/hadoop/hadoop/hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra/src/main/java/org/apache/hadoop/tools/dynamometer/SimulatedDataNodes.java:36: error: cannot find symbol
Error:  import org.apache.hadoop.hdfs.MiniDFSCluster;
Error:                               ^
Error:    symbol:   class MiniDFSCluster
Error:    location: package org.apache.hadoop.hdfs
Error:  /home/runner/work/hadoop/hadoop/hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra/src/main/java/org/apache/hadoop/tools/dynamometer/SimulatedDataNodes.java:41: error: cannot find symbol
Error:  import org.apache.hadoop.hdfs.server.datanode.DataNodeTestUtils;
Error:                                               ^
Error:    symbol:   class DataNodeTestUtils
Error:    location: package org.apache.hadoop.hdfs.server.datanode
Error:  /home/runner/work/hadoop/hadoop/hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra/src/main/java/org/apache/hadoop/tools/dynamometer/SimulatedDataNodes.java:42: error: cannot find symbol
Error:  import org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset;
Error:                                               ^
Error:    symbol:   class SimulatedFSDataset
Error:    location: package org.apache.hadoop.hdfs.server.datanode
Error: [ERROR] 5 errors
  • Upgraded the Maven javadoc plugin
  • Added the hadoop-hdfs test jar dependency because hadoop-dynamometer depends on it
  • Updated Hadoop's custom doclet to support JDK 17/21/25
  • Fixed Javadoc errors
  • Used the --source 17 and --target 17 flags instead of the --release 17 flag to access JDK internal APIs in the hadoop-annotations module
  • Updated checkstyle plugin versions to support JDK 17 idioms
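The --source/--target point in the list above reflects a javac subtlety: --release restricts compilation to the documented API of the target release and rejects exporting JDK-internal packages, while the older -source/-target pair permits it. A minimal sketch (not the exact PR diff; the exported package below is an illustrative assumption) of what a per-module maven-compiler-plugin override could look like:

```xml
<!-- Sketch: override the inherited compiler config in one module so that
     source/target is used instead of release, which allows exporting
     JDK-internal packages to the compilation. The package name below is
     an illustrative assumption, not the exact one used by the PR. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>17</source>
    <target>17</target>
    <compilerArgs>
      <arg>--add-exports=jdk.javadoc/jdk.javadoc.internal.tool=ALL-UNNAMED</arg>
    </compilerArgs>
  </configuration>
</plugin>
```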

How was this patch tested?

Ran the following commands and verified that the above errors are fixed on JDK 17/21/25:

mvn clean install -DskipTests -DskipShade
mvn site
mvn site:stage -DstagingDirectory=<doc staging directory>

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

AI Tooling

Claude Code with Claude Sonnet 4.5 was used

@aajisaka
Member Author

Hi @zhtttylz, you implemented a new custom Doclet to support JDK 17 in #8038; unfortunately, mvn site now fails in the Doclet part. The details are below:

  1. RootDocProcessor.process(env) creates a Proxy object that implements DocletEnvironment interface
  2. This proxy is passed to StandardDoclet.run(filtered)
  3. Inside the StandardDoclet, the javadoc internals (specifically the WorkArounds constructor) try to cast this proxy to the concrete implementation class jdk.javadoc.internal.tool.DocEnvImpl: https://github.com/jonathan-gibbons/jdk/blob/bc1e60c1bf91678ef18652a00aa2ce55b0446caa/src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/toolkit/WorkArounds.java#L112
  4. The cast fails because a proxy object cannot be cast to a concrete implementation class
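The failure mode in steps 1-4 can be reproduced in miniature with plain JDK reflection. The types below are hypothetical stand-ins for DocletEnvironment and DocEnvImpl, not the real javadoc classes:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical stand-ins: "Env" plays the role of DocletEnvironment,
// "EnvImpl" the role of jdk.javadoc.internal.tool.DocEnvImpl.
interface Env {
    String name();
}

class EnvImpl implements Env {
    @Override
    public String name() { return "concrete"; }
}

public class ProxyCastDemo {

    /** Wraps an Env in a dynamic proxy, as RootDocProcessor.process does
     *  for DocletEnvironment. The proxy implements only the interface. */
    static Env wrap(Env real) {
        InvocationHandler h = (proxy, method, args) -> method.invoke(real, args);
        return (Env) Proxy.newProxyInstance(
                Env.class.getClassLoader(), new Class<?>[] {Env.class}, h);
    }

    /** Returns true iff casting the proxy to the concrete class throws,
     *  which mirrors the failure inside WorkArounds. */
    static boolean castFails() {
        Env filtered = wrap(new EnvImpl());
        try {
            EnvImpl impl = (EnvImpl) filtered; // proxy class is not EnvImpl
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFails()); // prints "true": the cast fails
    }
}
```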

It seems our custom doclet implementation has effectively been prohibited since https://bugs.openjdk.org/browse/JDK-8253736:

One particularly annoying wart is the cast on DocletEnvironment to DocEnvImpl, which effectively prevents using subtypes to carry additional info. It is not clear (even now) what the best way is to replace that logic.

Now I feel it's becoming really hard to maintain Hadoop's custom Doclets, and therefore I would like to drop the custom implementation. The primary change is that we would build Hadoop's JavaDoc including the @LimitedPrivate, @Private, and @Unstable classes that are currently excluded by our custom Doclets.

@slfan1989 @cnauroth @zhtttylz What do you think?

Comment on lines +715 to +738
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${project.version}</version>
Member Author

@aajisaka aajisaka Jan 15, 2026


Added because Hadoop Dynamometer production classes depend on HDFS test jar


@slfan1989
Contributor


@aajisaka Thanks for the detailed analysis. After reading through it, I fully agree that the cost of maintaining this custom Doclet has become unreasonably high. With OpenJDK continuing to clean up internal APIs (the trend starting from JDK-8253736 is only getting stronger), future compatibility will only get worse, and the next LTS might break it completely.
I'm in favor of dropping the custom Doclet and switching to the standard doclet to generate complete JavaDocs (including all classes annotated with @Private / @Unstable / @LimitedPrivate). The main reasons are:

  • The maintenance burden is too heavy and takes away energy from more valuable work;
  • The visibility of these annotated classes has very limited impact on most downstream users — they shouldn't be depending on @Private APIs anyway;
  • On the positive side, having the full picture can actually help developers/contributors who want to dig into the implementation details.

cc: @zhtttylz @cnauroth

@aajisaka
Member Author

Thank you @slfan1989. I'll update the patch with using the standard doclet.

@aajisaka aajisaka changed the title from "[WIP] HADOOP-19785. mvn site fails in JDK17" to "HADOOP-19785. mvn site fails in JDK17" on Jan 20, 2026
@aajisaka aajisaka marked this pull request as ready for review January 20, 2026 03:16
Contributor

@cnauroth cnauroth left a comment


@aajisaka, thank you for bringing this up. I'm not sure why we had a successful site build in the pre-commits for #8038. I guess it wasn't really triggering a full site build.

I tried building the site with the current patch, and it has a pretty drastic impact on the user experience. The current Hadoop 3.4.2 API docs show ~13,000 classes. Without the visibility annotation filtering, it goes up to ~126,000 classes. A lot of it is filled with stuff like auto-generated Protobuf classes for internal protocols, which aren't particularly helpful for either end users or Hadoop contributors.

It's really unfortunate that there doesn't seem to be any standard solution for this, and the JDK has closed the door on the only available option. The only other thing I can think of is to manually manage include/exclude rules in the maven-javadoc-plugin configuration, but that would be a huge maintenance burden on us.

I'm going to start an email thread on common-dev@ to make sure this is visible to people not following this PR.

@pan3793
Member

pan3793 commented Jan 26, 2026

The current Hadoop 3.4.2 API docs show ~13,000 classes. Without the visibility annotation filtering, it goes up to ~126,000 classes.

Late chime-in: the number sounds like a huge impact. Has anyone considered migrating to japicmp? It looks like a modern and popular solution for binary compatibility checks; I see that at least the Parquet project is using it. Specifically, the docs say it supports "exclusion based on annotations":

Per default all classes are tracked. If necessary, certain packages, classes, methods or fields can be excluded or explicitly included. Inclusion and exclusion is also possible based on annotations.
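For context, a hypothetical japicmp-maven-plugin configuration along the lines pan3793 describes might look like the following. The plugin coordinates are the published ones, but the annotation-based exclude entries and their exact syntax are illustrative assumptions that would need checking against the japicmp documentation:

```xml
<!-- Hypothetical sketch: exclude classes from the compatibility check by
     audience annotation. The annotation names and the exclude syntax
     below are assumptions, not verified configuration. -->
<plugin>
  <groupId>com.github.siom79.japicmp</groupId>
  <artifactId>japicmp-maven-plugin</artifactId>
  <configuration>
    <parameter>
      <excludes>
        <exclude>@org.apache.hadoop.classification.InterfaceAudience$Private</exclude>
        <exclude>@org.apache.hadoop.classification.InterfaceAudience$LimitedPrivate</exclude>
      </excludes>
    </parameter>
  </configuration>
</plugin>
```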

@aajisaka
Member Author

Thank you @pan3793

There are two discussion points: jdiff and javadoc. I think jdiff can be covered by japicmp, but javadoc cannot. Chris has already started a thread on common-dev@ for the doc side.

@pan3793
Member

pan3793 commented Jan 26, 2026

@aajisaka, thanks for the clarification. I re-read the conversation carefully and understand the situation now.

The cast fails because a proxy object cannot be cast to a concrete implementation class

Specific to this issue, it looks like public class DocEnvImpl ... is not final, so it's feasible to create a derived class HadoopDocEnvImpl extends DocEnvImpl that delegates all invocations to the Proxy instance, to pass the explicit cast check?

Now I feel it's becoming really hard to maintain Hadoop's custom Doclets, and therefore I would like to drop the custom implementation.

Agreed, but given that the impact is huge and Hadoop will presumably use Java 17 as its baseline for several years, it's still valuable if the problem can be solved with a few hundred lines of hacky code.
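A minimal sketch of this delegation idea, using hypothetical stand-in types rather than the real (internal) javadoc classes:

```java
// Hypothetical stand-ins: "DocEnv" plays DocletEnvironment, "BaseEnv"
// plays the non-final jdk.javadoc.internal.tool.DocEnvImpl.
interface DocEnv {
    String name();
}

class BaseEnv implements DocEnv {
    @Override
    public String name() { return "unfiltered"; }
}

/** Subclass that satisfies an explicit (BaseEnv) cast while forwarding
 *  every call to the filtering proxy, as a HadoopDocEnvImpl would. */
class HadoopEnv extends BaseEnv {
    private final DocEnv delegate;
    HadoopEnv(DocEnv delegate) { this.delegate = delegate; }
    @Override
    public String name() { return delegate.name(); } // forward, don't inherit
}

public class DelegationDemo {
    static String run() {
        DocEnv filteringProxy = () -> "filtered"; // stands in for the Proxy
        DocEnv forTool = new HadoopEnv(filteringProxy);
        // The cast the javadoc internals perform now succeeds...
        BaseEnv asConcrete = (BaseEnv) forTool;
        // ...and behaviour still comes from the filtering delegate.
        return asConcrete.name();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "filtered"
    }
}
```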

@aajisaka
Member Author


You are right. I could successfully create HadoopDocEnvImpl extends DocEnvImpl, and also cleaned up invocations.

Contributor

@cnauroth cnauroth left a comment


@aajisaka and @zhtttylz, I tried building the site with the changes in this PR. I think something is going wrong with the handling of @Public + @Evolving. Public evolving classes are included in the JavaDocs for published releases today, but I think they're missing now. Example classes: hadoop-common CompositeService and YARN TimelineClient.

I don't know if this problem was introduced by changes in this PR or in #8038.

@aajisaka
Member Author

Thank you @cnauroth for your review. Updated the default stability level to include @Public and @Evolving classes.


Member

@pan3793 pan3793 left a comment


LGTM, though I guess it may need small tweaks to pass CI.

</activation>
<properties>
<javac.version>17</javac.version>
<maven.compiler.release>${javac.version}</maven.compiler.release>
Member

@pan3793 pan3793 Jan 30, 2026


I think we want to keep this, so that we can use a higher version of JDK to build Hadoop, but produce Java 17 compatible bytecode. If this fails on some modules, can we just disable it on these specific modules?

BTW, the activation condition of the jdk17+ profile is always true, so we can remove it and move the content to the top level.

Member Author


I think we want to keep this, so that we can use a higher version of JDK to build Hadoop, but produce Java 17 compatible bytecode.

Agreed. Removed the jdk17+ profile and moved the release flag into the maven-compiler-plugin config, so that it can be overridden by the hadoop-annotations module.

pom.xml Outdated
<!-- maven plugin versions -->
<maven-deploy-plugin.version>2.8.1</maven-deploy-plugin.version>
<maven-site-plugin.version>3.9.1</maven-site-plugin.version>
<maven-site-plugin.version>3.21.0</maven-site-plugin.version>
Contributor


I've been testing dry run releases of 3.5.0 from trunk with this patch applied. It generates the JavaDocs correctly, but then the JavaDocs don't show up in the final release of the site tarball. This looks similar to symptoms described in #5319, where we worked around it by downgrading maven-site-plugin.

Contributor


Here are some more details on this. When I build on branch-3.4 (without the plugin upgrade), the API docs land here:

target/staging/hadoop-project/api

On trunk, with this PR applied, they land here instead:

target/staging/apidocs

Then, the release script has an assumption that it can find everything in the hadoop-project sub-directory:

https://github.com/apache/hadoop/blob/trunk/dev-support/bin/create-release#L624-L628

Member Author


Thank you @cnauroth for the details.

Downgraded maven-site-plugin and maven-javadoc-plugin. maven-javadoc-plugin >= 3.10.0 cannot configure the output directory, per apache/maven-javadoc-plugin#1305.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 10s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+0 🆗 mvndep 2m 7s Maven dependency ordering for branch
+1 💚 mvninstall 52m 15s trunk passed
+1 💚 compile 20m 12s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 19m 34s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 checkstyle 3m 19s trunk passed
-1 ❌ mvnsite 9m 43s /branch-mvnsite-root.txt root in trunk failed.
-1 ❌ javadoc 9m 3s /branch-javadoc-root-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt root in trunk failed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04.
-1 ❌ javadoc 8m 33s /branch-javadoc-root-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt root in trunk failed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04.
+0 🆗 spotbugs 0m 24s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
-1 ❌ spotbugs 37m 57s /branch-spotbugs-root-warnings.html root in trunk has 94 extant spotbugs warnings.
+1 💚 shadedclient 33m 19s branch has no errors when building and testing our client artifacts.
-0 ⚠️ patch 33m 45s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 43s Maven dependency ordering for patch
+1 💚 mvninstall 46m 16s the patch passed
+1 💚 compile 18m 47s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javac 18m 47s the patch passed
+1 💚 compile 19m 36s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 javac 19m 36s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 5m 53s the patch passed
+1 💚 mvnsite 19m 53s the patch passed
-1 ❌ javadoc 9m 43s /results-javadoc-javadoc-root-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt root-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 958 new + 8936 unchanged - 35326 fixed = 9894 total (was 44262)
-1 ❌ javadoc 9m 40s /results-javadoc-javadoc-root-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt root-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 776 new + 9320 unchanged - 31841 fixed = 10096 total (was 41161)
+0 🆗 spotbugs 0m 21s hadoop-project has no data from spotbugs
+1 💚 shadedclient 69m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 827m 33s /patch-unit-root.txt root in the patch passed.
+1 💚 asflicense 1m 51s The patch does not generate ASF License warnings.
1246m 55s
Reason Tests
Failed junit tests hadoop.yarn.service.TestYarnNativeServices
hadoop.yarn.server.resourcemanager.TestRMHA
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation
hadoop.yarn.server.router.subcluster.fair.TestYarnFederationWithFairScheduler
hadoop.yarn.sls.appmaster.TestAMSimulator
Subsystem Report/Notes
Docker ClientAPI=1.53 ServerAPI=1.53 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8182/8/artifact/out/Dockerfile
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname Linux ba47af204f4b 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 90150af
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8182/8/testReport/
Max. process+thread count 3811 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-common-project/hadoop-annotations hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp hadoop-tools/hadoop-azure . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8182/8/console
versions git=2.25.1 maven=3.9.11 spotbugs=4.9.7
Powered by Apache Yetus 0.14.1 https://yetus.apache.org

This message was automatically generated.

@aajisaka
Member Author

aajisaka commented Feb 2, 2026

Now the patch is ready for review

Contributor

@cnauroth cnauroth left a comment


+1. Thank you for this @aajisaka ! @pan3793 , thank you also for the helpful code review.

@aajisaka aajisaka merged commit 1f2ccc7 into apache:trunk Feb 3, 2026
1 of 3 checks passed
@aajisaka aajisaka deleted the fix-mvn-site-2 branch February 3, 2026 01:42
@aajisaka
Member Author

aajisaka commented Feb 3, 2026

Merged. Thank you @slfan1989, @pan3793, and @cnauroth !
