[SPARK-54276][BUILD] Bump Hadoop 3.4.3 by pan3793 · Pull Request #54029 · apache/spark

pan3793 · 2026-01-28T07:46:56Z

What changes were proposed in this pull request?

Upgrade Hadoop dependency to 3.4.3.

Why are the changes needed?

This release includes HADOOP-19212, which makes UGI work with Java 25.

https://hadoop.apache.org/release/3.4.3.html

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass CI. Also verified that spark-sql can successfully bootstrap on JDK 25 now:

$ java -version
openjdk version "25.0.1" 2025-10-21 LTS
OpenJDK Runtime Environment Temurin-25.0.1+8 (build 25.0.1+8-LTS)
OpenJDK 64-Bit Server VM Temurin-25.0.1+8 (build 25.0.1+8-LTS, mixed mode, sharing)

$ build/sbt -Phive,hive-thriftserver clean package

$ SPARK_PREPEND_CLASSES=true bin/spark-sql
NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
WARNING: Using incubator modules: jdk.incubator.vector
WARNING: package sun.security.action not in java.base
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
26/01/28 17:23:22 WARN Utils: Your hostname, H27212-MAC-01.local, resolves to a loopback address: 127.0.0.1; using 10.242.159.140 instead (on interface en0)
26/01/28 17:23:22 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
26/01/28 17:23:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
WARNING: A terminally deprecated method in sun.misc.Unsafe has been called
WARNING: sun.misc.Unsafe::arrayBaseOffset has been called by org.apache.spark.unsafe.Platform (file:/Users/chengpan/Projects/apache-spark/common/unsafe/target/scala-2.13/classes/)
WARNING: Please consider reporting this to the maintainers of class org.apache.spark.unsafe.Platform
WARNING: sun.misc.Unsafe::arrayBaseOffset will be removed in a future release
26/01/28 17:23:27 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
26/01/28 17:23:27 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore chengpan@127.0.0.1
26/01/28 17:23:27 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
Spark Web UI available at http://10.242.159.140:4040
Spark master: local[*], Application Id: local-1769592205115
spark-sql (default)> select version();
4.2.0 14557582199659d838bbaa7d7b182e5d92c3b907
Time taken: 1.376 seconds, Fetched 1 row(s)
spark-sql (default)>

Was this patch authored or co-authored using generative AI tooling?

No.

github-actions · 2026-01-28T07:47:07Z

JIRA Issue Information

=== Sub-task SPARK-54276 ===
Summary: Upgrade Hadoop to 3.4.3
Assignee: None
Status: Open
Affected: ["4.2.0"]

This comment was automatically generated by GitHub Actions

dongjoon-hyun · 2026-01-28T07:55:47Z

Nice! Thank you, @pan3793 .

pan3793 · 2026-01-28T08:19:13Z

~~@steveloughran, seems not lucky, there are no classes in~~
~~- hadoop-client-api~~
~~- hadoop-client-runtime~~
~~- hadoop-client-minicluster~~

This was actually caused by a dirty cache in my local Maven repo. Sorry for the noise; the jars in the staging repo are good.

steveloughran · 2026-01-28T14:02:01Z

@pan3793 sometimes it's good to rm -r all of ~/.m2/repository/org/apache/hadoop (or any other project you actively work on). It saves disk space, even if your next few builds are slow.

steveloughran · 2026-01-28T14:05:25Z

@pan3793 thanks for testing this.
@dongjoon-hyun anything you can do to help test would be good too. I really had a hard time getting bits of the RC out. FWIW the Maven artifacts are being built on a Raspberry Pi, as that worked more reliably network-wise than EC2 VMs within the Cloudera VPN.

pan3793 · 2026-01-28T17:13:00Z

@steveloughran, thanks for the tips. Yes, I fixed it by rm -r ~/.m2/repository/org/apache/hadoop/**/3.4.3/.
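The version-scoped cleanup described above could be sketched as a small shell function. This is only an illustration: the function name `purge_hadoop_version` is hypothetical, and it assumes the default Maven local repository layout (`<repo>/org/apache/hadoop/<artifactId>/<version>/`). It uses `find` rather than `**` so that it does not depend on the shell's globstar setting.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark or Hadoop): delete the cached copies of one
# Hadoop version from a local Maven repository so the next build re-downloads them.
purge_hadoop_version() {
  local version="$1"
  local repo="${2:-$HOME/.m2/repository}"   # assumed default local repository path
  local hadoop_dir="$repo/org/apache/hadoop"

  # Refuse to run without an explicit version.
  [ -n "$version" ] || return 1

  # Nothing to do if no Hadoop artifacts are cached.
  [ -d "$hadoop_dir" ] || return 0

  # Remove only directories named exactly like the version; -prune stops find
  # from descending into a directory that is about to be deleted.
  find "$hadoop_dir" -type d -name "$version" -prune -exec rm -rf {} +
}
```

Calling `purge_hadoop_version 3.4.3` would leave other cached Hadoop versions in place, which keeps later builds faster than wiping the whole `org/apache/hadoop` tree.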

For integration tests, I don't see any issues with the default JDK 17. I'm now trying JDK 25; so far, no issues are related to Hadoop.

pan3793 · 2026-01-29T07:15:54Z

It looks like all tests that failed with Java 25 already have solutions or are easy to fix, except for datasketches-java 6.2.0. It does not work with Java 25, and upgrading it involves API changes that break compilation. I opened apache/datasketches-memory#270 and hope that datasketches-memory 3.0.2 can get a new patch version that solves the Java 25 compatibility issues.

dongjoon-hyun · 2026-02-02T14:18:56Z

Thank you for pinging me, @steveloughran, and sorry for the late reply. I was traveling from South Korea to the USA last weekend. I'm going to take a look at this PR.

I don't think there is a Hadoop issue here. It seems that @pan3793 just wanted to verify the result on Java 25.

The datasketches-java issue is a known issue on the Apache Spark side.

SPARK-53225 Upgrade datasketches-java to 7.0.1 and datasketches-memory to 4.1.0 was created on 2025-08-09 by me.
SPARK-53327 Upgrade datasketches to support Java 25 was created on 2025-08-19 by @pan3793.

.github/workflows/build_main.yml

pan3793 · 2026-02-02T14:30:20Z

@dongjoon-hyun, let me revert the unrelated changes and keep this a simple Hadoop version upgrade. I will open a new draft PR for Java 25 integration. BTW, I think I already have a solution for datasketches-java.

dongjoon-hyun · 2026-02-02T15:30:24Z

Thank you always, @pan3793 .

dongjoon-hyun · 2026-02-02T17:01:27Z

Although the failures seem to be flaky, could you re-run the failed test pipelines to make sure, @pan3793?

dongjoon-hyun · 2026-02-09T16:47:35Z

Hi, @pan3793 . I sent you an email (chengpan@apache.org).
Please check your email. Thank you always! 😄

pan3793 · 2026-02-09T23:32:11Z

Thank you, @dongjoon-hyun, it's really great news!

dongjoon-hyun · 2026-02-09T23:41:02Z

Oh, my bad. I mistakenly sent you a PMC template. It should be an Apache Spark committer invitation. Let me send the correct one once more for the official commitment. Very sorry, @pan3793 ~

dongjoon-hyun · 2026-02-09T23:42:54Z

Definitely, I'll help you become a PMC member later. But you know that it should start from committer first.

dongjoon-hyun · 2026-02-09T23:45:26Z

I sent a new one to chengpan@apache.org. Could you please accept once more in the correct email, @pan3793?

pan3793 · 2026-02-09T23:52:21Z

@dongjoon-hyun, I have replied to the email. Thank you again.

dongjoon-hyun · 2026-02-09T23:52:49Z

Now, I added you to the committer list. Please check your Whimsy, @pan3793 . It's my pleasure to cowork with you in the community.

dongjoon-hyun · 2026-02-10T00:08:37Z

It's also announced on the dev@spark mailing list.

https://lists.apache.org/thread/zb71j720m95do24pmdxdww9o4z2kpqv6

BTW, do you have a LinkedIn account, @pan3793?

pan3793 · 2026-02-10T00:11:55Z

@dongjoon-hyun, thanks! I'm not active on LinkedIn

dongjoon-hyun · 2026-02-10T00:12:18Z

Got it. No problem~

steveloughran · 2026-02-17T12:01:23Z

there's a new RC out now; maven staging repo is
https://repository.apache.org/content/repositories/orgapachehadoop-1465

pan3793 · 2026-02-17T13:17:03Z

@steveloughran, thanks for the information. I found it and updated this PR to use it a few days ago; so far, the test results look good. But I didn't find the vote mail in common-dev_at_hadoop, am I missing something?

dongjoon-hyun · 2026-02-17T16:59:51Z

Thank you, @steveloughran . The following seems to be the new RC1 email, @pan3793 .

https://lists.apache.org/thread/pwntvvrxc6vb5sod74qmsjtb9wq0cn18

dongjoon-hyun

Could you use the official Apache Hadoop 3.4.3 since the vote succeeded, @pan3793 ?

pan3793 · 2026-02-24T15:26:21Z

@dongjoon-hyun, I see, but it seems the jars are not available on Maven Central yet, I'm waiting for that.

dongjoon-hyun · 2026-02-24T15:42:31Z

Oh, ya. It's not synced yet. Thanks for checking.

BTW, for Java 25, we need Apache Hadoop 3.5.0 still for your HADOOP-19821, right?

pan3793 · 2026-02-24T15:57:33Z

@dongjoon-hyun, I can't claim full Java 25 support, but with Hadoop 3.4.3, Spark is already able to bootstrap and pass GHA on Java 25 (though some issues unrelated to Hadoop still need to be fixed).

dongjoon-hyun · 2026-02-24T22:34:11Z

Now, it's ready.

$ curl -I https://maven-central.storage-download.googleapis.com/maven2/org/apache/hadoop/hadoop-client-api/3.4.3/hadoop-client-api-3.4.3.pom
HTTP/2 200
...
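The availability check above can be generalized into a small sketch. The function names `pom_url` and `is_published` are hypothetical, and the default base URL `https://repo1.maven.org/maven2` is an assumption (the comment above used a Google-hosted mirror instead); only standard `curl` options are used.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the PR).

# Print the POM URL for groupId / artifactId / version under a Maven repository
# base URL. The base URL defaults to an assumed Maven Central endpoint.
pom_url() {
  local group="$1"
  local artifact="$2"
  local version="$3"
  local base="${4:-https://repo1.maven.org/maven2}"
  local group_path

  # Maven repository layout: dots in the groupId become path separators.
  group_path="$(printf '%s' "$group" | tr . /)"
  printf '%s/%s/%s/%s/%s-%s.pom\n' \
    "$base" "$group_path" "$artifact" "$version" "$artifact" "$version"
}

# Succeed only if a HEAD request for the POM does not return an HTTP error.
is_published() {
  curl --silent --fail --head --output /dev/null "$(pom_url "$@")"
}
```

For example, `is_published org.apache.hadoop hadoop-client-api 3.4.3 && echo "synced"` would print `synced` only once the POM is reachable at the chosen base URL.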

pan3793 · 2026-02-24T23:30:26Z

@dongjoon-hyun, I contacted the ASF infra team, and it seems they fixed the Maven sync issue.

Removed the staging repo and rebased on the latest master. Now we just need to wait for CI to pass (it should).

pan3793 · 2026-02-25T01:49:06Z

CI is green now, it's ready to go.

dongjoon-hyun

+1, LGTM. Thank you for working on this and collaborating with the Apache Hadoop community, @pan3793.

dongjoon-hyun · 2026-02-25T02:15:33Z

Merged to master for Apache Spark 4.2.0. I hope this unblocks the previous items.

github-actions bot added SQL KUBERNETES BUILD DOCS labels Jan 28, 2026

github-actions bot added the CORE label Jan 28, 2026

pan3793 force-pushed the SPARK-54276 branch from bddda74 to b1479e3 Compare January 29, 2026 01:22

dongjoon-hyun marked this pull request as draft January 29, 2026 05:11

dongjoon-hyun reviewed Feb 2, 2026

.github/workflows/build_main.yml (outdated, resolved)

pan3793 force-pushed the SPARK-54276 branch from b1479e3 to 1dad687 Compare February 2, 2026 14:32

pan3793 force-pushed the SPARK-54276 branch from 1dad687 to 9aa6977 Compare February 14, 2026 02:01

pan3793 changed the title ~~[WIP][SPARK-54276][BUILD] Bump Hadoop 3.4.3 RC0~~ [WIP][SPARK-54276][BUILD] Bump Hadoop 3.4.3 RC1 Feb 18, 2026

dongjoon-hyun reviewed Feb 24, 2026

pan3793 added 3 commits February 25, 2026 07:17

[WIP][SPARK-54276][BUILD] Bump Hadoop 3.4.3 RC0

b3d730e

Hadoop 3.4.3 RC1

d43a597

Hadoop 3.4.3 RC1 becomes 3.4.3

7d5541d

pan3793 force-pushed the SPARK-54276 branch from 9aa6977 to 7d5541d Compare February 24, 2026 23:17

pan3793 changed the title ~~[WIP][SPARK-54276][BUILD] Bump Hadoop 3.4.3 RC1~~ [SPARK-54276][BUILD] Bump Hadoop 3.4.3 Feb 24, 2026

pan3793 marked this pull request as ready for review February 24, 2026 23:23

pan3793 requested a review from dongjoon-hyun February 25, 2026 01:46

dongjoon-hyun approved these changes Feb 25, 2026

dongjoon-hyun closed this in 8fd58ed Feb 25, 2026
