[SPARK-55665][PYTHON] Unify how workers establish connection with the executor #54458

Closed
gaogaotiantian wants to merge 2 commits into apache:master from gaogaotiantian:sockfile-to-executor
@gaogaotiantian
Contributor

What changes were proposed in this pull request?

Unify the sock file connection code from the different worker files in a single place, and guarantee an explicit flush and close for the sock file.

Why are the changes needed?

This piece of code is currently copy/pasted all over our code base, which introduces a few issues:

  • Code duplication, obviously.
  • During the copy/paste, we actually made a mistake: data_source_pushdown_filters.py forgets to write the pid back, but we never test that.
  • We can't guarantee a flush and close for the sock file. Right now we rely on gc to do that, which is not reliable. We have issues with simple workers.
  • In the future, if we want to drop the PID communication (TODO), or for now if we want to do an explicit flush, we need to make changes all over our code base.

It's best to just organize the code in a single place.
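
To illustrate the idea, here is a minimal sketch of what a single shared helper could look like. It is not the code from this PR: the name `open_sock_file`, its parameters, the buffer size, and the 4-byte big-endian PID encoding are all assumptions made for the example. It only shows the pattern of writing the PID once and guaranteeing a flush and close without relying on gc.

```python
import contextlib
import os
import socket
import struct


@contextlib.contextmanager
def open_sock_file(sock: socket.socket, send_pid: bool = True):
    """Hypothetical helper (not the PR's actual API).

    Wraps an already connected socket in a buffered file object,
    optionally reports the worker PID, and always flushes and closes
    on exit instead of leaving that to the garbage collector.
    """
    # Assumed buffer size; any positive value gives a buffered file.
    sock_file = sock.makefile("rwb", buffering=65536)
    try:
        if send_pid:
            # Assumed wire format: PID as a 4-byte big-endian signed int.
            sock_file.write(struct.pack("!i", os.getpid()))
            sock_file.flush()
        yield sock_file
    finally:
        # Flush first, then close both objects even if flushing fails.
        try:
            sock_file.flush()
        finally:
            try:
                sock_file.close()
            finally:
                sock.close()
```

With a helper like this, each worker entry point would wrap its main logic in `with open_sock_file(sock) as sock_file:`, so dropping the PID write or changing the flush behavior later becomes a one-place change.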

Does this PR introduce any user-facing change?

No.

How was this patch tested?

test_python_datasource passed locally; the rest is covered by CI.

Was this patch authored or co-authored using generative AI tooling?

Yes, Cursor(claude-4.6-opus-high).

@zhengruifeng
Contributor

merged to master

3 participants