
Conversation


@hzhou hzhou commented Nov 14, 2025

Pull Request Description

Collecting misc. preparation commits for the CSEL redesign.

[skip warnings]

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your company's PR approval manager.

@hzhou hzhou force-pushed the 2511_coll_prep branch 2 times, most recently from c5e6c36 to ad46f16, on November 14, 2025 22:11
Do not hide the script. Move it to maint/ as the rest of the autogen
scripts.

ipc_p2p.h references MPIDI_POSIX_am_eager_limit, which is defined in
shm_am.h.

Currently it is probably pulled in by `shm_coll.h`, which we will remove
in the near future.
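
A minimal sketch of the fix being described (the exact include line is an assumption, not the code in this PR):

```c
/* in ipc_p2p.h: make the dependency explicit instead of relying on
 * shm_coll.h to pull it in transitively */
#include "shm_am.h"     /* defines MPIDI_POSIX_am_eager_limit */
```
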
The fallback collectives (e.g. MPIR_Bcast_fallback) are manual "auto"
functions that may not be the best algorithms for the system, but
are sufficient for internal usage during init and object
construction.

Not all collective types have fallbacks defined, since internally we use
a limited set of types. We'll define new fallback routines when we need
them in the future.

This prepares for the revamp of CSEL.
They are manual "auto" algorithms that select from a small set of
algorithms suitable for internal usage, e.g. during init,
finalize, and communicator construction.
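
A hedged sketch of the pattern (the signature follows MPICH's internal collective style, but the body is an assumption; MPIR_Bcast_intra_binomial stands in as one representative known-good algorithm):

```c
/* Sketch only: a fallback is a hand-written "auto" function that is not
 * subject to CSEL runtime selection, so it is safe to call during init
 * and object construction before selection is set up. */
int MPIR_Bcast_fallback(void *buffer, MPI_Aint count, MPI_Datatype datatype,
                        int root, MPIR_Comm * comm_ptr, int coll_attr)
{
    /* a single, always-available algorithm suffices for internal usage */
    return MPIR_Bcast_intra_binomial(buffer, count, datatype, root,
                                     comm_ptr, coll_attr);
}
```
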

hzhou commented Nov 21, 2025

test:mpich/ch3/most
test:mpich/ch4/most

@hzhou hzhou requested a review from raffenet November 24, 2025 15:51

@raffenet raffenet left a comment

Just a couple comments.

}

/* allgather is needed to exchange all the IPC handles */
/* FIXME: call MPIR_Coll_auto */
Contributor

why do we want MPIR_Coll_auto in this instance?

Contributor Author

No, we don't. The comment slipped through from my early draft. I'll remove it.

MPID_THREAD_CS_EXIT(VCI, MPIDI_VCI_LOCK(vci));
need_unlock = 0;
- mpi_errno = MPIR_Barrier(win->comm_ptr, MPIR_COLL_ATTR_SYNC);
+ mpi_errno = MPIR_Barrier_fallback(win->comm_ptr, MPIR_COLL_ATTR_SYNC);
Contributor

@raffenet raffenet Dec 9, 2025

I contend that this barrier is perf critical for RMA apps. It should be the case that we'll never reach it before selection is set, but I could be missing something.

Contributor Author

The design is to call the fallback interfaces when we don't want the call to be subject to runtime selection. It is not part of the design that they are performance-insufficient. We can and should optimize the fallback routines to the point that their performance is sufficient. A fallback routine is designed to be a manual-selection "auto" function. For now, most fallback routines just call a single algorithm because that is sufficient for the current usage. We can improve them in the future as necessary.

I think in this case, the performance of the dissemination algorithm is sufficient for the fence synchronization.
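
For illustration, a minimal sketch of what such a fallback amounts to today (the body is an assumption; MPIR_Barrier_intra_dissemination is an existing MPICH routine, used here as the single algorithm the comment refers to):

```c
/* Sketch: today's fallback just calls one known-good algorithm.
 * Manual branching can be added later without involving CSEL. */
int MPIR_Barrier_fallback(MPIR_Comm * comm_ptr, int coll_attr)
{
    return MPIR_Barrier_intra_dissemination(comm_ptr, coll_attr);
}
```
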

Contributor

I think this instance is unlike the rest in this PR. Fence is essentially barrier synchronization. If MPI_Barrier is subject to runtime selection, I don't see why MPI_Win_fence should not be. A vendor may optimize their system for barrier, IIRC BlueGene did this, and we should not prevent that from being used in RMA apps.

Contributor Author

I think the design purpose of collective tuning (e.g. tuning a Barrier algorithm) is to affect the user's MPI_Barrier calls. If it also affects the behavior of MPI_Win_fence, I would say that is an unintended side effect. I don't disagree that runtime tuning of MPI_Win_fence is a nice option, but adopting side-effect tuning is not a good design.

A vendor may optimize their system for barrier, IIRC BlueGene did this, and we should not prevent that from being used in RMA apps.

Certainly. This is inside a ch4 mpidig routine. A vendor can modify it or implement their own fence algorithm.

Now, if one vendor or system uses their specific Barrier algorithm and thus potentially affects another vendor's Win_fence algorithm, I think that would be problematic.

Your main argument is that we are preventing something. What exactly do you think we are preventing? Then let's discuss how to enable that specific goal properly.

Contributor Author

Also, the semantics of MPI_Win_fence is a memory fence, not a barrier. A barrier is an implementation side effect.

Contributor

Your main argument is that we are preventing something. What exactly do you think we are preventing? Then let's discuss how to enable that specific goal properly.

I want to enable runtime algorithm selection of the collective synchronization explicitly required by MPI_Win_fence.
