Skip to content

Properly del_proc spawned procs upon instance finalize#13667

Open
Dupratj wants to merge 2 commits intoopen-mpi:mainfrom
Dupratj:spawn_del_proc_main
Open

Properly del_proc spawned procs upon instance finalize#13667
Dupratj wants to merge 2 commits intoopen-mpi:mainfrom
Dupratj:spawn_del_proc_main

Conversation

@Dupratj
Copy link

@Dupratj Dupratj commented Jan 20, 2026

Upon finalize the ressources allocated for a spawned proc are not all released. This Pull request is a take over of 13445

@github-actions
Copy link

Hello! The Git Commit Checker CI bot found a few problems with this PR:

61ca5fd: [pml/ubcl] Removed bad assert: a PML should suport...

  • check_signed_off: does not contain a valid Signed-off-by line

1f1f633: Properly del_proc spawned procs upon instance fina...

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@Dupratj Dupratj marked this pull request as draft January 20, 2026 08:28
Signed-off-by: DUPRAT, JULIEN <julien.duprat@eviden.com>
Signed-off-by: DUPRAT, JULIEN <julien.duprat@eviden.com>
@Dupratj Dupratj force-pushed the spawn_del_proc_main branch from 61ca5fd to 4379f15 Compare January 20, 2026 08:32
@Dupratj Dupratj changed the title Spawn del proc main Properly del_proc spawned procs upon instance finalize Jan 20, 2026
@Dupratj Dupratj marked this pull request as ready for review January 20, 2026 09:07
@hppritcha
Copy link
Member

The NVIDIA failure is not specific to this PR, something about NVIDIA CI breakage that's been going on for weeks so we'll ignore it.

* A: Most likely not, but it would be good to check.
*/
#define PREDEFINED_INSTANCE_PAD 512
#define PREDEFINED_INSTANCE_PAD 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit problematic as it represents an ABI break. any code using MPI_SESSION_NULL would get a runtime warning about an object being resized if it was originally linked against libmpi.so from the 5.0.x release but was not being run against the libmpi.so from main or v6.0.x. Did you have to increase this to get the code to compile?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some run the spawn tests in the ompi-tests repo and did not see any regressions from main. Note several tests continued to fail but that's not due to changes in this PR.

Copy link
Author

@Dupratj Dupratj Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize it was an ABI break.
Yes the instance size is bigger than this PREDEFINED_INSTANCE_PAD with the additions of this PR. One thing I could do is use a pointer to a struct and allocate the needed memory after that way we would not exceed the 512 bytes limit.
Unless you see a better way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option would be to change the char i_name[MPI_MAX_OBJECT_NAME]; to being a pointer. See for example commit 2d68804 for how this was done for communicators

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may want to do that sort of a change in separate PR.

@janjust
Copy link
Contributor

janjust commented Feb 10, 2026

@Dupratj please address @hppritcha comment, as is, it can't go in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants