Use MPI_Count for MpiInt only when appropriate #149

davidedelvento · 2025-10-20T23:56:33Z

This is follow-up on #83 and of course for now it's just a talking point not something anywhere close to be merged. I'm putting it out there to start the conversation with something explicit in front of people, to get constructive criticism, before embarking myself in the full work to implement it (in a possibly broken way)

With that attitude, let me start with some questions @jeffhammond and @albandil

In its current form I'm giving priority to BigMPI vs MPI v4 (because if one bothered to configure with the former, they more likely want to use it than not). If there is no BigMPI and the MPI implementation in use is v4, the PR in its current form blindly uses that. Perhaps we want a flag to revert the "non-big" file transfer regardless?
This PR also assumes that I am understanding correctly both BigMPI's MPIX_* and the MPI4's MPI_*_c invocations. I have only experimented with toy examples for both so please don't blindly assume I'm doing the right thing (TM)
This PR also assumes that the macro MpiInt is correctly implemented among the code and I believe I've seen a couple of places where it isn't (but even if it were, it obviously it has never been tested, so "by default" I assume there is at least a bug somewhere until someone or I proves otherwise). As such, part of the work of this PR will definitely include double checking all existing instances of MpiInt as well as the provenance of all the count fields for all MPI invocations. I haven't checked yet if this needs to involve Fortran too.

albandil · 2025-10-21T06:27:24Z

MpiInt is correctly implemented among the code

MpiInt is almost certainly not implemented completely. Even if the implementation was correct and complete, its aim was different: To conform to a possible (but in practice non-existing) ILP64 MPI interface, i.e., assuming all integers to be long integers, not only the counts. However, the large-counts MPI API is different. On top of that, I also see some places where MpiInt is not used even though it should (e.g., MPI_Irecv in BI_Arecv uses the attribute N of struct BLACBUFF as a count, but this is Int, not MpiInt).

I believe that the best approach is not to pay attention to the original use of MpiInt and replace that macro by the otherwise omnipresent Int in the few places where it appears. Then, starting with a clean MpiInt-less code, to carefully check all MPI calls that their arguments are correctly typed for the chosen MPI library and perform conversion as needed. This was also suggested in #83.

davidedelvento · 2025-10-21T11:56:54Z

MpiInt is almost certainly not implemented completely. Even if the implementation was correct and complete, its aim was different: To conform to a possible (but in practice non-existing) ILP64 MPI interface, i.e., assuming all integers to be long integers, not only the counts. However, the large-counts MPI API is different. On top of that, I also see some places where MpiInt is not used even though it should (e.g., MPI_Irecv in BI_Arecv uses the attribute N of struct BLACBUFF as a count, but this is Int, not MpiInt).

I noticed these things and I hinted to them with my "I believe I've seen a couple of places where it isn't", but I have not payed too much attention to them yet. I will. As such, it may be easier to just scrap the existing MpiInt occurrences and create new ones for count only, I see that. I think we are in the same page here.

I believe that the best approach is not to pay attention to the original use of MpiInt and replace that macro by the otherwise omnipresent Int in the few places where it appears. Then, starting with a clean MpiInt-less code, to carefully check all MPI calls that their arguments are correctly typed for the chosen MPI library and perform conversion as needed.

I am not sure I understand correctly what you are saying here. Int swapping for int or long cannot be used for MPI_Count since that is a MPI-specific thing which could potentially be implemented with something different than long. But you know that and we may be saying the same thing I wrote in the previous paragraph of this comment. To be super-clear my plan is to do the following... disregard the fact that MpiInt already exists, assume I have removed it everywhere and I am reintroducing it according to the following:

if one is compiling with BigMPI or MPIv4 #define MpiInt MPI_Count (as already done in the .h)
otherwise #define MpiInt int (as already done in the .h)
replace all MPI_whatever invocations in the code with _MPI_whatever wrappers (per the example .c file) and make sure all of the latter use MpiInt as a type for count (which isn't done at the moment in the example)
wrap all _MPI_whatever invocations to one of MPIX_whatever, MPI_whatever_c or MPI_whatever depending on how one is compiling the code (as already done in the .h)

Do you have any objections?

I believe this is the same thing suggested in #83 (comment) (but there it wasn't so explicit, hence I'm not totally sure of the original poster intentions).

albandil · 2025-10-21T14:09:28Z

Do you have any objections?

No objections, this sounds good to me.

My unclear comment about removing MpiInt first was meant in this way: Currently, MpiInt is also used in places where it is unnecessary and potentially incorrect whenever MpiInt stands for MPI_Count. An example is Cblacs_get in "BLACS/SRC/blacs_get_.c", where an MpiInt is used for the argument flag of MPI_Comm_get_attr. I believe that this argument is int in every implementation of MPI. So, in this particular occurence, MpiInt should be unconditionally reverted to int. There might be a few other places of this kind (related to the original ILP64 MPI interface ambition) that will require similar action.

… count (i.e. int or MPI_Count)

…ry occurence where the latter is used

davidedelvento · 2025-11-03T20:17:21Z

Sorry for not being clear. This PR continues to not being ready to merge, the reason why I am pushing is because the final version will be hundreds of files changed, and therefore I want to give people a chance to look at it piecewise, as I am developing it.

In its current form it does not even link because the definition of inline int _MPI_Irecv in BLACS/SRC/Bconfig.h is probably not inlined (same happens even if I add static or __attribute__((always_inline))). So I'm mulling if I should create a separate .c file (which I really would like to avoid), add this definition to an existing appropriate .c file (which I have not found yet) or debug the reason why it's not working in the header file (which would be my preference). While I think about that, I will probably keep pushing other files with "trivial" changes for people to review.

If you prefer me quitting this piecemeal approach and do instead a huge dump at the end, let me know and I'll be happy to oblige

…ry occurence where the latter is used

Wanted to check the CI. Not ready for review.

langou · 2025-11-03T20:46:47Z

Piecemeal is all good with me. Not a problem. Maybe have the PR as a draft then, but not a problem with me.

jeffhammond · 2025-11-03T21:31:14Z

MPI_Count is unspecified (outside of the MPI-5 ABI) but its properties require that it is large enough to hold a pointer, an off_t, a ptrdiff_t and a (s)size_t. In practice, it is a 64-bit integer on every system where ScaLAPACK will ever run. However, if you find a pure 32b system that still has the 2GiB file system limit, it might be smaller.

…for MPI

…or MPI

Davide Del Vento added 2 commits October 20, 2025 17:35

starting point to discuss the idea

ae518aa

prioritize BigMPI and add some comments

9c8e2b0

davidedelvento mentioned this pull request Oct 20, 2025

MpiInt is wrong and should be removed #83

Open

Davide Del Vento added 5 commits November 3, 2025 10:17

Making sure the BLACBUFF data structure uses the appropriate type for…

f95b1e3

… count (i.e. int or MPI_Count)

using wrapper in BI_Asend

79df540

return type for MPI_Error_class is always integer

2dbfc83

missing #endif

2153a19

using the (newly defined version of) MpiInt in BI_GetMpiGeType in eve…

7e18b41

…ry occurence where the latter is used

langou previously approved these changes Nov 3, 2025

View reviewed changes

using the (newly defined version of) MpiInt in BI_GetMpiTrType in eve…

1cd8097

…ry occurence where the latter is used

using wrapper in MPI_Op_create

1fbc2bd

Davide Del Vento added 9 commits November 3, 2025 14:48

using wrapper in all reduction sums

4456518

using wrapper in MPI_Type_create_struct (not sure about BigMPI for that)

f0c759b

using the (newly defined version of) MpiInt in BI_cvvamx and BI_cvvamx2

d956762

using the (newly defined version of) MpiInt in BI_ivvamn and BI_ivvamn2

593cc57

removing Int and MpiInt from places where not appropriate for MPI

97fbdb6

wrappers for MPI_Isend

dfa031d

removing Int and MpiInt from additional places where not appropriate …

88e10a6

…for MPI

wrapper for MPI_Type_indexed and MpiInt for BI_GetMpiTrType

3f885e8

removing Int and MpiInt from BI_TransUserComm where not appropriate f…

51955af

…or MPI

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use MPI_Count for MpiInt only when appropriate #149

Use MPI_Count for MpiInt only when appropriate #149

Uh oh!

davidedelvento commented Oct 20, 2025 •

edited

Loading

Uh oh!

albandil commented Oct 21, 2025

Uh oh!

davidedelvento commented Oct 21, 2025

Uh oh!

albandil commented Oct 21, 2025

Uh oh!

davidedelvento commented Nov 3, 2025 •

edited

Loading

Uh oh!

langou commented Nov 3, 2025

Uh oh!

jeffhammond commented Nov 3, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Use MPI_Count for MpiInt only when appropriate #149

Are you sure you want to change the base?

Use MPI_Count for MpiInt only when appropriate #149

Uh oh!

Conversation

davidedelvento commented Oct 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

albandil commented Oct 21, 2025

Uh oh!

davidedelvento commented Oct 21, 2025

Uh oh!

albandil commented Oct 21, 2025

Uh oh!

davidedelvento commented Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

langou commented Nov 3, 2025

Uh oh!

jeffhammond commented Nov 3, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

davidedelvento commented Oct 20, 2025 •

edited

Loading

davidedelvento commented Nov 3, 2025 •

edited

Loading