Add version guards for v2_27 build compatibility (#2061) by lilyjanjigian · Pull Request #2061 · meta-pytorch/torchcomms

lilyjanjigian · 2026-04-14T00:00:51Z

Summary:

Several diffs landed over the past few months that introduced ncclx-only types (ncclWindow_t, ncclx::Hints, NCCL_FAST_INIT_MODE_RING) into torchcomms code without version guards. This broke the build when using hpc_comms.use_nccl=stable (upstream NCCL v2_27), which doesn't define these types. The ~15-20 backend_nccl and backend_gloo tests in TestX that build with this config were all failing at compile time.

Fixes:

- NcclxApi.hpp: Replace constexpr NCCL_WIN_DEFAULT with an #ifndef/#define guard to avoid colliding with the macro of the same name in nccl.h
- TorchCommNCCLXBootstrap.hpp/.cpp: Wrap ncclx::Hints and NCCL_FAST_INIT_MODE_RING usage in #ifdef NCCLX_CONFIG_SUPPORTED, with fallback paths for upstream NCCL
- TorchCommNCCLX.cpp: Apply the same ncclx::Hints guard in the split function
- TorchCommWindowNCCLX.cpp: Wrap the get_attr() body in #ifdef NCCL_RMA_SUPPORTED
- DeviceBackendTraits.hpp: Make the Window type alias conditional (ncclWindow_t vs void*) on NCCL_RMA_SUPPORTED
- PipesDeviceBackend.hpp: Add an NcclWin type alias with the same conditional
- ir_include/nccl.h: Add the missing NCCL_RMA_SUPPORTED define to the IR stub header used by the device_window_bitcode genrule
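The three guard patterns named in the fix list can be sketched as below. This is a minimal illustration, not the code from this PR: the placeholder `ncclWindow` struct, the value given to `NCCL_WIN_DEFAULT`, the function `getWindowAttr`, and its throwing fallback are all assumptions made so the snippet compiles without any NCCL header. Only the macro names, `ncclWindow_t`, and the `Window` alias name come from the summary above.

```cpp
#include <cassert>    // for assert() in caller-side checks
#include <stdexcept>
#include <type_traits>

// In a real build the feature macros would come from the NCCL headers.
// They are left undefined here to simulate the upstream (stable) build.
// #define NCCL_RMA_SUPPORTED 1

#ifdef NCCL_RMA_SUPPORTED
struct ncclWindow;                 // placeholder opaque type (assumption)
typedef ncclWindow* ncclWindow_t;  // stand-in for the header's typedef
#endif

// Pattern 1: define a fallback only when the header has not already
// provided the name as a macro. A constexpr with the same name would
// collide with that macro; an #ifndef guard does not.
#ifndef NCCL_WIN_DEFAULT
#define NCCL_WIN_DEFAULT 0  // placeholder value for this sketch
#endif

// Pattern 2: conditional type alias, so code that only passes the handle
// around compiles against either header.
#ifdef NCCL_RMA_SUPPORTED
using Window = ncclWindow_t;
#else
using Window = void*;
#endif

// Pattern 3: guard the function body and give the unsupported build an
// explicit fallback. Throwing is an assumption; the summary does not say
// what the real fallback does.
inline int getWindowAttr(Window win) {
#ifdef NCCL_RMA_SUPPORTED
  (void)win;
  return 1;  // hypothetical: the real code would query the window here
#else
  (void)win;
  throw std::runtime_error("window attributes need RMA support");
#endif
}
```

With `NCCL_RMA_SUPPORTED` undefined, `Window` resolves to `void*` and `getWindowAttr` throws. Defining the macro switches both to the ncclx path without touching any call site, which is the point of keeping the guard inside the alias and the function body rather than at every use.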

Reviewed By: goelayu

Differential Revision: D100670686

meta-codesync · 2026-04-14T00:01:00Z

@lilyjanjigian has exported this pull request. If you are a Meta employee, you can view the originating Diff in D100670686.

Summary: The pipes triton alltoallv module imports from torchcomms.triton.fb at module level, which fails with ImportError when torchcomms is unavailable in CI. The original import guard (triton = None) was insufficient because triton.jit and requires_torchcomms decorators execute at module-load time, causing AttributeError and NameError respectively. Replace the None stubs with no-op decorator stubs (SimpleNamespace with a passthrough jit, and a passthrough requires_torchcomms) so the module can be imported safely and tests skip gracefully via their existing TRITON_AVAILABLE / CUDA_AVAILABLE checks.

Differential Revision: D100182678

… test

Summary: The GetTopologyAssertsOnEmptyTopoData test used EXPECT_DEATH, but getTopology() never aborted on empty topology data; it silently produced a topology vector with 0-length data entries. As a result the death test reported either "failed to die" or "threw an exception", depending on the execution. Fix both sides:

1. Add CHECK_THROW_EXCEPTION validation in getTopology() to reject empty per-transport topology data, consistent with uniflow's error handling conventions (throw, not abort).
2. Change the test from EXPECT_DEATH to EXPECT_THROW(std::runtime_error) to match.

Differential Revision: D100359245
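The throw-instead-of-abort change described above can be sketched in plain C++. This is an illustration under stated assumptions: the `TopoData` alias, this `getTopology` signature, and the `throwsRuntimeError` helper are hypothetical, and an explicit `throw` stands in for the CHECK_THROW_EXCEPTION macro, whose definition is not shown in the source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for one transport's serialized topology data.
using TopoData = std::vector<std::uint8_t>;

// Rejects empty per-transport entries by throwing, rather than silently
// returning a topology that contains 0-length entries.
std::vector<TopoData> getTopology(const std::vector<TopoData>& perTransport) {
  for (std::size_t i = 0; i < perTransport.size(); ++i) {
    if (perTransport[i].empty()) {
      throw std::runtime_error(
          "empty topology data for transport " + std::to_string(i));
    }
  }
  return perTransport;
}

// Returns true when fn throws std::runtime_error, false when it returns
// normally. This mimics what an EXPECT_THROW-style assertion verifies.
template <typename Fn>
bool throwsRuntimeError(Fn&& fn) {
  try {
    fn();
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}

// Small self-check covering both the rejecting and the accepting path.
void runChecks() {
  assert(throwsRuntimeError([] { getTopology({TopoData{}}); }));
  assert(!throwsRuntimeError([] { getTopology({TopoData{1, 2}}); }));
}
```

In a GoogleTest suite the first check would typically be written with `EXPECT_THROW(statement, std::runtime_error)`. A death test only passes when the process actually terminates, which is why it could not pass against a function that returns normally or throws.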

meta-codesync · 2026-04-17T02:19:46Z

This pull request has been merged in 13fddd5.

meta-cla bot added the CLA Signed label Apr 14, 2026

meta-codesync bot added fb-exported meta-exported labels Apr 14, 2026

meta-codesync bot changed the title ~~Add version guards for v2_27 build compatibility~~ Add version guards for v2_27 build compatibility (#2061) Apr 15, 2026

lilyjanjigian force-pushed the export-D100670686 branch from c30cb00 to 26b16ed Compare April 15, 2026 00:00

Lily Janjigian added 2 commits April 16, 2026 15:42

lilyjanjigian force-pushed the export-D100670686 branch from 26b16ed to 27d0723 Compare April 16, 2026 22:44

lilyjanjigian force-pushed the export-D100670686 branch from 27d0723 to 1d58c34 Compare April 16, 2026 22:48

meta-codesync bot closed this in 13fddd5 Apr 17, 2026

facebook-github-tools bot added the Merged label Apr 17, 2026

lilyjanjigian wants to merge 3 commits into meta-pytorch:main from lilyjanjigian:export-D100670686
