Newton + Isaac RTX Rendering Performance Optimizations #5017
Open: ncournia-nv wants to merge 8 commits into isaac-sim:develop from ncournia-nv:dev/ncournia/perf-test
+580
−44
c3194ba Lazy update fabric xforms. (ncournia-nv)
181be17 cuda graph fix (ncournia-nv)
a3d8e00 write images (ncournia-nv)
5c0145a Works. (ncournia-nv)
c5e109d Cubric bindings (ncournia-nv)
c383820 Remove debugging code. (ncournia-nv)
b9870db Revert "write images" (ncournia-nv)
a0248bc Simplify cuda graph code. (ncournia-nv)
268 changes: 268 additions & 0 deletions
source/isaaclab_newton/isaaclab_newton/physics/_cubric.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,268 @@ | ||
| # Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md). | ||
| # All rights reserved. | ||
| # | ||
| # SPDX-License-Identifier: BSD-3-Clause | ||
|
|
||
| """Pure-Python ctypes bindings for the cubric GPU transform-hierarchy API. | ||
|
|
||
| Acquires the ``omni::cubric::IAdapter`` carb interface directly from the | ||
| Carbonite framework and wraps its function-pointer methods so that Newton | ||
| can call cubric's GPU transform propagation without C++ pybind11 changes. | ||
|
|
||
| The flow mirrors PhysX's ``DirectGpuHelper::updateXForms_GPU()``: | ||
|
|
||
| 1. ``IAdapter::create`` → allocate a cubric adapter ID | ||
| 2. ``IAdapter::bindToStage`` → bind to the current Fabric stage | ||
| 3. ``IAdapter::compute`` → GPU kernel: propagate world transforms | ||
| 4. ``IAdapter::release`` → free the adapter | ||
|
|
||
| When cubric is unavailable (e.g. CPU-only machine, plugin not loaded), the | ||
| caller falls back to the CPU ``update_world_xforms()`` path. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import ctypes | ||
| import logging | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # Carb Framework struct layout (CARB_ABI function-pointer offsets, x86_64) | ||
| # --------------------------------------------------------------------------- | ||
| # Counting only CARB_ABI fields from the top of ``struct Framework``: | ||
| # 0: loadPluginsEx | ||
| # 8: unloadAllPlugins | ||
| # 16: acquireInterfaceWithClient | ||
| # 24: tryAcquireInterfaceWithClient ← we use this one | ||
| _FW_OFF_TRY_ACQUIRE = 24 | ||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # IAdapter struct layout (from omni/cubric/IAdapter.h) | ||
| # --------------------------------------------------------------------------- | ||
| # 0: getAttribute | ||
| # 8: create(AdapterId*) | ||
| # 16: refcount | ||
| # 24: retain | ||
| # 32: release(AdapterId) | ||
| # 40: bindToStage(AdapterId, const FabricId&) | ||
| # 48: unbind | ||
| # 56: compute(AdapterId, options, dirtyMode, outFlags*) | ||
| _IA_OFF_CREATE = 8 | ||
| _IA_OFF_RELEASE = 32 | ||
| _IA_OFF_BIND = 40 | ||
| _IA_OFF_COMPUTE = 56 | ||
|
|
||
| # AdapterId sentinel | ||
| _INVALID_ADAPTER_ID = ctypes.c_uint64(~0).value | ||
|
|
||
| # AdapterComputeOptions flags (from IAdapter.h) | ||
| _OPT_FORCE_UPDATE = 1 << 0 # Force update, ignoring invalidation status | ||
| _OPT_FORCE_STATE_RECONSTRUCTION = 1 << 1 # Force full rebuild of internal accel structures | ||
| _OPT_SKIP_ISOLATED = 1 << 2 # Skip prims with connectivity degree 0 | ||
| _OPT_RIGID_BODY = 1 << 3 # Use PhysicsRigidBodyAPI tag for inverse propagation | ||
|
|
||
| # Newton prims get tagged with PhysicsRigidBodyAPI at init time so | ||
| # cubric's eRigidBody mode can distinguish rigid-body buckets | ||
| # (Inverse: preserve world matrix written by Newton, derive local) | ||
| # from non-rigid-body buckets (Forward: propagate to children). | ||
| # eForceUpdate is ORed in to bypass the change-listener check. | ||
| _OPT_DEFAULT = _OPT_RIGID_BODY | _OPT_FORCE_UPDATE | ||
|
|
||
| # AdapterDirtyMode | ||
| _DIRTY_ALL = 0 # eAll — dirty all prims in the stage | ||
| _DIRTY_COARSE = 1 # eCoarse — dirty all prims in visited buckets | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # ctypes struct mirrors | ||
| # --------------------------------------------------------------------------- | ||
| class _Version(ctypes.Structure): | ||
| _fields_ = [("major", ctypes.c_uint32), ("minor", ctypes.c_uint32)] | ||
|
|
||
|
|
||
| class _InterfaceDesc(ctypes.Structure): | ||
| """``carb::InterfaceDesc`` — {const char* name, Version version}.""" | ||
| _fields_ = [ | ||
| ("name", ctypes.c_char_p), | ||
| ("version", _Version), | ||
| ] | ||
|
|
||
|
|
||
| def _read_u64(addr: int) -> int: | ||
| return ctypes.c_uint64.from_address(addr).value | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # Public API | ||
| # --------------------------------------------------------------------------- | ||
| class CubricBindings: | ||
| """Typed wrappers around the cubric ``IAdapter`` API. | ||
|
|
||
| Call :meth:`initialize` once; if it returns ``True``, the four adapter | ||
| methods are available. | ||
| """ | ||
|
|
||
| def __init__(self) -> None: | ||
| self._ia_ptr: int = 0 | ||
| self._create_fn = None | ||
| self._release_fn = None | ||
| self._bind_fn = None | ||
| self._compute_fn = None | ||
|
|
||
| # -- lifecycle ----------------------------------------------------------- | ||
|
|
||
| def initialize(self) -> bool: | ||
| """Acquire the cubric ``IAdapter`` from the carb framework.""" | ||
| # Ensure the omni.cubric extension (native carb plugin) is loaded. | ||
| try: | ||
| import omni.kit.app | ||
|
|
||
| ext_mgr = omni.kit.app.get_app().get_extension_manager() | ||
| if not ext_mgr.is_extension_enabled("omni.cubric"): | ||
| ext_mgr.set_extension_enabled_immediate("omni.cubric", True) | ||
| if not ext_mgr.is_extension_enabled("omni.cubric"): | ||
| logger.warning("Failed to enable omni.cubric extension") | ||
| return False | ||
| except Exception as exc: | ||
| logger.warning("Cannot enable omni.cubric: %s", exc) | ||
| return False | ||
|
|
||
| # Get Framework* via libcarb.so acquireFramework (singleton). | ||
| try: | ||
| libcarb = ctypes.CDLL("libcarb.so") | ||
| except OSError: | ||
| logger.warning("Could not load libcarb.so") | ||
| return False | ||
|
|
||
| libcarb.acquireFramework.restype = ctypes.c_void_p | ||
| libcarb.acquireFramework.argtypes = [ctypes.c_char_p, _Version] | ||
| fw_ptr = libcarb.acquireFramework(b"isaaclab.cubric", _Version(0, 0)) | ||
| if not fw_ptr: | ||
| logger.warning("acquireFramework returned null") | ||
| return False | ||
|
|
||
| # Read tryAcquireInterfaceWithClient fn-ptr from Framework vtable. | ||
| try_acquire_addr = _read_u64(fw_ptr + _FW_OFF_TRY_ACQUIRE) | ||
| if try_acquire_addr == 0: | ||
| logger.warning("tryAcquireInterfaceWithClient is null in Framework") | ||
| return False | ||
|
|
||
| try_acquire_fn = ctypes.CFUNCTYPE( | ||
| ctypes.c_void_p, # return: void* (IAdapter*) | ||
| ctypes.c_char_p, # clientName | ||
| _InterfaceDesc, # desc (by value) | ||
| ctypes.c_char_p, # pluginName | ||
| )(try_acquire_addr) | ||
|
|
||
| desc = _InterfaceDesc( | ||
| name=b"omni::cubric::IAdapter", | ||
| version=_Version(0, 1), | ||
| ) | ||
|
|
||
| # Try several acquisition strategies — the required client name | ||
| # varies across Kit configurations. | ||
| ia_ptr = try_acquire_fn(b"carb.scripting-python.plugin", desc, None) | ||
| if not ia_ptr: | ||
| ia_ptr = try_acquire_fn(None, desc, None) | ||
| if not ia_ptr: | ||
| acquire_addr = _read_u64(fw_ptr + 16) # acquireInterfaceWithClient | ||
| if acquire_addr: | ||
| acquire_fn = ctypes.CFUNCTYPE( | ||
| ctypes.c_void_p, | ||
| ctypes.c_char_p, | ||
| _InterfaceDesc, | ||
| ctypes.c_char_p, | ||
| )(acquire_addr) | ||
| ia_ptr = acquire_fn(b"isaaclab.cubric", desc, None) | ||
| if not ia_ptr: | ||
| logger.warning( | ||
| "Could not acquire omni::cubric::IAdapter — " | ||
| "cubric plugin may not be registered or interface version mismatch" | ||
| ) | ||
| return False | ||
| self._ia_ptr = ia_ptr | ||
|
|
||
| # Wrap the four IAdapter function pointers we need. | ||
| create_addr = _read_u64(ia_ptr + _IA_OFF_CREATE) | ||
| release_addr = _read_u64(ia_ptr + _IA_OFF_RELEASE) | ||
| bind_addr = _read_u64(ia_ptr + _IA_OFF_BIND) | ||
| compute_addr = _read_u64(ia_ptr + _IA_OFF_COMPUTE) | ||
|
|
||
| if not all([create_addr, release_addr, bind_addr, compute_addr]): | ||
| logger.warning("One or more IAdapter function pointers are null") | ||
| return False | ||
|
|
||
| self._create_fn = ctypes.CFUNCTYPE( | ||
| ctypes.c_bool, ctypes.POINTER(ctypes.c_uint64), | ||
| )(create_addr) | ||
|
|
||
| self._release_fn = ctypes.CFUNCTYPE( | ||
| ctypes.c_bool, ctypes.c_uint64, | ||
| )(release_addr) | ||
|
|
||
| # FabricId is uint64, passed by const-ref -> pointer on x86_64 | ||
| self._bind_fn = ctypes.CFUNCTYPE( | ||
| ctypes.c_bool, ctypes.c_uint64, ctypes.POINTER(ctypes.c_uint64), | ||
| )(bind_addr) | ||
|
|
||
| self._compute_fn = ctypes.CFUNCTYPE( | ||
| ctypes.c_bool, | ||
| ctypes.c_uint64, # adapterId | ||
| ctypes.c_uint32, # options (AdapterComputeOptions) | ||
| ctypes.c_int32, # dirtyMode (AdapterDirtyMode) | ||
| ctypes.c_void_p, # outAccountFlags* (nullable) | ||
| )(compute_addr) | ||
|
|
||
| logger.info("cubric IAdapter bindings ready") | ||
| return True | ||
|
|
||
| @property | ||
| def available(self) -> bool: | ||
| return self._ia_ptr != 0 | ||
|
|
||
| # -- cubric adapter methods ---------------------------------------------- | ||
|
|
||
| def create_adapter(self) -> int | None: | ||
| """Create a cubric adapter. Returns an adapter ID or ``None``.""" | ||
| if not self._create_fn: | ||
| return None | ||
| adapter_id = ctypes.c_uint64(_INVALID_ADAPTER_ID) | ||
| ok = self._create_fn(ctypes.byref(adapter_id)) | ||
| if not ok or adapter_id.value == _INVALID_ADAPTER_ID: | ||
| logger.warning("IAdapter::create failed") | ||
| return None | ||
| return adapter_id.value | ||
|
|
||
| def bind_to_stage(self, adapter_id: int, fabric_id: int) -> bool: | ||
| """Bind the adapter to a Fabric stage.""" | ||
| if not self._bind_fn: | ||
| return False | ||
| fid = ctypes.c_uint64(fabric_id) | ||
| ok = self._bind_fn(adapter_id, ctypes.byref(fid)) | ||
| if not ok: | ||
| logger.warning("IAdapter::bindToStage failed (adapter=%d, fabricId=%d)", adapter_id, fabric_id) | ||
| return ok | ||
|
|
||
| def compute(self, adapter_id: int) -> bool: | ||
| """Run the GPU transform-hierarchy compute pass. | ||
|
|
||
| Uses ``eRigidBody | eForceUpdate`` with ``eAll`` dirty mode. | ||
| ``eRigidBody`` makes cubric apply Inverse propagation on buckets | ||
| tagged with ``PhysicsRigidBodyAPI`` (keeps Newton's world transforms, | ||
| derives local) and Forward on everything else (propagates to children). | ||
| ``eForceUpdate`` bypasses the change-listener dirty check. | ||
| """ | ||
| if not self._compute_fn: | ||
| return False | ||
| flags = ctypes.c_uint32(0) | ||
| ok = self._compute_fn(adapter_id, _OPT_DEFAULT, _DIRTY_ALL, ctypes.byref(flags)) | ||
| if not ok: | ||
| logger.warning("IAdapter::compute returned false (flags=0x%x)", flags.value) | ||
| return ok | ||
|
|
||
| def release_adapter(self, adapter_id: int) -> None: | ||
| """Release an adapter.""" | ||
| if not adapter_id or not self._release_fn: | ||
| return | ||
| self._release_fn(adapter_id) | ||
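
The four-step flow described in the module docstring (create, bind, compute, release, with a CPU fallback when cubric is unavailable) could be driven roughly as sketched below. This is only an illustration: `FakeBindings` and `update_xforms` are hypothetical names that do not appear in the PR, the fake merely mirrors the method names of `CubricBindings` so the sketch runs without Kit or a GPU, and the real caller may well keep one adapter alive across frames instead of creating one per call.

```python
from __future__ import annotations

from typing import Callable


class FakeBindings:
    """Hypothetical stand-in exposing the same method names as CubricBindings.

    It records calls instead of touching Carbonite or the GPU.
    """

    def __init__(self, available: bool = True) -> None:
        self._available = available
        self.calls: list[str] = []

    def initialize(self) -> bool:
        self.calls.append("initialize")
        return self._available

    def create_adapter(self) -> int | None:
        self.calls.append("create")
        return 1 if self._available else None

    def bind_to_stage(self, adapter_id: int, fabric_id: int) -> bool:
        self.calls.append("bind")
        return True

    def compute(self, adapter_id: int) -> bool:
        self.calls.append("compute")
        return True

    def release_adapter(self, adapter_id: int) -> None:
        self.calls.append("release")


def update_xforms(bindings, fabric_id: int, cpu_fallback: Callable[[], None]) -> str:
    """Run the GPU path if possible; otherwise call cpu_fallback.

    Returns "gpu" or "cpu" so the caller can see which path ran.
    """
    if bindings.initialize():
        adapter_id = bindings.create_adapter()
        if adapter_id is not None:
            try:
                if bindings.bind_to_stage(adapter_id, fabric_id) and bindings.compute(adapter_id):
                    return "gpu"
            finally:
                # Release runs on success, on failure, and on exceptions.
                bindings.release_adapter(adapter_id)
    cpu_fallback()
    return "cpu"


fallback_hits: list[str] = []
gpu_bindings = FakeBindings(available=True)
gpu_result = update_xforms(gpu_bindings, fabric_id=7, cpu_fallback=lambda: fallback_hits.append("cpu"))

cpu_bindings = FakeBindings(available=False)
cpu_result = update_xforms(cpu_bindings, fabric_id=7, cpu_fallback=lambda: fallback_hits.append("cpu"))
```

The `try`/`finally` keeps the release step paired with every successful create, which matters once the adapter holds native resources.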
**Hardcoded byte offsets into native C++ struct layouts**

The constants `_FW_OFF_TRY_ACQUIRE = 24`, `_IA_OFF_CREATE = 8`, `_IA_OFF_RELEASE = 32`, `_IA_OFF_BIND = 40`, and `_IA_OFF_COMPUTE = 56` are hardcoded byte offsets into Carbonite's `Framework` vtable and `IAdapter`'s function-pointer table, derived from a specific build of the Carbonite SDK.

If NVIDIA inserts or reorders fields in either struct in a future Kit/Isaac Sim release, every call through these pointers will silently dispatch to the wrong function. Because the pointer reads via `_read_u64` bypass all type safety, the failure mode is either silent mis-computation or an immediate segfault, both of which are hard to diagnose.

A few mitigations worth considering:

- […] (`acquireFramework` returns a version) to bail out early when the framework version changes.
- […] `IAdapter::getAttribute` (offset 0) and verify that the returned version matches the expected `IAdapter` `0.1` version before using the other slots.
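
One way to make the layout assumption explicit, sketched here under stated assumptions rather than taken from the PR: mirror the function-pointer table as a `ctypes.Structure` so that offsets are computed from named slots, and gate every use of those slots behind a version comparison. `IAdapterTable`, `slot_offset`, and `version_matches` are hypothetical names. The slot order is copied from the offset comments in `_cubric.py`, not from `IAdapter.h`, and how the actual version would be obtained from Carbonite is left open. The fake `create` callback exists only so the sketch runs without Kit.

```python
from __future__ import annotations

import ctypes

# Slot names and order are assumptions taken from the comments in _cubric.py.
_SLOTS = ("getAttribute", "create", "refcount", "retain",
          "release", "bindToStage", "unbind", "compute")


class IAdapterTable(ctypes.Structure):
    """Hypothetical mirror of the IAdapter function-pointer table."""

    _fields_ = [(name, ctypes.c_void_p) for name in _SLOTS]


def slot_offset(name: str) -> int:
    """Byte offset of a slot, computed by ctypes from the mirror."""
    return getattr(IAdapterTable, name).offset


def version_matches(actual: tuple[int, int], expected: tuple[int, int] = (0, 1)) -> bool:
    """Exact (major, minor) match; obtaining `actual` is out of scope here."""
    return tuple(actual) == tuple(expected)


# Stand-in for a native `bool create(AdapterId*)`.
CREATE_PROTO = ctypes.CFUNCTYPE(ctypes.c_bool, ctypes.POINTER(ctypes.c_uint64))


@CREATE_PROTO
def _fake_create(out_id):
    out_id[0] = 42  # write the new adapter ID through the out-pointer
    return True


# Build a fake table in process memory, then read the slot back by name
# through the mirror instead of through a hand-written byte offset.
fake_table = IAdapterTable()
fake_table.create = ctypes.cast(_fake_create, ctypes.c_void_p).value
view = IAdapterTable.from_address(ctypes.addressof(fake_table))

adapter_id = ctypes.c_uint64(0)
ok = False
if version_matches((0, 1)) and view.create:
    ok = CREATE_PROTO(view.create)(ctypes.byref(adapter_id))
```

On a 64-bit build, `slot_offset("compute")` evaluates to 56, the same value as `_IA_OFF_COMPUTE`. A mirror like this does not detect a reordered native struct by itself; it only removes the hand-maintained constants, so the version check is still what guards against a changed layout.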