perf(genesis): optimize step loop with batched state reads and render_depth toggle #37

Shr1ftyy · 2026-02-11T08:14:51Z

Summary

Batch GPU→CPU state transfers (robot + object reads) via torch.cat to minimize sync overhead
Add render_depth param/env-var to disable depth rendering when unused by policy
Pre-compute actuated_dof_idx and physics_steps once in __init__ instead of per-step
Log errors in fallback object-state reads instead of silently swallowing them

Benchmark Results

Configuration	Before	After	Improvement
Default (all rendering)	31.93 FPS	35.74 FPS	+11.9%

Per-component gains

Component	Before	After	Saved
Robot state read	4.11 ms	2.58 ms	-1.53 ms
Object state read	1.34 ms	1.05 ms	-0.29 ms
Action apply	2.29 ms	1.54 ms	-0.75 ms

API Changes

One new optional parameter added to GenesisBaseEnvironment.__init__() and G1Environment.__init__():

render_depth: bool = True — enable/disable depth rendering (default: enabled, backward-compatible)
Also configurable via GENESIS_RENDER_DEPTH env var inside evaluation containers

Benchmark script gains --no-depth CLI flag.

Testing

ruff check . ✅
ty check . ✅
pytest tests/unit/test_genesis.py — 72 passed ✅

Summary by CodeRabbit

Release Notes

New Features
- Added optional depth rendering control for Genesis environments; disable via --no-depth CLI flag or environment variable for improved step performance.
Documentation
- Added Genesis rendering configuration reference guide covering depth rendering controls and environment setup.
Chores
- Updated NumPy dependency constraints for improved compatibility.

Batch GPU→CPU state transfers, add configurable render frame-skip, optional depth rendering, cached image encoding, and pre-computed per-step constants. Brings env.step() from ~32 FPS to ~47.5 FPS (with render_interval=2), closing 95% of the gap to 50 Hz realtime. Key changes: - Batch 6 robot state reads into single torch.cat().cpu() transfer - Batch object state reads similarly - Add render_interval param: skip camera render on intermediate steps - Add render_depth param: disable depth when unused by policy - Cache encoded images across frame-skip steps (avoid redundant base64) - Pre-compute actuated_dof_idx and physics_steps in __init__ - Update benchmark script with --render-interval and --no-depth flags

coderabbitai · 2026-02-11T08:15:09Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Added optional depth-rendering control (constructor arg + GENESIS_RENDER_DEPTH), batched GPU→CPU state/object reads, precomputed actuated-DOF indices and physics-step reuse, camera move-to-attached-link before capture, conditional depth capture and image caching/encoding changes.

Changes

Cohort / File(s)	Summary
Base Environment `kinitro/environments/genesis/base.py`	Added `render_depth` ctor arg and env fallback (`_render_depth`); precompute `_actuated_dof_idx` and `_physics_steps`; batch GPU→CPU reads for robot and objects; use precomputed indices in `_apply_action`; conditional depth capture; maintain explicit RGB/Depth encoded caches.
Environment Subclass `kinitro/environments/genesis/envs/g1_humanoid.py`	Propagated new `render_depth: bool
CLI / Benchmark `scripts/benchmark_genesis.py`	Benchmark CLI and `benchmark()` accept depth toggle (`--no-depth`); pass `render_depth` to `G1Environment`; print depth-rendering state.
Env Example `.env.example`	Added commented `GENESIS_RENDER_DEPTH` example line in Genesis rendering section.
Packaging / Static Analysis `pyproject.toml`	Capped `numpy` to `<2.3` and added `torch` to `tool.ty.analysis.allowed-unresolved-imports`.
Docs `docs/backend-guide.md`, `environments/README.md`	Documented `GENESIS_RENDER_DEPTH`, constructor precedence, and new executor verification env vars and behavior.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Agent
    participant Env as GenesisEnv
    participant Camera
    participant GPU
    participant CPU
    participant Encoder

    Agent->>Env: step(action)
    Env->>Env: _apply_action(action using _actuated_dof_idx)
    Env->>GPU: simulate(steps = _physics_steps)
    Env->>Env: decide_render(_render_depth, caches)
    alt render required
        Env->>Camera: move_to_attached_link()
        Camera->>GPU: render_frame(RGB and maybe Depth)
        GPU->>CPU: batch_read_frames (torch.cat)
        CPU->>Encoder: encode_base64_if_needed
        Encoder->>Env: store_encoded_images
    else reuse cached frames
        Env->>Env: return cached encoded images
    end
    Env->>Agent: observation(state + encoded images)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

"🐇 I hop through frames with careful cheer,

Batching reads and keeping near,
Depth optional, joints in line,
Cached pixels saved just fine.
A rabbit nudges render clear."

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 72.73% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: performance optimizations through batched state reads and a render_depth toggle feature for the Genesis environment.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch perf/genesis-step-optimizations

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

torch is a runtime-only dependency bundled with Genesis in the Docker container and not installed in the CI environment.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@kinitro/environments/genesis/base.py`:
- Around line 759-761: The cached encoded payloads are never invalidated when a
fresh render updates pixel buffers, so add invalidation in _capture_camera: when
you assign self._cached_rgb and/or self._cached_depth (the fresh-render path in
_capture_camera), also clear self._cached_rgb_encoded and
self._cached_depth_encoded (e.g., set to None or empty) so subsequent calls to
_build_observation will re-encode the new buffers; locate these members in
_capture_camera and ensure you only invalidate the encoded cache when you
replace the raw buffer references.

🧹 Nitpick comments (4)

kinitro/environments/genesis/base.py (3)
266-285: LGTM on precomputed indices and caching fields.

Clean initialization of rendering caches and precomputed action indices. One minor note:

Line 281: int(self.CONTROL_DT / self.SIM_DT) truncates rather than rounds. With the current constants (0.02 / 0.01 = 2.0) this is fine, but if either constant is ever changed to a value where floating-point division doesn't land exactly, int() will silently drop a substep. round(...) would be a safer choice.
Safer rounding for physics substeps
-        self._physics_steps = max(1, int(self.CONTROL_DT / self.SIM_DT))
+        self._physics_steps = max(1, round(self.CONTROL_DT / self.SIM_DT))
720-725: Silent except Exception: pass swallows per-object read failures.

In the fallback path, individual object read failures are silently discarded with no logging. This can mask real issues (e.g., an entity becoming invalid mid-episode). At minimum, add a logger.debug to match the outer handler.

As per coding guidelines, "Avoid broad except Exception unless you re-raise or return a clear error."
Add debug logging for individual object failures
                 except Exception:
-                    pass
+                    logger.debug("object_state_read_failed", object_id=obj_id)
786-795: Identity-based cache check is fragile (downstream of the caching bug above).

The cam_rgb is not self._cached_rgb check on Line 787 will always be False for fresh renders (same reference), so the guard relies entirely on the is None fallback. With the fix above (resetting _cached_rgb_encoded = None in _capture_camera), this works correctly — but the is not arm is dead code. Consider simplifying to just check for None:
Simplify to a None-only check
-            if cam_rgb is not self._cached_rgb or self._cached_rgb_encoded is None:
+            if self._cached_rgb_encoded is None:
                 self._cached_rgb_encoded = encode_image(cam_rgb)
             rgb["ego"] = self._cached_rgb_encoded
         depth: dict[str, dict | list] = {}
         if cam_depth is not None:
-            if cam_depth is not self._cached_depth or self._cached_depth_encoded is None:
+            if self._cached_depth_encoded is None:
                 self._cached_depth_encoded = encode_image(cam_depth)
             depth["ego"] = self._cached_depth_encoded
scripts/benchmark_genesis.py (1)

207-228: Decomposed benchmark loop doesn't reflect --render-interval / --no-depth.

The decomposed profiling loop (Lines 208–228) always renders every step with depth=True, so its timings represent the unoptimized baseline. The env.step() loop (Lines 252–258) captures the optimized path. This is fine for component-level profiling, but could confuse users comparing the two sections — consider adding a brief note in the output (e.g., "Decomposed timings use default rendering settings") to set expectations.

kinitro/environments/genesis/base.py

When _capture_camera produces new pixel data, the base64-encoded caches were not cleared, causing _build_observation to serve stale encoded frames from the first render for the entire episode.

…depth Docker eval containers can now set GENESIS_RENDER_INTERVAL and GENESIS_RENDER_DEPTH environment variables to tune rendering without any executor protocol changes. Constructor args still take precedence when passed explicitly.

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@kinitro/environments/genesis/base.py`:
- Around line 266-293: The code currently calls
int(os.environ.get("GENESIS_RENDER_INTERVAL", "1")) which will raise ValueError
on malformed env values; update the constructor logic around render_interval
(the branch that reads GENESIS_RENDER_INTERVAL and sets self._render_interval)
to defensively parse the env var with a try/except: attempt to convert the env
string to int, on ValueError fall back to the default (1) and proceed to apply
max(1, render_interval); optionally emit a warning via the module/class logger
when the env var is invalid so developers can notice misconfiguration.

🧹 Nitpick comments (2)

kinitro/environments/genesis/base.py (2)
726-732: Silent pass in fallback swallows per-object read errors with zero observability.

The outer batch failure is logged at debug level, but individual object failures in the fallback are silently discarded. If all per-object reads fail, callers receive an empty dict with no indication. Consider at minimum a debug-level log.
♻️ Add minimal logging
                 except Exception:
-                    pass
+                    logger.debug("object_state_read_failed", object_id=obj_id)
795-806: Identity-based cache reuse for encoded images is correct but subtle — worth a brief inline comment.

The is not check works because _capture_camera stores the reference to _cached_rgb and invalidates _cached_rgb_encoded on fresh renders, while frame-skip steps return the same cached reference with the encoded cache intact. The logic is correct, but the two conditions interact across two methods. A one-line comment explaining why is is safe here would help future readers.

kinitro/environments/genesis/base.py

…ENESIS_RENDER_DEPTH) Add configuration docs to backend guide, environments README, and .env.example for the new container-level rendering knobs.

…ender interval example Replace silent 'except Exception: pass' in the individual object read fallback loop with a debug-level structured log so failures are observable without breaking the fallback behavior. Update .env.example GENESIS_RENDER_INTERVAL to recommend 2 as the default and rewrite the comment to explain the frame-skip tradeoff clearly.

Remove GENESIS_RENDER_INTERVAL and all associated render frame-skip machinery. The camera now renders every step unconditionally, eliminating the cached-frame fields, identity-based encode caching, and the render_interval constructor/env-var plumbing across base.py, g1_humanoid.py, benchmark script, and documentation. GENESIS_RENDER_DEPTH remains as the sole rendering knob.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

scripts/benchmark_genesis.py (1)
213-215: ⚠️ Potential issue | 🟡 Minor

Decomposed benchmark loop ignores render_depth flag.

The manual camera.render(rgb=True, depth=True) on Line 214 always renders depth, even when --no-depth is passed. This means the per-component timings won't reflect the actual depth-disabled performance, while the env.step() comparison loop (Line 253) will skip depth — making the two measurements inconsistent.
Proposed fix
-            rgb, depth, _seg, _normal = camera.render(rgb=True, depth=True)
+            rgb, depth, _seg, _normal = camera.render(rgb=True, depth=render_depth)

🧹 Nitpick comments (3)

kinitro/environments/genesis/base.py (3)
730-733: Accessing _camera._attached_link is fragile.

This relies on a private attribute of the Genesis camera object. If the Genesis library changes its internal API, this will break silently or raise AttributeError.

Consider wrapping in hasattr or getattr(..., None) for resilience:
Proposed defensive check
-            if self._camera._attached_link is not None:
+            if getattr(self._camera, "_attached_link", None) is not None:
746-748: Redundant depth suppression after render call.

depth is already None when render_depth=False is passed to camera.render() (most renderers skip computation and return None). If Genesis does return a non-None depth tensor even when depth=False, this guard is necessary — but it would be worth a comment explaining why it's needed so future readers don't remove it as dead code.

766-772: Type annotation dict[str, dict | list] is overly broad.

The actual values are always dict (from encode_image). Consider dict[str, dict[str, Any]] for accuracy if you want to keep it simple, or just dict[str, Any].

…prehension Replace index-based parallel iteration with zip pairs and a dict comprehension for the batch path. The fallback loop now iterates over (obj, entity) tuples directly instead of manual indexing.

Replace manual offset arithmetic with np.split(packed, cumsum(sizes)) so split points are derived from actual tensor shapes. Removes all hand-computed index variables (dof_start, dof_end, vel_start).

Depth rendering is rarely used by current policies and adds overhead. Default GENESIS_RENDER_DEPTH to false; operators can opt in with GENESIS_RENDER_DEPTH=true if needed.

Replace string-matching boolean parsing (true/false/yes/no) with a simple int() cast. Cleaner, fails fast on bad input, and matches common env var conventions.

fix(ci): allow torch as unresolved import in ty type checker

4eb07d5

torch is a runtime-only dependency bundled with Genesis in the Docker container and not installed in the CI environment.

coderabbitai bot reviewed Feb 11, 2026

View reviewed changes

kinitro/environments/genesis/base.py Outdated Show resolved Hide resolved

Shr1ftyy added 2 commits February 11, 2026 18:52

fix(genesis): invalidate encoded image caches on fresh renders

9ba3dc8

When _capture_camera produces new pixel data, the base64-encoded caches were not cleared, causing _build_observation to serve stale encoded frames from the first render for the entire episode.

coderabbitai bot reviewed Feb 11, 2026

View reviewed changes

kinitro/environments/genesis/base.py Outdated Show resolved Hide resolved

Shr1ftyy added 3 commits February 11, 2026 19:59

docs: document Genesis rendering env vars (GENESIS_RENDER_INTERVAL, G…

54b9860

…ENESIS_RENDER_DEPTH) Add configuration docs to backend guide, environments README, and .env.example for the new container-level rendering knobs.

Shr1ftyy changed the title ~~perf(genesis): optimize step loop for ~49% faster env.step() throughput~~ perf(genesis): optimize step loop with batched state reads and render_depth toggle Feb 11, 2026

coderabbitai bot reviewed Feb 11, 2026

View reviewed changes

Shr1ftyy added 5 commits February 11, 2026 21:48

refactor(genesis): simplify _read_object_states with zip and dict com…

5ec60a4

…prehension Replace index-based parallel iteration with zip pairs and a dict comprehension for the batch path. The fallback loop now iterates over (obj, entity) tuples directly instead of manual indexing.

refactor(genesis): use np.split for robot state unpacking

e761325

Replace manual offset arithmetic with np.split(packed, cumsum(sizes)) so split points are derived from actual tensor shapes. Removes all hand-computed index variables (dof_start, dof_end, vel_start).

docs(genesis): add inline comments to batch object state read

aa1fd43

feat(genesis): disable depth rendering by default

a44c863

Depth rendering is rarely used by current policies and adds overhead. Default GENESIS_RENDER_DEPTH to false; operators can opt in with GENESIS_RENDER_DEPTH=true if needed.

refactor(genesis): simplify GENESIS_RENDER_DEPTH to 0/1 integer parsing

b66e0f7

Replace string-matching boolean parsing (true/false/yes/no) with a simple int() cast. Cleaner, fails fast on bad input, and matches common env var conventions.

Shr1ftyy merged commit de97b9d into main Feb 11, 2026
2 checks passed

Shr1ftyy deleted the perf/genesis-step-optimizations branch February 11, 2026 12:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(genesis): optimize step loop with batched state reads and render_depth toggle #37

perf(genesis): optimize step loop with batched state reads and render_depth toggle #37

Uh oh!

Shr1ftyy commented Feb 11, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 11, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

perf(genesis): optimize step loop with batched state reads and render_depth toggle #37

perf(genesis): optimize step loop with batched state reads and render_depth toggle #37

Uh oh!

Conversation

Shr1ftyy commented Feb 11, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Benchmark Results

Per-component gains

API Changes

Testing

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Shr1ftyy commented Feb 11, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 11, 2026 •

edited

Loading