Nvidia Blackwell Support #7
2 issues found across 7 files
Prompt for AI agents (all 2 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="BENCHMARKS.md">
<violation number="1" location="BENCHMARKS.md:50">
P2: Incorrect import pattern in code example. The L0 API exports `run` directly from the module, so `from l0 import l0` will fail. Use `import l0` instead, and replace `json_rule()` with an actual exported guardrail like `l0.JSON_ONLY_GUARDRAILS` or `l0.Guardrails.recommended()`.</violation>
</file>
<file name="tests/test_benchmark.py">
<violation number="1" location="tests/test_benchmark.py:291">
P2: Inconsistent TTFT measurement: `start_time` is set after `_internal_run` completes, but in `run_baseline_benchmark` it's set before iteration. This makes time-to-first-token comparisons between baseline and L0 benchmarks invalid. Consider moving `start_time` before the `_internal_run` call to be consistent.</violation>
</file>
BENCHMARKS.md
Outdated
```
Configure via `check_intervals`:

from l0 import l0
```
P2: Incorrect import pattern in code example. The L0 API exports run directly from the module, so from l0 import l0 will fail. Use import l0 instead, and replace json_rule() with an actual exported guardrail like l0.JSON_ONLY_GUARDRAILS or l0.Guardrails.recommended().
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At BENCHMARKS.md, line 50:
<comment>Incorrect import pattern in code example. The L0 API exports `run` directly from the module, so `from l0 import l0` will fail. Use `import l0` instead, and replace `json_rule()` with an actual exported guardrail like `l0.JSON_ONLY_GUARDRAILS` or `l0.Guardrails.recommended()`.</comment>
<file context>
@@ -0,0 +1,97 @@
+
+Configure via `check_intervals`:
+```python
+from l0 import l0
+
+result = await l0.run(
</file context>
✅ Addressed in 768c1a8
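The failure mode behind this comment can be reproduced without the real package. The sketch below registers a stub `l0` module that, like the API described in the review, exports `run` at module level; the stub and its names are illustrative, not the repo's actual code.

```python
import sys
import types

# Stub module standing in for the real `l0` package, which (per the
# review) exports `run` directly at module level.
l0_stub = types.ModuleType("l0")

async def _run(**kwargs):  # stand-in for the real l0.run
    return kwargs

l0_stub.run = _run
sys.modules["l0"] = l0_stub

import l0  # correct pattern: the module itself exposes `run`

try:
    from l0 import l0 as _inner  # the pattern the review flags
    import_failed = False
except ImportError:
    # No `l0` attribute or submodule exists inside the module, so the
    # documented example would crash at import time.
    import_failed = True

print(callable(l0.run), import_failed)  # → True True
```

This is why `import l0` is the safe form for the docs: it works regardless of which names the package re-exports internally.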
tests/test_benchmark.py
Outdated
```python
    check_intervals=check_intervals,
)

start_time = time.perf_counter()
```
P2: Inconsistent TTFT measurement: start_time is set after _internal_run completes, but in run_baseline_benchmark it's set before iteration. This makes time-to-first-token comparisons between baseline and L0 benchmarks invalid. Consider moving start_time before the _internal_run call to be consistent.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/test_benchmark.py, line 291:
<comment>Inconsistent TTFT measurement: `start_time` is set after `_internal_run` completes, but in `run_baseline_benchmark` it's set before iteration. This makes time-to-first-token comparisons between baseline and L0 benchmarks invalid. Consider moving `start_time` before the `_internal_run` call to be consistent.</comment>
<file context>
@@ -0,0 +1,1044 @@
+ check_intervals=check_intervals,
+ )
+
+ start_time = time.perf_counter()
+
+ async for event in result:
</file context>
✅ Addressed in 9d74796
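The measurement skew described here can be shown in isolation. In this hedged sketch, `asyncio.sleep` stands in for both `_internal_run` and the token stream; the names are illustrative, not the benchmark's actual API. The point is that `start_time` must be taken before the setup call so baseline and L0 runs time the same interval.

```python
import asyncio
import time

async def _token_stream(n=3, delay=0.01):
    # Stand-in for the model's token stream.
    for i in range(n):
        await asyncio.sleep(delay)
        yield f"tok{i}"

async def measure_ttft(setup_delay=0.02):
    start_time = time.perf_counter()   # moved BEFORE the setup call
    await asyncio.sleep(setup_delay)   # stands in for `_internal_run(...)`
    ttft = None
    async for _ in _token_stream():
        if ttft is None:
            # First token observed: TTFT now includes setup time,
            # matching how the baseline benchmark measures it.
            ttft = time.perf_counter() - start_time
    return ttft

ttft = asyncio.run(measure_ttft())
print(ttft >= 0.02)  # → True: setup latency is counted in TTFT
```

Taking `start_time` after setup instead would silently subtract the setup cost from only one side of the comparison, which is exactly the inconsistency the review flags.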
Summary by cubic
Optimized streaming guardrails and drift detection for Nvidia Blackwell–class throughput and added a full benchmark suite and docs. L0 now sustains 90K+ tokens/s with full features, providing ample headroom for 1000+ t/s models.
Performance
Benchmarks
Written for commit dcb1a47. Summary will update automatically on new commits.