Releases: ai-2070/l0-python
🏎️ L0-Python 0.19.0 - Performance Improvements
This release introduces optimizations to our core drift detection logic and updates our event tracing system for better performance.
🚀 Performance Improvements
Drift detection has been significantly optimized by pre-compiling all regex patterns and removing repeated per-check compilation. This reduces overhead across tone, format, repetition, markdown, and hedging detection while preserving identical behavior. The changes are entirely internal but materially improve throughput under high-token streaming workloads.
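Pre-compiling is a standard Python pattern: bind the compiled pattern object once at module import and reuse it on every check. A minimal sketch of the idea (the pattern and function names here are illustrative, not L0's actual internals):

```python
import re

# Compiled once at import time, reused on every check (fast path).
_HEDGING_RE = re.compile(r"\b(perhaps|possibly|it seems)\b", re.IGNORECASE)

def check_hedging_fast(text: str) -> bool:
    return bool(_HEDGING_RE.search(text))

def check_hedging_slow(text: str) -> bool:
    # Anti-pattern: pays a cache lookup (or recompilation) on every call.
    return bool(re.search(r"\b(perhaps|possibly|it seems)\b", text, re.IGNORECASE))
```

Note that Python's `re` module does cache compiled patterns internally, but the per-call cache lookup still adds measurable overhead in a hot per-token loop, which is what pre-compilation avoids.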
🧭 Deterministic Callback IDs (UUIDv7)
Guardrail and observability callbacks now use UUIDv7-based IDs instead of UUIDv4. UUIDv7 is time-ordered and faster to generate, improving traceability and event ordering in high-concurrency and distributed systems while maintaining global uniqueness.
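Python 3.13's standard library has no `uuid.uuid7()` (it arrives in 3.14), so a library generating these IDs must construct them itself. A minimal RFC 9562 sketch of the layout — a 48-bit Unix-millisecond timestamp followed by version, variant, and random bits — not L0's actual generator:

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    """Minimal UUIDv7: 48-bit Unix ms timestamp + 74 random bits (RFC 9562)."""
    ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    value = ((ts_ms & 0xFFFFFFFFFFFF) << 80) | rand
    # Overwrite the version nibble (bits 76-79) with 7.
    value &= ~(0xF << 76)
    value |= 0x7 << 76
    # Overwrite the variant bits (bits 62-63) with 0b10 (RFC 4122/9562).
    value &= ~(0x3 << 62)
    value |= 0x2 << 62
    return uuid.UUID(int=value)
```

Because the most significant 48 bits are the timestamp, lexicographic/byte order of these IDs matches creation order at millisecond granularity, which is what makes event ordering in logs cheap.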
🔥 Benchmark Results
Test Environment
- CPU: Apple M1 Max (10 cores)
- Runtime: Python 3.13, pytest 9 with pytest-asyncio 1.3.0
- Methodology: Mock token streams with zero inter-token delay to measure pure L0 overhead
| Scenario | Tokens/s | Avg Duration | TTFT | Overhead |
|---|---|---|---|---|
| Baseline (raw streaming) | 1,518,271 | 1.32 ms | 0.02 ms | - |
| L0 Core (no features) | 551,696 | 3.63 ms | 0.08 ms | 175% |
| L0 + JSON Guardrail | 469,922 | 4.26 ms | 0.07 ms | 223% |
| L0 + All Guardrails | 367,328 | 5.44 ms | 0.08 ms | 313% |
| L0 + Drift Detection | 119,758 | 16.70 ms | 0.08 ms | 1166% |
| L0 Full Stack | 108,257 | 18.48 ms | 0.07 ms | 1301% |
📦 Installation
```shell
pip install ai2070-l0
# or
pip install ai2070-l0[openai]
pip install ai2070-l0[litellm]
```

🙀 L0-Python 0.18.0 - Full Pydantic Model Suite
This release delivers a complete Pydantic model export layer for every major L0 type.
✨ New: Full Pydantic Model Suite (l0.pydantic)
L0 now provides a complete Pydantic BaseModel mirror of every major internal dataclass.
You can now import Pydantic equivalents for:
- Core types (`StateModel`, `RetryModel`, `TimeoutModel`, `TelemetryModel`, etc.)
- Consensus models
- Drift detection
- Guardrails
- Metrics snapshots
- Parallel/race operations
- Pipeline execution
- Pool operations
- Event sourcing + replay
- Observability events
- Windowing/document chunking
Example:

```python
from l0.pydantic import StateModel, RetryModel, DriftResultModel

state = StateModel(content="hello", token_count=5)
json_data = state.model_dump_json()
schema = StateModel.model_json_schema()
```

This enables:
- Typed JSON schemas for OpenAPI/SDKs
- Runtime-safe structured logging
- Interop with FastAPI / Litestar
- Persisting structured observability events
- Easier debugging & replay
📦 The new module contains over 1,500 lines of typed models, covering all L0 dataclasses.
📈 Benchmark Improvements
BENCHMARKS.md received several updates:
- Updated environment to Python 3.13, `pytest` 9, and `pytest-asyncio` 1.3.0
- Clarified methodology
- Updated Nvidia Blackwell section
- Added Python 3.14 performance note: Pydantic import overhead currently impacts async iteration speed by ~30% in Python 3.14; this appears to be a Pydantic compatibility issue, not a Python regression
- Updated instructions for running benchmarks (now explicitly using Python 3.13)
🧩 Summary of Changes
| Area | Change |
|---|---|
| Pydantic Export Layer | Full Pydantic BaseModel suite for all L0 types |
| README | New Pydantic section + improvements |
| Benchmarks | Updated environment, performance notes, 3.14 caveats, commands |
| Events | Updated/expanded Pydantic event definitions |
| Testing | New comprehensive Pydantic model tests |
🎯 Why This Matters
This release lays the foundation for:
- Strong typing across every L0 subsystem
- First-class OpenAPI / schema-driven integrations
- Richer tooling: dashboards, telemetry pipelines, logging processors
- Fully typed observability + replay pipelines
- Easier internal and external adapter development
L0 now provides one of the most complete type-model sets in the Python AI ecosystem.
🐍 Python v0.17.0 - High-Throughput Upgrade
The Python runtime for L0 receives the same performance-focused overhaul as the TypeScript version, targeting Nvidia Blackwell support. This release introduces incremental JSON guardrails, sliding-window drift detection, new high-throughput defaults, and a brand-new benchmark suite demonstrating Python’s ability to sustain 120K+ tokens/sec.
This update includes major internal upgrades across guardrails and drift detection.
✨ Highlights
1. ⚡ Incremental JSON Guardrails (O(delta) cost)
`json_rule()` has been rewritten to match the new TS architecture:
- New `IncrementalJsonState` dataclass
- Tracks braces, brackets, string/escape state incrementally
- Only processes the delta (new characters), not the full content
- Full `analyze_json_structure()` executed only at stream completion
- Automatic state reset on new/shortened streams
Result: ~5–10× faster per-token guardrail checks under streaming load.
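The dataclass name `IncrementalJsonState` comes from the release notes above, but the fields and logic below are an illustrative sketch of the incremental technique, not L0's source. The key property is that `feed()` only scans characters appended since the last call:

```python
from dataclasses import dataclass

@dataclass
class IncrementalJsonState:
    """Tracks JSON structural balance incrementally, O(delta) per update."""
    braces: int = 0
    brackets: int = 0
    in_string: bool = False
    escaped: bool = False
    seen: int = 0  # characters already processed

    def feed(self, content: str) -> None:
        # Reset if the stream restarted or shrank.
        if len(content) < self.seen:
            self.__init__()
        for ch in content[self.seen:]:  # scan only the new delta
            if self.escaped:
                self.escaped = False
            elif self.in_string:
                if ch == "\\":
                    self.escaped = True
                elif ch == '"':
                    self.in_string = False
            elif ch == '"':
                self.in_string = True
            elif ch == "{":
                self.braces += 1
            elif ch == "}":
                self.braces -= 1
            elif ch == "[":
                self.brackets += 1
            elif ch == "]":
                self.brackets -= 1
        self.seen = len(content)

    @property
    def balanced(self) -> bool:
        return self.braces == 0 and self.brackets == 0 and not self.in_string
```

Because per-token cost no longer grows with accumulated content length, total guardrail work over a stream drops from O(n²) to O(n), which is where the ~5–10× figure under streaming load comes from.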
2. 🎯 Sliding Window Drift Detection
`DriftConfig` now includes:

```python
sliding_window_size: int = 500
```

Drift detection now:
- Analyzes only the last N characters
- Meta commentary, repetition, markdown collapse, tone shift all run on the window
- Reduces drift-detection cost by O(content_length) → O(window_size)
- Matches the TS implementation for cross-platform parity
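The windowing idea can be sketched in a few lines. Here `_META_COMMENTARY` is a single illustrative pattern standing in for the full detector set, and the function name is hypothetical:

```python
import re

SLIDING_WINDOW_SIZE = 500  # matches the new DriftConfig default

# One illustrative signal; L0's real detectors cover tone, repetition,
# markdown collapse, and more.
_META_COMMENTARY = re.compile(r"(as an ai|i cannot|i'm sorry)", re.IGNORECASE)

def detect_drift(content: str, window_size: int = SLIDING_WINDOW_SIZE) -> bool:
    window = content[-window_size:]  # O(window_size) regardless of stream length
    return bool(_META_COMMENTARY.search(window))
```

The trade-off is deliberate: drift signals appear in recent output, so analyzing only the tail preserves detection quality while capping the per-check cost on arbitrarily long streams.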
3. 🚀 New High-Throughput Default Intervals
Python now uses the same optimized defaults as TS:
| Interval | Old | New |
|---|---|---|
| Guardrails | 5 tokens | 15 |
| Drift | 10 tokens | 25 |
| Checkpoint | 10 tokens | 20 |
Updated in `ADVANCED.md` and `CheckIntervals` (`src/l0/types.py`).
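A sketch of how such a config gates per-token work (the field and function names here are assumptions, not L0's actual `CheckIntervals` API):

```python
from dataclasses import dataclass

@dataclass
class CheckIntervals:
    """Hypothetical mirror of the new defaults from the table above."""
    guardrails: int = 15   # was 5
    drift: int = 25        # was 10
    checkpoint: int = 20   # was 10

def should_check_guardrails(token_index: int, iv: CheckIntervals) -> bool:
    # Run guardrails only every Nth token instead of on every token.
    return token_index % iv.guardrails == 0
```

Tripling the intervals cuts the number of expensive checks per stream by roughly 2-3×, at the cost of detecting a violation a handful of tokens later.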
4. 🧪 New Benchmark Suite (BENCHMARKS.md)
Full benchmarking added (99 additions):
- Baseline vs core vs guardrails vs drift vs full-stack
- Measured on Apple M1 Max with Python 3.13
- Python achieves 1.5M tokens/sec raw iteration and 120K TPS full-stack with all guardrails enabled
- Ready for 1000+ TPS Nvidia Blackwell inference loads
Benchmarks include reproducible pytest commands.
🗑️ Targeted Deletions / Optimization Removals
- Removed old full-content drift detection paths
- Removed malformed-pattern reporting in streaming phase (now done incrementally)
- Removed obsolete default interval values (5/10/10)
- Removed non-window-based drift comparisons to last full content
L0 for Python - Initial Release (Full Lifecycle + Event Compatibility)
This is the first release of L0 for Python, the deterministic execution substrate for reliable AI streaming - now with full lifecycle parity and event-type compatibility with the TypeScript implementation.
L0 provides the missing reliability layer for all AI streams: deterministic token delivery, retries, fallbacks, guardrails, drift detection, checkpoint resumption, network protection, and full observability - all transparently wrapped around any LLM provider stream.
This release is built for production workloads and ships with 1,800+ tests, real adapter integrations for OpenAI and LiteLLM (100+ providers), and a fully instrumented streaming runtime covering 25+ structured lifecycle events.
🔥 Key Highlights
✅ Full Lifecycle Compatibility
The Python version now includes the complete deterministic lifecycle flow - retries, fallbacks, checkpoints, resume logic, guardrail phases, drift detection, tool-call phases, and completion flow identical in semantics to the TypeScript implementation.
All lifecycle callbacks (on_start, on_event, on_violation, on_retry, on_fallback, on_resume, on_timeout, etc.) are implemented and follow the same event order and guarantees.
🎛️ Central Event Bus with 25+ Structured Event Types
This release introduces the full observability and event-sourcing infrastructure:
- `SESSION_START`, `STREAM_INIT`, `ADAPTER_DETECTED`
- `TIMEOUT_*`, `RETRY_*`, `FALLBACK_*`
- `GUARDRAIL_*`, `DRIFT_*`, `CHECKPOINT_SAVED`
- `TOOL_REQUESTED`, `TOOL_RESULT`, `TOOL_ERROR`
- `SESSION_SUMMARY` & `SESSION_END`
These events enable complete introspection, replay, debugging, supervision, and telemetry in production systems.
⚡ Deterministic Streaming Runtime
- Token-by-token normalization
- Timeout enforcement (initial + inter-token)
- Checkpointing and last-known-good-token resumption
- Drift detection & pattern-based guardrails
- Network protection across 12+ failure patterns
🔁 Smart Retries & Fallbacks
- Distinguishes model errors from network/transient errors
- Sequential fallback chain with `on_fallback` telemetry
- Full retry/fallback reasoning surfaced through lifecycle events
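AWS's "equal jitter" strategy is one common reading of fixed-jitter backoff: half the exponential delay is deterministic, half is random, so retries stay spread out without ever collapsing to near-zero waits. A sketch with assumed base/cap values, not L0's actual defaults:

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Fixed (equal) jitter: half deterministic exponential, half random."""
    exp = min(cap, base * (2 ** attempt))
    return exp / 2 + random.uniform(0, exp / 2)
```

Compared with full jitter (uniform over the whole exponential window), this guarantees a minimum spacing between attempts, which plays well with rate-limited providers.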
🧱 Structured Output with Automatic Repair
- Native Pydantic integration
- Corrects malformed JSON (missing braces, broken fences, trailing commas)
- Guaranteed schema validity
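The three repair steps named above (fences, trailing commas, missing braces) can be sketched naively as below; this is an illustration of the idea, not L0's actual repair pipeline, which is more robust:

```python
import json
import re

def repair_json(text: str) -> dict:
    """Naive sketch: strip fences, drop trailing commas, close open braces."""
    # Strip markdown code fences around the payload.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    # Remove trailing commas before } or ].
    text = re.sub(r",\s*([}\]])", r"\1", text)
    # Append any missing closing braces (brackets omitted for brevity).
    text += "}" * max(0, text.count("{") - text.count("}"))
    return json.loads(text)
```

In practice the repaired text is then validated against the target Pydantic model, so malformed output either becomes schema-valid or fails loudly.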
🔌 Adapters
- OpenAI adapter (auto-detected)
- LiteLLM adapter (100+ providers)
- Full API-compatible adapter protocol for custom providers
🧪 Battle-Tested
- 1,800+ unit tests
- 100+ integration tests simulating real streaming conditions
📦 Installation
```shell
pip install ai2070-l0
# or
pip install ai2070-l0[openai]
pip install ai2070-l0[litellm]
```

🏁 Quick Example
```python
import asyncio

from openai import AsyncOpenAI

import l0


async def main():
    client = l0.wrap(AsyncOpenAI())
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True,
    )
    async for event in response:
        if event.is_token:
            print(event.text, end="", flush=True)


asyncio.run(main())
```