
Releases: ai-2070/l0-python

🏎️ L0-Python 0.19.0 - Performance Improvements

13 Dec 23:58


This release introduces optimizations to our core drift detection logic and updates our event tracing system for better performance.


🚀 Performance Improvements

Drift detection has been significantly optimized by pre-compiling all regex patterns and removing repeated per-check compilation. This reduces overhead across tone, format, repetition, markdown, and hedging detection while preserving identical behavior. The changes are entirely internal but materially improve throughput under high-token streaming workloads.
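The pattern behind this optimization can be sketched as follows (the regexes and the `count_hedges` helper are illustrative, not L0's internals): compile each detection pattern once at module load, then reuse it on every check instead of paying `re.compile` per call.

```python
import re

# Compile detection patterns once at import time instead of on every check.
# These example patterns are hypothetical stand-ins for L0's real ones.
HEDGING_PATTERNS = [
    re.compile(r"\b(might|perhaps|possibly)\b", re.IGNORECASE),
    re.compile(r"\bit (seems|appears)\b", re.IGNORECASE),
]

def count_hedges(text: str) -> int:
    # Each call reuses the pre-compiled patterns; no per-check compilation.
    return sum(len(p.findall(text)) for p in HEDGING_PATTERNS)
```

Since `re.compile` cost is paid once per pattern rather than once per token batch, the hot path reduces to pure matching.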


🧭 Deterministic Callback IDs (UUIDv7)

Guardrail and observability callbacks now use UUIDv7-based IDs instead of UUIDv4. UUIDv7 is time-ordered and faster to generate, improving traceability and event ordering in high-concurrency and distributed systems while maintaining global uniqueness.
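For illustration, a minimal UUIDv7 generator per RFC 9562 looks roughly like this (a sketch only; L0's actual generator may differ, and Python 3.14 ships `uuid.uuid7()` natively):

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    # Minimal UUIDv7 sketch (RFC 9562): a 48-bit Unix millisecond
    # timestamp in the high bits makes IDs sort by creation time.
    ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")
    value = (ts_ms & ((1 << 48) - 1)) << 80       # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= ((rand >> 62) & 0x0FFF) << 64        # bits 75..64: rand_a
    value |= 0b10 << 62                           # bits 63..62: variant
    value |= rand & ((1 << 62) - 1)               # bits 61..0: rand_b
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, lexicographic and numeric ordering of the IDs follows wall-clock order at millisecond granularity, which is what makes event ordering cheap in distributed traces.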


🔥 Benchmark Results

Test Environment

  • CPU: Apple M1 Max (10 cores)
  • Runtime: Python 3.13, pytest 9 with pytest-asyncio 1.3.0
  • Methodology: Mock token streams with zero inter-token delay to measure pure L0 overhead
| Scenario | Tokens/s | Avg Duration | TTFT | Overhead |
| --- | --- | --- | --- | --- |
| Baseline (raw streaming) | 1,518,271 | 1.32 ms | 0.02 ms | - |
| L0 Core (no features) | 551,696 | 3.63 ms | 0.08 ms | 175% |
| L0 + JSON Guardrail | 469,922 | 4.26 ms | 0.07 ms | 223% |
| L0 + All Guardrails | 367,328 | 5.44 ms | 0.08 ms | 313% |
| L0 + Drift Detection | 119,758 | 16.70 ms | 0.08 ms | 1166% |
| L0 Full Stack | 108,257 | 18.48 ms | 0.07 ms | 1301% |

📦 Installation

pip install ai2070-l0
# or with a provider extra (quoted so the brackets survive shells like zsh)
pip install "ai2070-l0[openai]"
pip install "ai2070-l0[litellm]"

🙀 L0-Python 0.18.0 - Full Pydantic Model Suite

10 Dec 14:07
b5b77f1


This release delivers a complete Pydantic model export layer for every major L0 type.


✨ New: Full Pydantic Model Suite (l0.pydantic)

L0 now provides a complete Pydantic BaseModel mirror of every major internal dataclass.

You can now import Pydantic equivalents for:

  • Core types (StateModel, RetryModel, TimeoutModel, TelemetryModel, etc.)
  • Consensus models
  • Drift detection
  • Guardrails
  • Metrics snapshots
  • Parallel/race operations
  • Pipeline execution
  • Pool operations
  • Event sourcing + replay
  • Observability events
  • Windowing/document chunking

Example:

from l0.pydantic import StateModel, RetryModel, DriftResultModel

state = StateModel(content="hello", token_count=5)
json_data = state.model_dump_json()
schema = StateModel.model_json_schema()

This enables:

  • Typed JSON schemas for OpenAPI/SDKs
  • Runtime-safe structured logging
  • Interop with FastAPI / Litestar
  • Persisting structured observability events
  • Easier debugging & replay

📦 The new module contains over 1,500 lines of typed models, covering all L0 dataclasses.


📈 Benchmark Improvements

BENCHMARKS.md received several updates:

  • Updated environment to Python 3.13, pytest 9, and pytest-asyncio 1.3.0
  • Clarified methodology
  • Updated Nvidia Blackwell section
  • Added Python 3.14 performance note:
    Pydantic import overhead currently impacts async iteration speed by ~30% in Python 3.14; this appears to be a Pydantic compatibility issue, not a Python regression
  • Updated instructions for running benchmarks (now explicitly using Python 3.13)

🧩 Summary of Changes

| Area | Change |
| --- | --- |
| Pydantic Export Layer | Full Pydantic BaseModel suite for all L0 types |
| README | New Pydantic section + improvements |
| Benchmarks | Updated environment, performance notes, 3.14 caveats, commands |
| Events | Updated/expanded Pydantic event definitions |
| Testing | New comprehensive Pydantic model tests |

🎯 Why This Matters

This release lays the foundation for:

  • Strong typing across every L0 subsystem
  • First-class OpenAPI / schema-driven integrations
  • Richer tooling: dashboards, telemetry pipelines, logging processors
  • Fully typed observability + replay pipelines
  • Easier internal and external adapter development

L0 now provides one of the most complete type-model sets in the Python AI ecosystem.

🐍 Python v0.17.0 - High-Throughput Upgrade

08 Dec 22:40


The Python runtime for L0 receives the same performance-focused overhaul as the TypeScript version targeting Nvidia Blackwell support. This release introduces incremental JSON guardrails, sliding-window drift detection, new high-throughput defaults, and a brand-new benchmark suite demonstrating Python’s ability to sustain 120K+ tokens/sec.

This update includes major internal upgrades across guardrails and drift detection.


✨ Highlights

1. ⚡ Incremental JSON Guardrails (O(delta) cost)

json_rule() has been rewritten to match the new TS architecture:

  • New IncrementalJsonState dataclass
  • Tracks braces, brackets, string/escape state incrementally
  • Only processes delta (new characters), not full content
  • Full analyze_json_structure() executed only at stream completion
  • Automatic state reset on new/shortened streams

Result: ~5–10× faster per-token guardrail checks under streaming load.
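The incremental idea can be sketched like this (names are illustrative; L0's `IncrementalJsonState` differs in detail): keep running brace/bracket/string state between checks and scan only the new suffix each time.

```python
from dataclasses import dataclass

@dataclass
class JsonStreamState:
    # Hypothetical sketch of the incremental technique: structural
    # balance is tracked across calls instead of re-scanning the buffer.
    depth: int = 0        # open braces/brackets not yet closed
    in_string: bool = False
    escaped: bool = False
    seen: int = 0         # characters already consumed

def feed(state: JsonStreamState, content: str) -> JsonStreamState:
    # Only the new suffix is scanned: O(len(delta)), not O(len(content)).
    delta = content[state.seen:]
    for ch in delta:
        if state.escaped:
            state.escaped = False
        elif state.in_string:
            if ch == "\\":
                state.escaped = True
            elif ch == '"':
                state.in_string = False
        elif ch == '"':
            state.in_string = True
        elif ch in "{[":
            state.depth += 1
        elif ch in "}]":
            state.depth -= 1
    state.seen = len(content)
    return state
```

Each per-token check then touches only the delta, which is why the cost stays flat as the stream grows.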


2. 🎯 Sliding Window Drift Detection

DriftConfig now includes:

sliding_window_size: int = 500

Drift detection now:

  • Analyzes only the last N characters
  • Meta-commentary, repetition, markdown-collapse, and tone-shift checks all run on the window
  • Reduces drift-detection cost from O(content_length) to O(window_size)
  • Matches the TS implementation for cross-platform parity
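A windowed check can be sketched as follows (the `detect_repetition` helper and its threshold are assumptions for illustration, not L0's API):

```python
def detect_repetition(content: str, window_size: int = 500) -> bool:
    # Illustrative windowed drift check: slice the tail of the stream so
    # cost is O(window_size) regardless of total content length.
    window = content[-window_size:]
    words = window.lower().split()
    if len(words) < 8:
        return False
    # Flag drift if any single word dominates the window (assumed threshold).
    top = max(words.count(w) for w in set(words))
    return top / len(words) > 0.5
```

The slice `content[-window_size:]` is the whole trick: every detector sees a bounded input no matter how long the stream runs.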

3. 🚀 New High-Throughput Default Intervals

Python now uses the same optimized defaults as TS:

| Interval | Old | New |
| --- | --- | --- |
| Guardrails | 5 tokens | 15 tokens |
| Drift | 10 tokens | 25 tokens |
| Checkpoint | 10 tokens | 20 tokens |

Updated in ADVANCED.md and CheckIntervals (src/l0/types.py).
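As a sketch of what the new defaults look like as configuration (field names are assumptions for illustration; see `CheckIntervals` in `src/l0/types.py` for the real definition):

```python
from dataclasses import dataclass

@dataclass
class CheckIntervals:
    # Values match the new 0.17.0 defaults; field names are assumed.
    guardrails: int = 15   # was 5
    drift: int = 25        # was 10
    checkpoint: int = 20   # was 10

    def due(self, token_index: int, interval: int) -> bool:
        # Run a check only every `interval` tokens.
        return token_index > 0 and token_index % interval == 0
```

Raising the intervals trades a little detection latency for substantially fewer checks per stream, which is where the high-throughput headroom comes from.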


4. 🧪 New Benchmark Suite (BENCHMARKS.md)

Full benchmarking added (99 additions):

  • Baseline vs core vs guardrails vs drift vs full-stack
  • Measured on Apple M1 Max with Python 3.13
  • Python achieves 1.5M tokens/sec raw iteration and 120K TPS full-stack with all guardrails enabled
  • Ready for 1000+ TPS Nvidia Blackwell inference loads

Benchmarks include reproducible pytest commands.


🗑️ Targeted Deletions / Optimization Removals

  1. Removed old full-content drift detection paths
  2. Removed malformed-pattern reporting in streaming phase (now done incrementally)
  3. Removed obsolete default interval values (5/10/10)
  4. Removed non-window-based drift comparisons to last full content

L0 for Python - Initial Release (Full Lifecycle + Event Compatibility)

08 Dec 04:19


This is the first release of L0 for Python, the deterministic execution substrate for reliable AI streaming - now with full lifecycle parity and event-type compatibility with the TypeScript implementation.

L0 provides the missing reliability layer for all AI streams: deterministic token delivery, retries, fallbacks, guardrails, drift detection, checkpoint resumption, network protection, and full observability - all transparently wrapped around any LLM provider stream.

This release is built for production workloads and ships with 1,800+ tests, real adapter integrations for OpenAI and LiteLLM (100+ providers), and a fully instrumented streaming runtime covering 25+ structured lifecycle events.


🔥 Key Highlights

Full Lifecycle Compatibility

The Python version now includes the complete deterministic lifecycle flow - retries, fallbacks, checkpoints, resume logic, guardrail phases, drift detection, tool-call phases, and completion flow identical in semantics to the TypeScript implementation.
All lifecycle callbacks (on_start, on_event, on_violation, on_retry, on_fallback, on_resume, on_timeout, etc.) are implemented and follow the same event order and guarantees.

🎛️ Central Event Bus with 25+ Structured Event Types

This release introduces the full observability and event-sourcing infrastructure:

  • SESSION_START, STREAM_INIT, ADAPTER_DETECTED
  • TIMEOUT_*, RETRY_*, FALLBACK_*
  • GUARDRAIL_*, DRIFT_*, CHECKPOINT_SAVED
  • TOOL_REQUESTED, TOOL_RESULT, TOOL_ERROR
  • SESSION_SUMMARY & SESSION_END

These events enable complete introspection, replay, debugging, supervision, and telemetry in production systems.

Deterministic Streaming Runtime

  • Token-by-token normalization
  • Timeout enforcement (initial + inter-token)
  • Checkpointing and last-known-good-token resumption
  • Drift detection & pattern-based guardrails
  • Network protection across 12+ failure patterns

🔁 Smart Retries & Fallbacks

  • Distinguishes model errors from network/transient errors
  • Sequential fallback chain with on_fallback telemetry
  • AWS-style fixed-jitter backoff by default
  • Full retry/fallback reasoning surfaced through lifecycle events
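AWS-style fixed-jitter backoff can be sketched like this (parameter names and defaults are illustrative, not L0's exact signature): half the exponential delay is fixed, half is randomized.

```python
import random

def fixed_jitter_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    # Exponential backoff capped at `cap`, with "equal jitter":
    # a fixed half plus a uniformly random half.
    exp = min(cap, base * (2 ** attempt))
    return exp / 2 + random.uniform(0, exp / 2)
```

The fixed half guarantees a minimum spacing between retries, while the jittered half spreads concurrent clients apart so they don't retry in lockstep.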

🧱 Structured Output with Automatic Repair

  • Native Pydantic integration
  • Corrects malformed JSON (missing braces, broken fences, trailing commas)
  • Guaranteed schema validity
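A minimal version of this kind of repair might look like the following (illustrative only; L0's repair logic handles more cases than this):

```python
import json
import re

def repair_json(raw: str) -> dict:
    # Sketch of common LLM-output repairs: strip code fences, drop
    # trailing commas, close a missing final brace.
    text = raw.strip()
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text).strip()
    text = re.sub(r",\s*([}\]])", r"\1", text)          # trailing commas
    if text.count("{") > text.count("}"):
        text += "}" * (text.count("{") - text.count("}"))
    return json.loads(text)
```

Running repairs like these before schema validation is what lets the library hand back a guaranteed-valid Pydantic model even when the raw stream was malformed.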

🔌 Adapters

  • OpenAI adapter (auto-detected)
  • LiteLLM adapter (100+ providers)
  • Full API-compatible adapter protocol for custom providers

🧪 Battle-Tested

  • 1,800+ unit tests
  • 100+ integration tests simulating real streaming conditions

📦 Installation

pip install ai2070-l0
# or with a provider extra (quoted so the brackets survive shells like zsh)
pip install "ai2070-l0[openai]"
pip install "ai2070-l0[litellm]"

🏁 Quick Example

import asyncio
from openai import AsyncOpenAI
import l0

async def main():
    # Wrap the provider client; L0 adds retries, guardrails, and drift
    # detection around the stream transparently.
    client = l0.wrap(AsyncOpenAI())

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True,
    )

    async for event in response:
        if event.is_token:
            print(event.text, end="", flush=True)

asyncio.run(main())