
[FEATURE] Constrain code base to statically typed Python subset #214

@oberstet

Summary

Constrain the txaio codebase to a statically typed Python subset that satisfies the following:

  1. All objects have a static type — either automatically inferred by the type checker or explicitly annotated
  2. Implicit Any is forbidden — must be eliminated or explicitly justified
  3. Public API surface is fully typed and safely inferable by tooling
  4. Alignment with modern (2025) Python typing best practices and PEP-compliant conventions

This constraint is a prerequisite for future work toward SLSA Level 4 — compiling typed Python to WASM components for reproducible, verifiable builds.

See also:


Strategic Context

This issue is part of a broader initiative to enable deterministic, reproducible compilation of the WAMP Python ecosystem (txaio, autobahn-python, zlmdb, cfxdb, wamp-xbr, crossbar) to WebAssembly.

The key architectural principle is:

Python is treated as a source language, not a runtime platform.

Type checking is not merely a linting step — it is the first stage of compilation. To use type tools as a compiler frontend, we must ensure every symbol's type is statically known.

See design documents:


Why txaio?

txaio is the foundation of the WAMP Python stack:

  • Pure-Python library for Twisted ↔ asyncio compatibility and async abstraction
  • Already well aligned with the WASI component model (separates async semantics from loop implementation)
  • Small codebase — ideal starting point for demonstrating the typed subset approach

Currently, txaio likely contains:

  • Unannotated functions and methods
  • Implicit attribute creation in __init__
  • Untyped containers and dynamic patterns

This must be addressed systematically.


What "Statically Typed Subset" Means

Required Typing Discipline

All public functions and methods must have:

  • Parameter type annotations
  • Explicit return type annotation

def create_future(result: T | None = None) -> Future[T]:
    ...
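
As a runnable illustration of a fully annotated generic factory, here is a minimal sketch that assumes asyncio as the backend. The name `create_resolved_future` and its `loop` parameter are hypothetical and are not part of txaio's API.

```python
import asyncio
from typing import TypeVar

T = TypeVar("T")


def create_resolved_future(loop: asyncio.AbstractEventLoop, result: T) -> asyncio.Future[T]:
    """Hypothetical helper: return a future that is already resolved with `result`."""
    future: asyncio.Future[T] = loop.create_future()
    future.set_result(result)
    return future


loop = asyncio.new_event_loop()
try:
    # A type checker can infer asyncio.Future[int] from the argument.
    fut = create_resolved_future(loop, 42)
    value = fut.result()
finally:
    loop.close()
```

Because both the parameter and the return type mention `T`, the element type flows from the call site to the result without any explicit annotation at the call.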

Every class must declare instance attributes upfront:

class Foo:
    x: int
    y: str

    def __init__(self, x: int, y: str) -> None:
        self.x = x
        self.y = y
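
One practical consequence of class-level declarations is that the attribute set can be read without executing `__init__`. The sketch below uses the standard `typing.get_type_hints` helper to show this; it illustrates the idea and is not a required project utility.

```python
from typing import get_type_hints


class Foo:
    # Class-level declarations: the full attribute set is visible statically.
    x: int
    y: str

    def __init__(self, x: int, y: str) -> None:
        self.x = x
        self.y = y


# Maps attribute names to their declared types, without creating an instance.
declared = get_type_hints(Foo)
```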

All module-level globals must be typed:

_log: Logger = make_logger()
DEFAULT_TIMEOUT: Final[float] = 30.0

All containers must have explicit type parameters:

callbacks: list[Callable[[], None]] = []
cache: dict[str, Future[Any]] = {}
pending: set[int] = set()

Forbidden Patterns

Implicit Any:

# BAD: Element type cannot be inferred
items = []

# GOOD: Explicit type parameter
items: list[Message] = []

Dynamic attribute creation after __init__:

# BAD: Attribute not declared at class level
self.foo = 1

# GOOD: Declared at class level with type
class Bar:
    foo: int
    def __init__(self) -> None:
        self.foo = 1

Runtime type hacks:

# FORBIDDEN
eval("...")
exec("...")
getattr(obj, dynamic_name)  # where dynamic_name is not a literal
setattr(obj, dynamic_name, value)
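
A common replacement for `getattr` with a computed name is an explicit dispatch table. The following is a minimal sketch; the `Session` class and its `on_open` / `on_close` methods are hypothetical names chosen for illustration.

```python
from collections.abc import Callable
from typing import Literal

EventName = Literal["open", "close"]


class Session:
    """Hypothetical class, used only for illustration."""

    def on_open(self) -> str:
        return "opened"

    def on_close(self) -> str:
        return "closed"


def dispatch(session: Session, event: EventName) -> str:
    # Explicit table instead of getattr(session, "on_" + event):
    # every reachable handler is named in the source.
    handlers: dict[EventName, Callable[[], str]] = {
        "open": session.on_open,
        "close": session.on_close,
    }
    return handlers[event]()
```

The set of valid event names is now part of the signature, so a misspelled event is a type error rather than a runtime `AttributeError`.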

Python Typing Style Guide

This project follows modern Python 3.11+ typing conventions aligned with official PEPs.

Minimum Python Version

Python 3.11+ is required. This enables:

  • Self type (PEP 673)
  • Required / NotRequired for TypedDict (PEP 655)
  • ExceptionGroup and except* syntax
  • Native union syntax without from __future__ import annotations

Required Import

Every module must begin with:

from __future__ import annotations

This enables:

  • Forward references without quotes (PEP 563)
  • Consistent annotation behavior
  • Future compatibility with PEP 649 (Python 3.14+)

Union Types (PEP 604)

Use X | Y syntax, not Union[X, Y] or Optional[X]:

# GOOD: Modern union syntax
def process(value: str | None) -> int | str:
    ...

# BAD: Legacy typing module imports
def process(value: Optional[str]) -> Union[int, str]:
    ...

Rationale: PEP 604 union syntax is the standard since Python 3.10. It is more readable and does not require imports from typing.

Built-in Generic Types (PEP 585)

Use lowercase built-in generics, not typing module equivalents:

# GOOD: Lowercase built-in generics
items: list[int] = []
mapping: dict[str, bytes] = {}
pair: tuple[int, str] = (1, "a")
names: set[str] = set()

# BAD: Legacy typing module imports
items: List[int] = []
mapping: Dict[str, bytes] = {}
pair: Tuple[int, str] = (1, "a")

Rationale: PEP 585 made built-in types subscriptable in Python 3.9+. The typing equivalents are now redundant.

Type Aliases

For simple aliases, use direct assignment with explicit annotation:

from collections.abc import Callable
from typing import TypeAlias

# GOOD: Simple type alias
Callback: TypeAlias = Callable[[int], None]

# For Python 3.12+, prefer PEP 695 syntax:
type Callback = Callable[[int], None]

TypeVar and Generics

Use TypeVar for generic functions and classes:

from typing import TypeVar

T = TypeVar("T")

def identity(x: T) -> T:
    return x

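For bounded and constrained type variables:

from collections.abc import Hashable

T = TypeVar("T", bound=Hashable)  # bounded: any subtype of Hashable
AnyStr = TypeVar("AnyStr", str, bytes)  # constrained: exactly str or exactly bytes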
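
The same mechanism applies to generic classes. Below is a minimal sketch of a generic container; `Stack` is a hypothetical class used only to show how the type parameter ties the methods together.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """Hypothetical container, for illustration only."""

    _items: list[T]

    def __init__(self) -> None:
        self._items = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


stack: Stack[int] = Stack()
stack.push(1)
stack.push(2)
top = stack.pop()  # checker sees: int
```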

Protocol for Structural Typing (PEP 544)

Prefer Protocol over ad-hoc duck typing:

from typing import Protocol

class Closeable(Protocol):
    def close(self) -> None: ...

def cleanup(resource: Closeable) -> None:
    resource.close()
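
The key property of a Protocol is that implementers do not need to inherit from it. The sketch below repeats the `Closeable` definition so that it runs on its own; `FakeSocket` is a hypothetical resource.

```python
from typing import Protocol


class Closeable(Protocol):
    def close(self) -> None: ...


class FakeSocket:
    """Hypothetical resource; note that it does not inherit from Closeable."""

    closed: bool

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def cleanup(resource: Closeable) -> None:
    resource.close()


sock = FakeSocket()
cleanup(sock)  # accepted structurally: FakeSocket has a matching close()
```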

Constants with Final (PEP 591)

Use Final for module-level constants:

from typing import Final

MAX_RETRIES: Final[int] = 3
DEFAULT_ENCODING: Final[str] = "utf-8"

Literal Types (PEP 586)

Use Literal for specific value constraints:

from typing import Literal

def set_mode(mode: Literal["read", "write", "append"]) -> None:
    ...
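
Values arriving from outside the typed code (configuration, network input) are plain `str` and have to be validated before they can be treated as a `Literal`. One possible sketch follows; the alias `Mode` and the helper `parse_mode` are hypothetical names.

```python
from typing import Literal

Mode = Literal["read", "write", "append"]


def parse_mode(raw: str) -> Mode:
    """Hypothetical helper: convert untrusted input into a Mode or fail loudly."""
    # Each branch returns a literal value, so no cast and no Any is needed.
    if raw == "read":
        return "read"
    if raw == "write":
        return "write"
    if raw == "append":
        return "append"
    raise ValueError(f"unsupported mode: {raw!r}")
```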

Overloads for Conditional Return Types (PEP 484)

Use @overload when return type depends on argument values:

from typing import overload

@overload
def get_value(key: str, default: None = None) -> str | None: ...
@overload
def get_value(key: str, default: str) -> str: ...

def get_value(key: str, default: str | None = None) -> str | None:
    result = _lookup(key)
    return result if result is not None else default

Self Type (PEP 673)

Use Self for methods returning the same class type:

from typing import Self

class Builder:
    _name: str

    def with_name(self, name: str) -> Self:
        self._name = name
        return self

Avoiding Any

Any defeats static analysis and must be avoided.

If unavoidable, Any must be:

  1. Explicitly justified in a comment
  2. Isolated to minimal scope
  3. Wrapped in a typed facade if possible

# ACCEPTABLE: External library returns untyped data
# TODO(typing): Remove when upstream provides types
raw_data: Any = external_lib.fetch()  # type: ignore[no-untyped-call]
result: ProcessedData = validate_and_convert(raw_data)
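
A typed facade can be sketched with the standard library alone, since `json.loads` is itself annotated as returning `Any`. The `Endpoint` dataclass and `parse_endpoint` function below are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical typed result, for illustration only."""

    host: str
    port: int


def parse_endpoint(payload: str) -> Endpoint:
    # json.loads returns Any; the untyped value never leaves this function.
    raw: Any = json.loads(payload)
    if not isinstance(raw, dict):
        raise TypeError("expected a JSON object")
    host = raw.get("host")
    port = raw.get("port")
    if not isinstance(host, str) or not isinstance(port, int):
        raise TypeError("expected 'host' to be str and 'port' to be int")
    return Endpoint(host=host, port=port)


endpoint = parse_endpoint('{"host": "localhost", "port": 8080}')
```

Callers only ever see `Endpoint`, so the `Any` is confined to a single local variable.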

Docstrings

Remove redundant :type: and :rtype: annotations when type hints are present:

# GOOD: Type in signature, description in docstring
def connect(host: str, port: int, timeout: float = 30.0) -> Connection:
    """
    Establish a connection to the specified host.

    :param host: The hostname or IP address.
    :param port: The port number.
    :param timeout: Connection timeout in seconds.
    :returns: An established connection.
    :raises ConnectionError: If connection fails.
    """

# BAD: Redundant type information
def connect(host: str, port: int, timeout: float = 30.0) -> Connection:
    """
    Establish a connection to the specified host.

    :param host: The hostname or IP address.
    :type host: str  # REMOVE - redundant
    :param port: The port number.
    :type port: int  # REMOVE - redundant
    :rtype: Connection  # REMOVE - redundant
    """

Rationale: Modern documentation tools (Sphinx, mkdocs) can extract type information from annotations. Duplicating types in docstrings creates a maintenance burden and risks divergence between the two.


Type Checking Configuration

Primary Tool: ty (strict mode)

ty is the authoritative type checker for this project.

Configuration in pyproject.toml:

[tool.ty.environment]
python-version = "3.11"

Strict mode invocation:

ty check --warn any-type

Goals:

  • Eliminate all implicit Any
  • Ensure type inference succeeds everywhere
  • Ensure every reported error reflects a real potential runtime issue

Linting: ruff

Enforce annotation presence and style using ruff:

[tool.ruff]
target-version = "py311"
line-length = 120

[tool.ruff.lint]
select = [
    "ANN",  # flake8-annotations
    "I",    # isort
    "E",    # pycodestyle errors
    "F",    # pyflakes
    "W",    # pycodestyle warnings
    "UP",   # pyupgrade (modernize syntax)
    "TCH",  # flake8-type-checking (imports)
]

[tool.ruff.lint.flake8-annotations]
mypy-init-return = true
suppress-none-returning = false
allow-star-arg-any = false

[tool.ruff.lint.isort]
required-imports = ["from __future__ import annotations"]

Implementation Workflow

Phase 1: Configuration

  1. Update pyproject.toml with ruff and ty configuration
  2. Add from __future__ import annotations to all modules
  3. Add type checking to justfile recipes

Phase 2: Progressive Typing

  1. Run ruff check . — fix all ANN violations
  2. Run ty check --warn any-type — eliminate errors
  3. Address one module at a time, starting with core APIs
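
To get a rough per-module inventory before starting, a short standard-library script can list public functions that lack annotations. This is only an approximation of what the ANN rules report, not ruff's implementation; the function name `find_unannotated` is hypothetical, and the sketch ignores `*args` and `**kwargs`.

```python
import ast


def find_unannotated(source: str) -> list[str]:
    """Return names of public functions missing a parameter or return annotation."""
    missing: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name.startswith("_"):
            continue  # private and dunder functions are skipped in this sketch
        params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        untyped = [
            a.arg for a in params
            if a.annotation is None and a.arg not in ("self", "cls")
        ]
        if untyped or node.returns is None:
            missing.append(node.name)
    return missing


sample = '''
def typed(x: int) -> int:
    return x

def untyped(x):
    return x
'''
report = find_unannotated(sample)
```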

Phase 3: CI Integration

  1. Add type checking to CI workflow
  2. Gate PRs on passing type checks
  3. Document typed subset contract

Scope and Constraints

In scope:

  • Add type annotations to all public APIs
  • Declare class-level attributes
  • Type all containers explicitly
  • Add from __future__ import annotations to all files

Out of scope (for this issue):

  • Runtime behavior changes
  • Logic rewrites (unless required for stable types)
  • Test typing (tracked separately)

Acceptance Criteria

  • All public functions/methods have parameter and return type annotations
  • All classes declare instance attributes at class level
  • All containers have explicit type parameters
  • No implicit Any (explicit Any only with justification)
  • from __future__ import annotations in every module
  • ty check --warn any-type passes with zero errors
  • ruff check . passes with zero errors (ANN rules enabled)
  • Type checking added to CI

Related Work

  • PR #1838 (autobahn-python): Community contribution adding type hints — to be aligned with this style guide
  • SLSA Level 3 implementation: Current focus on provenance; typed subset enables future Level 4
  • WASM compilation roadmap: Typed subset is prerequisite for Python → WASM compiler frontend

References

PEPs

PEP      Title                                          Relevance
PEP 484  Type Hints                                     Foundation
PEP 526  Variable Annotations                           Class attributes
PEP 544  Protocols: Structural subtyping                Interface typing
PEP 563  Postponed Evaluation of Annotations            from __future__ import annotations
PEP 585  Type Hinting Generics In Standard Collections  list[T] vs List[T]
PEP 586  Literal Types                                  Literal["a", "b"]
PEP 591  Adding a final qualifier                       Final[T]
PEP 604  Union Operators                                X | Y syntax
PEP 612  Parameter Specification Variables              ParamSpec
PEP 655  Required and NotRequired for TypedDict         TypedDict fields
PEP 673  Self Type                                      Self return type
PEP 695  Type Parameter Syntax                          Python 3.12+ type statement

Tools

  • ty — Astral's type checker (strict mode)
  • ruff — Fast Python linter with annotation rules
  • pyright — Alternative type checker (reference)

Checklist

  • I have searched existing issues to avoid duplicates
  • I have described the problem clearly
  • I have provided use cases
  • I have considered alternatives
  • I have assessed impact and breaking changes
