Skip to content

sqlalchemy-spanner: _designate_type crashes with KeyError on TOKENLIST columns #16621

@waiho-gumloop

Description

@waiho-gumloop

sqlalchemy-spanner version

1.17.3 (latest as of 2026-04-12; also reproducible on 1.17.2)

google-cloud-spanner version

3.63.0

What happened?

SpannerDialect._designate_type() crashes with KeyError: 'TOKENLIST' when introspecting tables that have TOKENLIST columns (used for Spanner full-text search indexes).

Reproduction

-- Create a table with a TOKENLIST column (standard Spanner FTS pattern)
ALTER TABLE my_table ADD COLUMN search_content STRING(MAX);

ALTER TABLE my_table
  ADD COLUMN search_content_tokens TOKENLIST
  AS (TOKENIZE_FULLTEXT(search_content)) HIDDEN;

CREATE SEARCH INDEX MySearchIndex
  ON my_table(search_content_tokens)
  PARTITION BY some_id;
from sqlalchemy import create_engine, inspect

engine = create_engine("spanner+spanner:///projects/.../instances/.../databases/...")
inspector = inspect(engine)
# This crashes:
cols = inspector.get_columns("my_table")
Traceback (most recent call last):
  File ".../google/cloud/sqlalchemy_spanner/sqlalchemy_spanner.py", line 1216, in _designate_type
    return _type_map[str_repr]
KeyError: 'TOKENLIST'

Root cause

`_type_map` does not include `TOKENLIST`, and `_designate_type()` has no fallback for unknown types — it does a bare dict lookup that raises `KeyError`.

Impact

This blocks all table introspection (not just the TOKENLIST table) for any database that uses Spanner full-text search, because tools like `sqlacodegen` and `MetaData.reflect()` call `get_multi_columns()` which hits this crash. The entire introspection fails, not just the one column.

Two issues

  1. `TOKENLIST` is missing from `_type_map`. It should map to a proper `TypeEngine` subclass (not `NullType` — see note below) so that schema introspection roundtrips correctly through Alembic and other DDL tools, and so TOKENLIST columns can be referenced in `SEARCH()`/`SCORE()`/`SNIPPET()` queries through SQLAlchemy.

  2. `_designate_type()` has no fallback for unknown types. Any new Spanner type will crash introspection until the dialect is updated.

Why not `NullType`?

`_type_map_inv` maps `NullType → "INT64"`. If TOKENLIST were mapped to NullType, any roundtrip through Alembic `--autogenerate` would silently produce `INT64` DDL instead of `TOKENLIST`. A proper `TypeEngine` subclass preserves type identity.

Why not filter HIDDEN columns?

TOKENLIST columns are always `HIDDEN` (excluded from `SELECT *`), but they are readable by name and actively used in full-text search queries (`SEARCH()`, `SCORE()`, `SNIPPET()`, `DEBUG_TOKENLIST()`). Filtering them from introspection would prevent users from referencing them in ORM models. The dialect should reflect them with their proper type and let consumers decide how to handle them.

Prior art

The Go client had the same issue and fixed it by adding `TOKENLIST` as a first-class `TypeBase` constant:

Suggested fix

Two complementary changes:

  1. Add a first-class `TOKENLIST` type — a `TypeEngine` subclass with entries in `_type_map` (forward), `_type_map_inv` (inverse), and `SpannerTypeCompiler` (DDL generation).

  2. Add a fallback in `_designate_type()` — `try/except KeyError` returning `NullType` with a `warnings.warn()` for truly unknown future types.

Fix PR: #16622

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions