Summary
The Spanner DBAPI layer (spanner_dbapi) always retries aborted transactions internally by replaying all recorded statements and validating checksums. There is no way to disable this behavior. Applications that implement their own transaction retry logic (re-invoking a callable with a fresh session on abort) experience nested retry loops that cause severe contention amplification under concurrent writes.
Background
When commit() receives an Aborted exception from Spanner, the DBAPI enters an internal retry loop in TransactionRetryHelper.retry_transaction(). This loop replays all statements recorded during the transaction and validates checksums of read results to ensure consistency. It retries up to 50 times with exponential backoff.
This mechanism was designed for Django and other PEP 249 ORMs that build transactions incrementally through individual cursor.execute() calls (original motivation: googleapis/python-spanner-django#34). In this model, the DBAPI layer is the only component that can retry — the ORM has no concept of "re-run this transaction from scratch."
However, many applications use a different pattern: wrapping the entire transaction in a callable and re-invoking it on abort (similar to Session.run_in_transaction). For these applications, the internal retry is unnecessary and harmful.
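Such an application-level retry loop typically looks like the following sketch. The helper name is hypothetical, and `Aborted` is a local stand-in for `google.api_core.exceptions.Aborted` to keep the snippet self-contained:

```python
import random
import time

class Aborted(Exception):
    """Stand-in for google.api_core.exceptions.Aborted."""

def run_with_retries(work, max_attempts=10, base_delay=0.01):
    """Re-invoke `work` from scratch on every abort, with jittered
    exponential backoff. `work` must open its own fresh transaction,
    perform all reads and writes, and commit, so re-running it is safe."""
    for attempt in range(max_attempts):
        try:
            return work()
        except Aborted:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
```

Because each attempt re-reads current data, this pattern does not depend on replayed results matching earlier ones, which is exactly where the internal replay struggles.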
The nested retry problem
When an application wraps transactions in its own retry loop and the DBAPI also retries internally, the two layers interfere:
- Contention amplification (thundering herd): The internal replay re-acquires locks on the same rows that caused the original abort. Under concurrent writes, each replay attempt can abort another thread's replay, leading to exponential retry growth across threads.
- Wasted wall-clock time: The internal retry loop accumulates 13–19 seconds of lock wait time (observed in production with 10 concurrent writers) before finally raising RetryAborted. The outer application retry then starts fresh, having wasted all that time.
- Checksum mismatches on contended rows: For read-modify-write patterns, replayed reads almost always return different data (because another transaction committed in between), causing _compare_checksums() to fail. The internal retry is structurally unable to succeed in this scenario — it always falls through to RetryAborted after exhausting retries.
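The structural failure of the last point is easy to see in isolation. A toy illustration (the checksum function here is illustrative, not the library's actual implementation):

```python
import hashlib
import pickle

def checksum(rows):
    # Illustrative order-sensitive digest of a result set.
    return hashlib.sha256(pickle.dumps(rows)).hexdigest()

# First attempt: the transaction reads balance = 100, and the DBAPI
# records a checksum of that result before the commit aborts.
recorded = checksum([(100,)])

# A concurrent transaction commits balance = 120 in the meantime, so
# the internal replay re-executes the read and sees different data.
replayed = checksum([(120,)])

# The mismatch is detected, and the replay cannot transparently succeed.
assert recorded != replayed
```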
Relevant code paths
- connection.py L505-515 — Connection.commit() catches Aborted, calls retry_transaction(), then recursively calls commit()
- transaction_helper.py L165-210 — TransactionRetryHelper.retry_transaction()
- checksum.py L64-80 — _compare_checksums() raises RetryAborted on checksum mismatch
- exceptions.py L165-172 — RetryAborted
Timeline
Other Spanner clients already expose this choice:
- RETRY_ABORTS_INTERNALLY — connection property to toggle internal retries
- ReadWriteStmtBasedTransaction (with internal retry) vs ReadWriteTransaction (without) as separate APIs
Proposed Change
Add a retry_aborts_internally parameter to Connection and connect(), following the same pattern used for read_only and request_priority:
- Default True — preserves existing behavior; no breaking change
- When False — commit() wraps Aborted in RetryAborted and raises immediately, bypassing the statement-replay loop
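The resulting control flow in commit() could be sketched roughly as follows. Class and method names are simplified stand-ins, not the real DBAPI internals:

```python
class Aborted(Exception):
    """Stand-in for google.api_core.exceptions.Aborted."""

class RetryAborted(Exception):
    """Stand-in for the DBAPI's RetryAborted."""

class Connection:
    """Simplified sketch of the proposed commit() control flow."""

    def __init__(self, retry_aborts_internally=True):
        self._retry_aborts_internally = retry_aborts_internally
        self._attempts = 0

    @property
    def retry_aborts_internally(self):
        return self._retry_aborts_internally

    @retry_aborts_internally.setter
    def retry_aborts_internally(self, value):
        self._retry_aborts_internally = bool(value)

    def commit(self):
        try:
            self._commit_once()
        except Aborted as exc:
            if not self._retry_aborts_internally:
                # New behavior: surface the abort immediately so the
                # application's own retry loop can re-run from scratch.
                raise RetryAborted("transaction aborted") from exc
            self._retry_transaction()  # existing statement-replay path
            self.commit()

    def _commit_once(self):
        # Placeholder for the real commit RPC: aborts on the first try.
        self._attempts += 1
        if self._attempts == 1:
            raise Aborted("simulated abort")

    def _retry_transaction(self):
        # Placeholder for TransactionRetryHelper.retry_transaction().
        pass
```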
Files changed
- connection.py — Add retry_aborts_internally parameter to __init__ and connect(), add a property getter/setter, and modify commit() to check the flag
- test_connection.py — 8 new unit tests
Usage

```python
from google.cloud.spanner_dbapi import connect
from sqlalchemy import create_engine

# Default (unchanged) — internal retry enabled
conn = connect(instance_id, database_id, project=project)

# Disable internal retry for application-managed retries
conn = connect(instance_id, database_id, project=project,
               retry_aborts_internally=False)

# SQLAlchemy via connect_args
engine = create_engine("spanner:///...",
                       connect_args={"retry_aborts_internally": False})
```
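With the flag disabled, the application owns recovery end to end. A sketch of that pattern follows; the connection factory is passed in and RetryAborted is a local stand-in so the snippet stays self-contained (in real code it would be imported from google.cloud.spanner_dbapi.exceptions):

```python
import random
import time

class RetryAborted(Exception):
    """Stand-in for google.cloud.spanner_dbapi.exceptions.RetryAborted."""

def run_transaction(connect_fn, work, max_attempts=10, base_delay=0.01):
    """Re-run `work` on a fresh connection each time commit() surfaces
    an abort (as proposed when retry_aborts_internally=False)."""
    for attempt in range(max_attempts):
        conn = connect_fn()  # e.g. connect(..., retry_aborts_internally=False)
        try:
            result = work(conn)
            conn.commit()
            return result
        except RetryAborted:
            conn.rollback()
            if attempt == max_attempts - 1:
                raise
            # Jittered exponential backoff before re-running from scratch.
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
        finally:
            conn.close()
```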
Production impact
In our workload (10 concurrent writers updating JSON array columns on the same row):
| Configuration | Success rate | Abort-to-recovery time |
| --- | --- | --- |
| Default (nested retries) | ~55% | 13–19 seconds |
| retry_aborts_internally=False + app retry | 98–100% | 0.01–0.08 seconds |
Related
- RETRY_ABORTS_INTERNALLY connection property in other Spanner clients
- ReadWriteStmtBasedTransaction vs ReadWriteTransaction in the Go client