diff --git a/RFC-direct-decode.md b/RFC-direct-decode.md
new file mode 100644
index 000000000..3a4a7c354
--- /dev/null
+++ b/RFC-direct-decode.md
@@ -0,0 +1,1055 @@
+# RFC: Direct Row Codecs for persistent
+
+## Status
+
+Draft. I have a proof-of-concept implementation for PostgreSQL that I'm getting into a publishable state.
+
+## Summary
+
+Add a backend-agnostic direct codec layer to persistent that bypasses `PersistValue` entirely. On the **decode** side, typed query results go from the database's wire format to Haskell records without intermediate representations. On the **encode** side, Haskell values go from record fields to the wire format without detouring through `PersistValue`.
+
+The decode path uses continuation-passing style (CPS) throughout, so the success path allocates zero `Either` constructors and zero `PersistValue` wrappers. The encode path operates similarly, but in the opposite direction.
+
+The existing `PersistValue`-based path is unchanged. All current code continues to compile and run without modification.
+
+## Motivation
+
+### Allocation overhead
+
+Every row decoded from any persistent backend follows this path today:
+
+```mermaid
+flowchart LR
+    A["wire format"] -->|"decode to PersistValue"| B["PersistValue per column"]
+    B -->|"cons cell"| C["[PersistValue]"]
+    C -->|"fromPersistValue per field"| D["Haskell record"]
+```
+
+For a 10-column row, this allocates:
+- 10 `PersistValue` heap objects (tagged union, 2+ words each)
+- 10 list cons cells (3 words each)
+- ~9 intermediate `Either Text (a -> b -> ...)` values from the applicative `fromPersistValues` chain
+- Then `fromPersistValue` pattern-matches each `PersistValue` again to extract the payload
+
+That works out to roughly 70 words (~560 bytes) of pure overhead per row, all short-lived. For a 100k row result set, this produces ~56MB of intermediate garbage that exists only to be collected.
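To make the overhead concrete, here is a minimal, self-contained model of today's two-step path. The `PersistValue` constructors and the `fromPersistValuesUser` helper below are simplified stand-ins for persistent's real types, not its actual API:

```haskell
-- Toy stand-ins for persistent's types, simplified for illustration only.
data PersistValue
  = PersistText String
  | PersistInt64 Int
  | PersistNull
  deriving (Eq, Show)

data User = User { userName :: String, userAge :: Maybe Int }
  deriving (Eq, Show)

-- Step 1 (backend): wire bytes become one PersistValue per column.
-- Step 2 (entity): every PersistValue is pattern-matched again to build the
-- record, allocating an Either at each step of the applicative chain.
fromPersistValuesUser :: [PersistValue] -> Either String User
fromPersistValuesUser [nameV, ageV] =
    User <$> asText nameV <*> asMaybeInt ageV
  where
    asText (PersistText t)      = Right t
    asText v                    = Left ("expected PersistText, got " ++ show v)
    asMaybeInt PersistNull      = Right Nothing
    asMaybeInt (PersistInt64 n) = Right (Just n)
    asMaybeInt v                = Left ("expected PersistInt64, got " ++ show v)
fromPersistValuesUser vs =
    Left ("expected 2 columns, got " ++ show (length vs))

main :: IO ()
main = print (fromPersistValuesUser [PersistText "alice", PersistInt64 30])
```

Every successful decode still pays for the `PersistValue` boxes, the list, and the `Right` wrappers; the direct path described below removes all three.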
+
+Encoding has a symmetric problem: `toPersistFields` converts each record field into a `PersistValue`, then the backend immediately inspects it to produce the wire format. Every insert and update pays this cost.
+
+### Limited type vocabulary
+
+`PersistValue` has a fixed set of constructors that cannot represent backend-specific types natively. This representation was chosen back in the 2010s, when the most common backends were MySQL and PostgreSQL and the most common use cases were simple CRUD operations. Nowadays, production-grade applications often need richer types, frequently with database-specific support. To support these types today, we have to shoehorn them into the existing constructors, losing type safety and precision. Examples from the PostgreSQL backend include:
+
+- **JSON/JSONB**: stored as `PersistByteString`, losing the distinction between JSON and raw bytes
+- **UUID**: stored as `PersistLiteralEscaped` with hex-encoded bytes, requiring a text-to-UUID conversion on each read
+- **Intervals**: stored as `PersistRational`, losing semantic meaning (PostgreSQL's `interval` has year/month/day/time components that `Rational` cannot faithfully represent)
+- **Composite types**: PostgreSQL composite values have no `PersistValue` representation at all
+- **Inet/cidr**: IP address types round-trip through `PersistLiteralEscaped`
+- **Arrays of custom types**: `PersistList`/`PersistArray` elements must themselves be `PersistValue`, so arrays of domain types require double conversion
+- **Enums**: backend enums are sent as text with no type safety at the Haskell boundary
+
+Furthermore, heavy use of the custom type support makes it harder to improve the underlying library: for example, we could switch to the binary protocol that PostgreSQL supports, but custom values would all need to be converted to the binary protocol, or at least audited to ensure they are compatible.
+
+This means that nearly any change to the underlying library constitutes a breaking change for all users of the library, which is unsustainable in the long term.
+
+With the proposed `FieldDecode`/`FieldEncode` classes, backend packages can provide instances for any Haskell type directly, and the type mapping is no longer constrained by `PersistValue`'s fixed vocabulary. This gives us significant flexibility to improve the underlying library without introducing breaking changes for users.
+
+### Runtime type checks
+
+`fromPersistValue` performs runtime pattern matching on the `PersistValue` constructor at every field– effectively a type check that the schema already guarantees at compile time. The direct path eliminates this redundancy: TH generates code that calls the correct decoder for each field's statically-known type.
+
+### Extensibility without breaking changes
+
+When a database adds a new type (or a backend author wants to support an existing type more faithfully), persistent's current architecture requires either adding a constructor to `PersistValue` (a breaking change to every backend and every `PersistField` instance) or cramming the value into an existing constructor like `PersistLiteralEscaped` (lossy, with no type safety).
+
+With `FieldDecode`/`FieldEncode`, a backend package can add support for any new type by publishing a new instance: without touching persistent core, without coordinating with other backends, and without breaking any downstream users. A PostgreSQL backend could add `FieldDecode PgRowEnv HStore` or `FieldDecode PgRowEnv TSVector` the day PostgreSQL ships them. Users who don't reference these types are completely unaffected.
+
+### Stable SQL for prepared statements and pipelining
+
+Today, persistent generates bulk inserts by expanding `INSERT INTO t VALUES (?,?,?), (?,?,?), ...` with one bind parameter per value.
For 10k rows of 10 columns, that's 100k bind parameters and a SQL string whose length changes with every batch– defeating prepared statement caching and forcing the database to re-plan the query each time. + +Similarly, `WHERE id IN (?,?,?,...)` expands to a variable number of bind parameters. + +The direct encode path opens the door to column-oriented encoding: a single fixed SQL template like `INSERT ... SELECT * FROM UNNEST($1, $2, ...)` where each parameter is an array of all values for one column, or `WHERE id = ANY($1)` with a single array parameter. The SQL string stays the same regardless of how many rows or values are involved, which is a prerequisite for proper prepared statement reuse.[^pipeline] + +[^pipeline]: Fixed SQL templates are also a prerequisite for prepared statements, which parse and plan the SQL once and execute it many times. Variable-length SQL strings change with every batch size, so they can never be prepared: every call is a cold parse and plan. Prepared statements in turn enable *protocol-level pipelining*: sending multiple execute messages to the database without waiting for each response before sending the next (see PostgreSQL's [pipeline mode](https://www.postgresql.org/docs/current/libpq-pipeline-mode.html)). Pipeline mode doesn't require prepared statements, but unless prepared statements can be run with a consistent set of parameters, you can't use them effectively in pipeline mode since you have to create a huge number of them. + +## Design: Decoding + +### Core abstraction: `RowReader` + `FieldDecode` + `FromRow` + +I propose introducing three new types/classes in persistent core, all backend-agnostic: + +```haskell +-- CPS column-cursor monad +newtype RowReader env a = RowReader + { unRowReader + :: forall r. env -> Counter + -> (Text -> IO r) -- on error + -> (a -> IO r) -- on success + -> IO r + } + +-- Per-field direct decoding, split into prepare + run. 
+class FieldDecode env a where
+  -- | Inspect column metadata once per result set, returning a
+  -- specialized runner that can decode every row without
+  -- re-checking types.
+  prepareField
+    :: env -> FieldNameDB -> Int
+    -> (Text -> IO r) -> (FieldRunner env a -> IO r) -> IO r
+
+-- | The product of 'prepareField'. Runs on each row with only the
+-- row-varying data (row index, raw bytes, etc.): column type
+-- dispatch has already happened.
+newtype FieldRunner env a = FieldRunner
+  { runField :: forall r. env -> (Text -> IO r) -> (a -> IO r) -> IO r }
+
+-- Per-entity direct decoding. TH generates instances.
+class FromRow env a where
+  rowReader :: RowReader env a
+
+  -- | Resolve column types once per result set, producing a
+  -- 'RowDecoder'. The default falls back to running 'rowReader'
+  -- on every row; TH-generated and tuple instances override it.
+  prepareRow
+    :: env -> Counter
+    -> (Text -> IO r) -> (RowDecoder env a -> IO r) -> IO r
+```
+
+`env` is an opaque, backend-specific type that carries whatever the backend needs to read a column: a result pointer and row index for SQL databases, a BSON document for MongoDB, a key-value list for Redis, etc.
+
+The key insight behind the prepare/run split is that within a single result set, the column types are fixed: every row has the same OIDs (PostgreSQL), the same field types (MySQL), or the same BSON structure (MongoDB). `FromRow` exposes a `prepareRow` method that runs `prepareField` for each column once per result set, producing a `RowDecoder` that captures the resolved `FieldRunner`s in a closure. The per-row loop then calls only `runField`– no OID dispatch, no vector index, no branching on column types.
+
+```haskell
+-- Produced by prepareRow: no Counter needed, column positions
+-- are baked into the captured FieldRunners.
+newtype RowDecoder env a = RowDecoder
+  { runRowDecoder :: forall r. env -> (Text -> IO r) -> (a -> IO r) -> IO r }
+```
+
+The conduit loop prepares once against the first row's metadata, then reuses the decoder:
+
+```haskell
+ctr <- newCounter
+decoder <- prepareRow metaEnv ctr onErr pure -- runs prepareField × N columns
+-- per-row loop: only runField, zero dispatch
+go decoder row rowCount = ...
+ val <- runRowDecoderCPS decoder (PgRowEnv ret row colTypes cache) onErr pure + yield val +``` + +`FromRow` instances built from `nextField` (via `rowReader`) still work– the default `prepareRow` falls back to calling `rowReader` per row. TH-generated instances and the stock tuple instances provide real `prepareRow` implementations that capture the runners. + +`FieldDecode` takes both a `FieldNameDB` (for document stores that look up fields by name) and an `Int` column index (for SQL backends that read by position). Each backend uses whichever access pattern is natural and ignores the other. + +### Why CPS? + +A naive `IO (Either Text a)` return type allocates a `Right` constructor at every step in the applicative chain, even on the success path where the error case is never used. With CPS, the success continuation is passed directly and the `Either` is never constructed: + +```haskell +RowReader ff <*> RowReader fa = RowReader $ \env ctr onErr onOk -> + ff env ctr onErr $ \f -> + fa env ctr onErr $ \a -> + onOk (f a) +``` + +Both `prepareField` and `FieldRunner` are CPS too, so the backend's decoder feeds its result directly into the continuation. Callers use `runRowReaderCPS` to supply their success and failure actions without materializing unnecessary values: + +```haskell +val <- runRowReaderCPS rowReader env ctr + (throwIO . PersistMarshalError) -- on error: throw directly + pure -- on success: return value directly +yield val +``` + +The entire path from wire format to `yield` produces zero `Either` constructors and zero `PersistValue` wrappers. 
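The shape of this machinery can be checked with a toy, self-contained version: columns are plain strings, `env` and the `Counter` collapse into an `IORef` cursor, and `nextWith` stands in for the real `nextField`. None of these names are persistent's actual API:

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.IORef

-- Toy env: the row is a list of raw column strings, the cursor an IORef index.
data Env = Env { envCols :: [String], envCursor :: IORef Int }

newtype RowReader a = RowReader
  { runRowReader :: forall r. Env -> (String -> IO r) -> (a -> IO r) -> IO r }

instance Functor RowReader where
  fmap f (RowReader g) = RowReader $ \env onErr onOk -> g env onErr (onOk . f)

instance Applicative RowReader where
  pure a = RowReader $ \_ _ onOk -> onOk a
  RowReader ff <*> RowReader fa = RowReader $ \env onErr onOk ->
    ff env onErr $ \f ->   -- success feeds straight into the next step:
      fa env onErr $ \a -> -- no Either constructor is ever built
        onOk (f a)

-- Read the next raw column and decode it with the given parser.
nextWith :: (String -> Either String a) -> RowReader a
nextWith parse = RowReader $ \(Env cols cur) onErr onOk -> do
  i <- readIORef cur
  writeIORef cur (i + 1)
  case drop i cols of
    []      -> onErr "ran off the end of the row"
    (c : _) -> either onErr onOk (parse c)

data User = User String Int deriving (Eq, Show)

userReader :: RowReader User
userReader = User <$> nextWith Right <*> nextWith readInt
  where
    readInt s = case reads s of
      [(n, "")] -> Right n
      _         -> Left ("not an int: " ++ s)

main :: IO ()
main = do
  cur <- newIORef 0
  runRowReader userReader (Env ["alice", "30"] cur) fail print
```

The real `RowReader` threads a backend `env` and an unboxed counter instead of this toy `Env`, but the continuation plumbing is the same.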
+ +### TH generates `FromRow` alongside `PersistEntity` + +For an entity `User { userName :: Text, userAge :: Maybe Int }`, `mkPersist` generates both the existing `PersistEntity` instance (unchanged) and a new `FromRow` instance: + +```haskell +instance (FieldDecode env Text, FieldDecode env (Maybe Int)) + => FromRow env User where + rowReader = User + <$> nextField (FieldNameDB "name") + <*> nextField (FieldNameDB "age") +``` + +Under the hood, `nextField` calls `prepareField` on the first row and caches the resulting `FieldRunner`, then applies `runField` for each subsequent row. The instance is polymorphic over `env`– the `FieldDecode` constraints resolve when a concrete backend is chosen, and the entity definition itself remains backend-agnostic. + +### Backend implementations + +Each backend provides its own `env` type and `FieldDecode` instances. The same TH-generated `FromRow` code works across all of them without modification. + +#### PostgreSQL + +I've got a prototype implementation that fully passes all tests for persistent and esqueleto. The `env` environment value wraps a `PGresult` handle, a row index, a vector of column OIDs classified into a `PgType` ADT, and an `OidCache` for resolving dynamically-assigned OIDs (composites, enums, domains). `FieldDecode` instances inspect the `PgType` once during `prepareField` and return a `FieldRunner` that calls the appropriate `postgresql-binary` decoder directly, without re-checking the OID on each row. Instances cover `Bool`, `Int16`-`Int64`, `Int`, `Double`, `Scientific`, `Rational`, `Text`, `ByteString`, `Day`, `TimeOfDay`, `UTCTime`, and `Maybe a`. Backend-specific types like `UUID`, `IPRange`, `DiffTime`, `Value` (JSON), and composite types can be added as further instances without any changes to persistent core. Compound types (arrays, composites) are handled by a value-level `PgDecode`/`PgEncode` layer that composes through the binary wire format (see "Compound types" below). 
+ +```haskell +-- Sketch (simplified from the prototype): +instance FieldDecode PgRowEnv Text where + prepareField env _ col onErr onOk = do + pgType <- columnType env col + case pgType of + Scalar PgText -> onOk (FieldRunner $ \env' onErr' onOk' -> readBytes env' col >>= decodeWith textDecoder onErr' onOk') + Scalar PgVarchar -> onOk (FieldRunner $ \env' onErr' onOk' -> readBytes env' col >>= decodeWith textDecoder onErr' onOk') + _ -> onErr ("type mismatch: expected text, got " <> show pgType) +``` + +The `case pgType of` branch runs once per result set via `prepareRow`. On subsequent rows, only the captured `FieldRunner` executes– no OID lookup, no branching on column types. + +#### SQLite + +The environment wraps a `Sqlite.Statement`. SQLite's dynamic typing means the column type can technically vary per row, so `prepareField` here is lightweight, but it still validates that the column index is in range and captures the statement reference, keeping the per-row `FieldRunner` simple. + +```haskell +instance FieldDecode SqliteRowEnv Text where + prepareField (SqliteRowEnv stmt) _ col onErr onOk = + onOk $ FieldRunner $ \(SqliteRowEnv stmt') onErr' onOk' -> do + ty <- Sqlite.columnType stmt' (colIdx col) + case ty of + Sqlite.NullColumn -> onErr' "unexpected NULL for Text" + _ -> Sqlite.columnText stmt' (colIdx col) >>= onOk' +``` + +#### MySQL + +The environment wraps a vector of `MySQLBase.Field` metadata and a vector of `Maybe ByteString` row data. `prepareField` captures the field metadata once, and the returned `FieldRunner` uses it to call `MySQL.convert` on each row's data without re-reading the metadata. + +```haskell +instance FieldDecode MySQLRowEnv Text where + prepareField env _ col onErr onOk = + let field = mysqlFields env V.! col + in onOk $ FieldRunner $ \env' onErr' onOk' -> + case mysqlRow env' V.! 
col of + Nothing -> onErr' "unexpected NULL for Text" + Just bs -> case MySQL.convert field bs of + Just t -> onOk' t + Nothing -> onErr' "MySQL: cannot convert to Text" +``` + +#### MongoDB + +The environment wraps a BSON `Document`. `FieldDecode` looks up fields by name (ignoring the column index), which is why the class takes both parameters. For MongoDB, the prepare step is essentially a no-op since each document is self-describing, but the split still keeps the interface uniform. + +```haskell +instance FieldDecode MongoRowEnv Text where + prepareField _ name _ onErr onOk = + onOk $ FieldRunner $ \(MongoRowEnv doc) onErr' onOk' -> + case DB.look (unFieldNameDB name) doc of + Just (DB.String t) -> onOk' t + Just DB.Null -> onErr' "unexpected NULL for Text" + Nothing -> onErr' ("missing field: " <> unFieldNameDB name) + _ -> onErr' "expected String" + +-- Distinguishes absent fields from null fields: +instance FieldDecode MongoRowEnv a => FieldDecode MongoRowEnv (Maybe a) where + prepareField env name col onErr onOk = + prepareField env name col onErr $ \inner -> + onOk $ FieldRunner $ \(MongoRowEnv doc) onErr' onOk' -> + case DB.look (unFieldNameDB name) doc of + Nothing -> onOk' Nothing + Just DB.Null -> onOk' Nothing + _ -> runField inner (MongoRowEnv doc) onErr' (onOk' . Just) +``` + +#### Redis + +The environment wraps binary-encoded key-value pairs. Like MongoDB, `FieldDecode` looks up by field name. The prepare step captures the encoded field name to avoid re-encoding it on every lookup. 
+
+```haskell
+instance FieldDecode RedisRowEnv Text where
+  prepareField _ name _ onErr onOk =
+    let key = encodeUtf8 (unFieldNameDB name)
+    in onOk $ FieldRunner $ \(RedisRowEnv pairs) onErr' onOk' ->
+         case V.find (\(k, _) -> k == key) pairs of
+           Just (_, bs) -> case Binary.decode (L.fromStrict bs) of
+             BinPersistText t -> onOk' t
+             _ -> onErr' "expected Text"
+           Nothing -> onErr' ("missing field: " <> unFieldNameDB name)
+```
+
+#### How `FromRow` unifies all backends
+
+The same TH-generated instance resolves to different concrete code depending on the `env` type. The prepare/run split means that backends with uniform column types (PostgreSQL, MySQL) get the type dispatch out of the hot loop, while document stores (MongoDB, Redis) still benefit from the uniform interface even though their "prepare" step is lighter.
+
+| `env` | What `prepareField` inspects | What `FieldRunner` reads from |
+|-------|------------------------------|-------------------------------|
+| PostgreSQL | column OID + `OidCache` for composites/enums | binary row data via `postgresql-binary` decoders |
+| SQLite | (lightweight– validates column index) | `sqlite3_column_*` C API per row |
+| MySQL | `MySQLBase.Field` metadata | `Maybe ByteString` row data + `MySQL.convert` |
+| MongoDB | (no-op– documents are self-describing) | `DB.look` on BSON `Document` (by field name) |
+| Redis | encodes field name to `ByteString` | `Binary.decode` from key-value pair |
+
+The entity definition contains no backend-specific code, and `PersistValue` appears nowhere in the decode path.
+
+### Specialization: eliminating dictionary overhead
+
+The `FromRow` and `FieldDecode` instances are polymorphic over `env`, which means that in the general case GHC passes typeclass dictionaries at runtime: an indirect function call per field, per row. For a 10-column entity over 100k rows, that's a million indirect calls that could be direct invocations instead.
+
+The standard fix is `SPECIALIZE` pragmas.
Backend packages can emit specializations for their concrete `env` type, and we will extend TH to generate them automatically from the `MkPersistSettings`.
+
+To achieve this, `mkPersist` gains a new configuration field, `mpsDirectEnvTypes :: [Name]`, listing the environment types that consumers want specialized. When this list contains `''PgRowEnv`, the generated `FromRow` instance for an entity `User` looks like:
+
+```haskell
+instance (FieldDecode env Text, FieldDecode env (Maybe Int))
+    => FromRow env User where
+  rowReader = User <$> nextField "name" <*> nextField "age"
+  {-# SPECIALIZE instance FromRow PgRowEnv User #-}
+```
+
+The pragma is placed inside the body of the instance declaration. When GHC sees it, it creates a monomorphic copy of `rowReader @PgRowEnv @User`, and it can then inline all the `FieldRunner` calls and eliminate the dictionary indirection entirely. The per-row loop collapses to a straight sequence of concrete decoder calls with no polymorphism left at runtime.
+
+Because the list lives in `MkPersistSettings`, code that uses persistent against several different backends from a single module (e.g. one module that needs both `PgRowEnv` and `SqliteRowEnv`) can set `mpsDirectEnvTypes = [''PgRowEnv, ''SqliteRowEnv]` and TH will emit specializations for both. Backends that don't appear in the list get no specialized code, but the instance remains polymorphic and continues to compile normally.
+
+This matters most for the `FieldRunner` closures produced by `prepareField`. After specialization, GHC can see through the closure and inline the decoder body directly into the row-reading loop, which in turn enables further optimizations like unboxing intermediate results and eliminating redundant null checks across adjacent fields. Without specialization, the closure is opaque to the optimizer because its concrete type isn't known at the call site.
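For intuition about what the pragma buys, here is the same mechanism on a stand-alone example. Nothing here is persistent-specific; the `Decode` class and `decodeTwice` function are made up for illustration:

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}

-- A two-parameter class shaped like FieldDecode: env plus result type.
class Decode env a where
  decode :: env -> Either String a

instance Decode String Int where
  decode s = case reads s of
    [(n, "")] -> Right n
    _         -> Left ("not an int: " ++ s)

-- Polymorphic caller: without specialization GHC passes a Decode
-- dictionary at runtime and calls `decode` through it.
decodeTwice :: Decode env a => env -> env -> Either String (a, a)
decodeTwice e1 e2 = (,) <$> decode e1 <*> decode e2
-- The pragma asks GHC to compile a monomorphic copy with the instance
-- method inlined, removing the dictionary indirection at this type.
{-# SPECIALIZE decodeTwice :: String -> String -> Either String (Int, Int) #-}

main :: IO ()
main = print (decodeTwice "1" "2" :: Either String (Int, Int))
```

The `SPECIALIZE instance` form used by the generated code is the same idea applied to a whole instance declaration rather than a single function.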
+
+The encode path benefits similarly: `ToRow`'s `toRowBuilder` and each `FieldEncode` instance can be specialized per `Param` type as soon as TH generates `SPECIALIZE` pragmas for the encoding side.
+
+### Query execution
+
+```haskell
+class HasDirectQuery backend where
+  type Env backend
+  directQuerySource
+    :: MonadIO m
+    => backend -> Text -> [PersistValue]
+    -> Acquire (ConduitM () (Env backend) m ())
+
+class HasDirectInsert backend where
+  type Param backend
+  directInsert
+    :: MonadIO m
+    => backend -> Text -> SmallArray (Param backend)
+    -> m ()
+```
+
+Each backend implements `HasDirectQuery` (naming subject to change; the `Direct` prefix just signals that we bypass `PersistValue`) to send a query and yield one `Env backend` per result row. The conduit consumer runs `prepareField` on the first row (or the result metadata) to obtain `FieldRunner`s, then applies them to every subsequent row via `runField`. The type dispatch happens once and the per-row loop is a straight-line decode.
+
+`HasDirectInsert` is the encoding counterpart, accepting pre-encoded parameters and sending them to the database. A backend instance for PostgreSQL, for example, would define `type Param PgBackend = PgParam`, where `PgParam` carries the OID, binary payload, and format code for each parameter.
+
+### User-facing API
+
+To make migration as straightforward as possible, the `.Experimental` modules export the same function names as the originals, with additional constraints that indicate that the direct path should be used:
+
+```haskell
+-- Database.Persist.Sql.Experimental:
+rawQuery :: (FromRow (Env backend) a, HasDirectQuery backend, ...)
+         => Text -> [PersistValue] -> ReaderT backend m (Acquire (ConduitM () a m ()))
+
+rawSql :: (FromRow (Env backend) a, HasDirectQuery backend, ...)
+ => Text -> [PersistValue] -> ReaderT backend m [a] +``` + +### Esqueleto integration + +A `SqlSelectDirect` class parallels esqueleto's `SqlSelect` with a CPS `RowReader`-based decoder: + +```haskell +class SqlSelect a r => SqlSelectDirect a r env where + sqlSelectDirectRow :: RowReader env r +``` + +This class is parameterized by `env` rather than `backend`– the mapping from backend to env happens at the call site via the `Env` type family. This means `SqlSelectDirect` instances are written per environment type, which is the right granularity: the row format depends on the env, not on which backend wrapper is in use. + +Instances for `Entity`, `Value`, `Maybe (Entity)`, and tuples. The `.Experimental.Direct` module exports `select` with the extra constraint– same name, same query DSL: + +```haskell +-- Database.Esqueleto.Experimental.Direct: +import Database.Esqueleto.Experimental.Direct + +users <- select $ do + p <- from $ table @Person + where_ (p ^. PersonAge >=. val 18) + return p +-- Identical syntax. The direct path is chosen by the import, not the function name. +``` + +The `backend` type variable already carried by esqueleto resolves `Env backend` via the associated type family in `HasDirectQuery`, which then satisfies the `SqlSelectDirect a r (Env backend)` constraint. + +## Design: Encoding + +The decode side has a working prototype. The encode side follows the same principles. + +### The problem + +```mermaid +flowchart LR + A["Haskell record"] -->|"toPersistFields"| B["[PersistValue]"] + B -->|"encode per field"| C["wire format"] +``` + +Every field is boxed into `PersistValue` and then immediately unboxed. For bulk inserts of 10k rows × 10 columns, that's 100k unnecessary `PersistValue` allocations. + +### `FieldEncode`: one class, one method + +```haskell +class FieldEncode param a where + encodeField :: a -> param +``` + +`param` is a backend-specific encoded parameter type. 
Each backend decides what `param` looks like: a PostgreSQL backend might use a type carrying an OID, encoded bytes, and a format tag; an SQLite backend might use a sum type mirroring SQLite's type affinity (`SqliteInt !Int64 | SqliteText !Text | ...`).
+
+The class is deliberately minimal: one class, one method. Yet it lets a wide range of backends be supported without breaking changes.
+
+### Alternative: contravariant encoders (hasql-style)
+
+It's worth considering the approach that hasql takes here, which uses contravariant functors to compose encoders:
+
+```haskell
+-- hasql's style:
+userParams :: Params User
+userParams =
+  (userName >$< param (nonNullable text))
+    <> (userAge >$< param (nullable int4))
+```
+
+The `>$<` operator (contramap from `Contravariant`) lets you project a field out of a record and feed it into an encoder, and `<>` sequences them. The result is a single `Params User` value that knows how to encode an entire record in one pass.
+
+This approach is advantageous in two ways:
+
+1. The encoder is a first-class value that can be composed, stored, and reused; you can build an encoder for a composite type by combining encoders for its parts, and the types enforce that every field is accounted for.
+2. It separates the *description* of the encoding from the *execution*, which pairs well with the prepare/run split on the decode side: you could prepare a `Params` once and run it per row.
+
+My main concern is the learning curve: `Contravariant` and `Divisible` are less familiar than `Applicative` to most Haskell developers, and the corpus of good intuition around them is a bit lacking. For a library like persistent, whose user base spans from beginners to experts, that friction matters. I've found hasql's encoding API to be something of a barrier to adopting hasql, even though I'm relatively fluent in Haskell and appreciate the type safety it provides.
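For a concrete reference point, the contravariant style can be reproduced with nothing but `base`: the toy `Params` below renders parameters to strings instead of a wire format, and `>$<` is `contramap` from `Data.Functor.Contravariant` (none of this is hasql's or persistent's real API):

```haskell
import Data.Functor.Contravariant (Contravariant (..), (>$<))
import Data.List (intercalate)

-- Toy encoder: a record goes in, a list of rendered parameters comes out.
newtype Params a = Params { renderParams :: a -> [String] }

instance Contravariant Params where
  contramap f (Params g) = Params (g . f)

-- <> sequences encoders: both run on the same record, outputs concatenate.
instance Semigroup (Params a) where
  Params f <> Params g = Params (\a -> f a ++ g a)

instance Monoid (Params a) where
  mempty = Params (const [])

param :: (a -> String) -> Params a
param render = Params (\a -> [render a])

data User = User { userName :: String, userAge :: Maybe Int }

-- hasql-style composition: project a field with >$<, sequence with <>.
userParams :: Params User
userParams =
     (userName >$< param id)
  <> (userAge  >$< param (maybe "NULL" show))

main :: IO ()
main = putStrLn (intercalate ", " (renderParams userParams (User "alice" (Just 30))))
```

The whole-record encoder is an ordinary value, so it can be built once and reused per row, which is exactly the property that pairs with the prepare/run split.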
+ +That said, if we're already asking users to change imports and adopt new constraints, the marginal cost of learning contravariant composition might be acceptable? Especially since TH can generate the encoders automatically for `Entity` types, meaning most users would only encounter the raw API when writing custom queries. The generated code would look something like: + +```haskell +instance HasEncoder param User where + encoder = + (userName >$< fieldEncoder @Text) + <> (userAge >$< fieldEncoder @(Maybe Int)) +``` + +This is an area where we should probably prototype both approaches and see which one leads to clearer error messages and more natural composition in practice. The simple `FieldEncode` class is easier to explain and implement first, but if we're going to make breaking changes at some point anyway, the contravariant approach might be the better long-term bet. It may also be the case that we can build a small DSL on top of `Contravariant` that doesn't feel as foreign to users, which may give us the best of both worlds. + +### `ToRow`: TH-generated, produces a builder + +```haskell +-- Writes encoded params into a SmallMutableArray, avoiding intermediate lists. +newtype ParamBuilder param = ParamBuilder (SmallMutableArray RealWorld param -> Int -> IO Int) + +instance Monoid (ParamBuilder param) + +writeParam :: FieldEncode param a => a -> ParamBuilder param +buildParams :: Int -> ParamBuilder param -> IO (SmallArray param) +``` + +TH generates: + +```haskell +instance (FieldEncode param Text, FieldEncode param (Maybe Int)) + => ToRow param User where + toRowBuilder (User name age) = writeParam name <> writeParam age +``` + +Usage: + +```haskell +params <- buildParams 2 (toRowBuilder user) +-- params :: SmallArray param, ready for the backend +``` + +### Typed query parameters + +`FieldEncode` also gives us typed query parameters without routing through `[PersistValue]`: + +```haskell +rawQueryDirectTyped @(Entity User) + "SELECT ?? 
FROM user WHERE age > $1 AND name LIKE $2" + (writeParam (18 :: Int) <> writeParam ("A%" :: Text)) +``` + +Each Haskell value goes directly to the backend's encoded format, skipping `toPersistValue` entirely. + +### Column-oriented encoding: UNNEST and = ANY + +The direct encode path is designed to support column-oriented parameter encoding, which enables two important patterns: + +**Bulk inserts via UNNEST**: instead of `INSERT INTO t VALUES (?,?,?), (?,?,?), ...` with a dynamic SQL string, use `INSERT INTO t (c1, c2, ...) SELECT * FROM UNNEST($1, $2, ...)` where each `$N` is an array containing all values for that column across all rows. The SQL template is fixed regardless of batch size, enabling prepared statement reuse. + +**IN-clause via = ANY**: instead of `WHERE id IN (?,?,?,...)` with a variable number of parameters, use `WHERE id = ANY($1)` with one array parameter. Again, the SQL is fixed. + +Both patterns require encoding a collection of Haskell values directly into the database's binary array format: + +```haskell +class FieldEncodeArray param a where + encodeColumnArray :: Vector a -> param +``` + +TH can generate a columnar encoder that transposes a `Vector record` into per-column arrays and encodes each one directly: + +```haskell +class ToRowColumnar param a where + toColumnarBuilder :: Vector a -> ParamBuilder param + +instance (FieldEncodeArray param Text, FieldEncodeArray param (Maybe Int)) + => ToRowColumnar param User where + toColumnarBuilder users = + writeParam (encodeColumnArray (V.map userName users)) + <> writeParam (encodeColumnArray (V.map userAge users)) +``` + +For 10k rows × 10 columns, this means that instead of 100k `PersistValue` objects transposed into `PersistArray` lists, the direct path builds 10 binary column arrays from the record fields with no intermediate representation. 
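The transpose step itself is ordinary code. A toy version, with plain lists and a textual array rendering standing in for `Vector` and the binary array encoder, shows the row-major to column-major shape:

```haskell
import Data.List (intercalate)

data User = User { userName :: String, userAge :: Maybe Int }

-- Stand-in for a backend's encoded array parameter: one rendered
-- array literal per column, however many rows the batch has.
encodeColumnArray :: (a -> String) -> [a] -> String
encodeColumnArray render xs = "{" ++ intercalate "," (map render xs) ++ "}"

-- Row-major input, column-major output: the parameter count depends
-- only on the column count, so the SQL template never changes.
toColumnarParams :: [User] -> [String]
toColumnarParams users =
  [ encodeColumnArray id (map userName users)
  , encodeColumnArray (maybe "NULL" show) (map userAge users)
  ]

main :: IO ()
main = mapM_ putStrLn
  (toColumnarParams [User "alice" (Just 30), User "bob" Nothing])
```

A 2-row batch and a 10k-row batch both produce exactly two parameters here, which is what lets the `UNNEST` template be prepared once and reused.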
+
+## Allocation comparison (decode path)
+
+For a 10-column entity, per row:
+
+| | `PersistValue` path | Direct CPS path |
+|---|---|---|
+| `PersistValue` objects | 10 | 0 |
+| List cons cells | 10 | 0 |
+| `Either` from applicative chain | ~9 | 0 |
+| `Either` from field decode | 10 | 0 (CPS) |
+| `Either` at boundary | 0 | 0 (`runRowReaderCPS`) |
+| Boxed `Int` (column counter) | 10 | 0 (unboxed counter) |
+| Column type dispatch | 10 per row | 10 once (via `prepareRow`) |
+| **Total intermediate objects** | **~49** | **0** |
+
+For 100k rows, that's roughly 4.9 million intermediate objects eliminated.
+
+## Migration strategy: `.Experimental` modules
+
+Following the precedent set by esqueleto (which introduced `Database.Esqueleto.Experimental` for its new `FROM` syntax), the direct codec API lives in `.Experimental` modules alongside the existing API. Users opt in by changing their imports. The existing modules are unchanged.
+
+### persistent core
+
+| Module | Contents |
+|--------|----------|
+| `Database.Persist.Sql` | Unchanged. `selectList`, `get`, `rawSql`, etc. continue to use `[PersistValue]`. |
+| `Database.Persist.Sql.Experimental` | **New.** Re-exports everything from `Database.Persist.Sql`, plus direct-path variants with additional `FromRow`/`ToRow` constraints. Same names, same signatures– just extra constraints. |
+
+The experimental module provides versions of the standard operations (`selectList`, `get`, `insertMany_`, and so on) that take the direct path when the constraints are satisfied.
+
+**Note on standard operations:** as of this RFC, only the raw query functions (`rawQueryDirect` and `rawSqlDirect`) are implemented. The higher-level operations are planned but not yet implemented; they would live in the experimental modules and give users typed, zero-allocation versions of core persistent operations.
+
+Users switch by changing one import:
+
+```haskell
+-- Before:
+import Database.Persist.Sql
+
+-- After (direct path, zero PersistValue):
+import Database.Persist.Sql.Experimental
+```
+
+All existing code continues to work because the experimental signatures are strictly more constrained: they accept a subset of the original callers (those whose backend supports direct codecs). If a backend doesn't have `HasDirectQuery`/`FromRow` instances, the experimental variants simply won't type-check, and the user stays on the original import.
+
+### esqueleto
+
+| Module | Contents |
+|--------|----------|
+| `Database.Esqueleto.Experimental` | Unchanged. |
+| `Database.Esqueleto.Experimental.Direct` | **New.** Re-exports everything from `Database.Esqueleto.Experimental`, plus `selectDirect`, `selectOneDirect`, etc. with `SqlSelectDirect` constraints. |
+
+```haskell
+-- Database.Esqueleto.Experimental (unchanged):
+select :: (SqlSelect a r, MonadIO m, SqlBackendCanRead backend)
+       => SqlQuery a -> ReaderT backend m [r]
+
+-- Database.Esqueleto.Experimental.Direct (new):
+select :: ( SqlSelect a r, SqlSelectDirect a r (Env backend)
+          , MonadIO m, SqlBackendCanRead backend, HasDirectQuery backend )
+       => SqlQuery a -> ReaderT backend m [r]
+```
+
+We keep the same function name, same return type, same query DSL: the only difference is the extra constraints ensuring the direct path is used. The one place where users do have to do some work is converting their custom type instances to the new direct encoding/decoding APIs.
+
+### Backend packages
+
+Each backend package that supports direct codecs exports its `env` type from the primary existing module, as well as its `FieldDecode`/`FieldEncode` instances. These don't need their own `.Experimental` modules: the instances are always available, and they are picked up via normal instance resolution when the user uses the experimental persistent/esqueleto modules.
+
+### Graduation path
+
+Once the direct path is proven stable:
+
+1. The `.Experimental` signatures move into the main modules. The extra constraints are additive: they narrow the accepted backends but don't change behavior for backends that satisfy them.
+2. Backends that don't support direct codecs continue to work via the `[PersistValue]` path.
+3. Eventually, `PersistValue`-based operations could be deprecated in favor of the direct path.
+
+This mirrors esqueleto's migration of its `FROM` syntax from `.Experimental` to the recommended default.
+
+## Relationship to `SqlBackend`
+
+A key design tension that I think requires some discussion: much of the persistent ecosystem (esqueleto's `rawSelectSource`, persistent's default `PersistQueryRead` implementation, user code using `SqlPersistT`) projects down to bare `SqlBackend`, erasing the concrete backend type. The `HasDirectQuery backend` constraint needs to reduce `Env backend` to a concrete type, but `SqlBackend` doesn't correspond to any particular backend, so `Env SqlBackend` has no meaningful definition for the purposes of this proposal.
+
+There are two approaches I think we could take, and they complement each other to some extent.
+
+### Approach 1: Backend-specific types (recommended for new code)
+
+Users and libraries work with the concrete backend type (`WriteBackend PostgreSQLBackend`, `SqliteBackend`, etc.) instead of `SqlBackend`. The `HasDirectQuery` constraint resolves naturally.
+
+For most _application_ code, the concrete type appears only in the runner (`withPostgresqlPool`, `withSqliteConn`, etc.) and the `ReaderT backend m` type. Switching from `SqlPersistT` to a backend-specific `ReaderT` is a relatively straightforward find-and-replace change.
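+
+As an illustration, that find-and-replace might look like this for a typical application function (the `PostgreSQLBackend` type name and `fetchUsers` are assumptions for the example):
+
+```haskell
+-- Before: backend-erased, direct path unavailable
+fetchUsers :: SqlPersistT IO [Entity User]
+fetchUsers = selectList [] []
+
+-- After: concrete backend type; HasDirectQuery resolves statically
+fetchUsers :: ReaderT PostgreSQLBackend IO [Entity User]
+fetchUsers = selectList [] []
+```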
+
+### Approach 2: A direct-codec field on `SqlBackend`
+
+For code that flows through `SqlBackend` (and there is a lot of it), we'd ideally bridge the gap by adding a dedicated field that carries the direct codec machinery from whatever backend created the connection. `SqlBackend` already works this way for other operations: `connInsertSql`, `connPrepare`, `connInsertManySql`, etc. are all function-valued fields that the backend populates at connection setup time, closing over its own concrete types. The direct path would fit naturally into the same pattern:
+
+```haskell
+data SqlBackend = SqlBackend
+  { ...existing fields...
+  , connDirectCodecs :: Maybe DirectCodecs
+  }
+```
+
+The naive idea is to use a rank-2 type so the caller passes a *polymorphic* `RowReader` and the backend instantiates it at its own concrete `env`:
+
+```haskell
+-- ⚠ THIS DOES NOT TYPECHECK: see below
+data DirectCodecs = DirectCodecs
+  { dcQueryAndDecode
+      :: forall a m. MonadIO m
+      => (forall env. FromRow env a => RowReader env a)
+      -> Text -> [PersistValue]
+      -> Acquire (ConduitM () a m ())
+  }
+```
+
+#### Why the rank-2 approach fails
+
+The argument `(forall env. FromRow env a => RowReader env a)` is fine from the *caller's* perspective: the TH-generated `rowReader` class method has exactly this type (polymorphic in `env`, constrained by `FromRow env a`). The caller can provide it.
+
+The problem is on the *consumption* side. The backend's implementation needs to instantiate `env` at its concrete type:
+
+```haskell
+mkPgDirectCodecs :: PgConnection -> DirectCodecs
+mkPgDirectCodecs conn = DirectCodecs
+  { dcQueryAndDecode = \polyReader sql params -> do
+      let reader = polyReader @PgRowEnv  -- needs FromRow PgRowEnv a
+      src <- pgQuerySource conn sql params
+      pure $ src .| decodeRowsConduit reader
+  }
+```
+
+When the backend writes `polyReader @PgRowEnv`, GHC needs to discharge the constraint `FromRow PgRowEnv a`. But `a` is a rigid type variable from the outer `forall a`.
The backend's lambda must work for *all* `a`, and GHC has no instance `FromRow PgRowEnv a` that covers every possible type. The fact that `FieldDecode PgRowEnv Text` etc. are "in scope" doesn't help; GHC can't perform instance resolution for `FromRow PgRowEnv a` without knowing what `a` is. This produces: + +``` +Could not deduce (FromRow PgRowEnv a) arising from a use of 'polyReader' +``` + +This is a fundamental tension: you can't erase both `env` (via the closure in `SqlBackend`) and `a` (via `forall a`) while still relying on typeclass instance resolution to connect them. The dictionary for something like `FromRow PgRowEnv User` exists at each specific call site, but nobody at the backend's definition site can work with it. + +#### Possible directions + +Several alternative encodings were considered, but each has significant drawbacks: + +**Parameterize `DirectCodecs` by `env`:** If `DirectCodecs env` carries the env type, the backend can accept `RowReader env a` directly. But `SqlBackend` would need to hold `DirectCodecs env` for some `env`, meaning either `SqlBackend` becomes parameterized (a massive breaking change to the entire ecosystem) or the `env` is hidden behind an existential... at which point the caller can't construct a `RowReader` for an env it doesn't know. + +**Compile the decoder at the call site:** The call site knows both `a` and the backend, so it *could* resolve all constraints and produce a fully monomorphic decoding function to pass into `DirectCodecs`. But this requires the call site to know the concrete env type, which means it already has the concrete backend type– and if it has the concrete backend type, it doesn't need the `SqlBackend` bridge. + +**Add a `FromRow` constraint to `dcQueryAndDecode`:** This just pushes the problem: `DirectCodecs` is stored in `SqlBackend` where `a` isn't in scope, so there's nowhere to put the constraint. 
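+
+For readers who want to see the failure in isolation: it reproduces with only `base`, independent of persistent. The class and function names below are illustrative stand-ins, and the snippet intentionally does not compile.
+
+```haskell
+{-# LANGUAGE MultiParamTypeClasses, RankNTypes #-}
+
+-- Stand-in for FromRow: a two-parameter class connecting env and a.
+class FromRowLike env a where
+  readerLike :: env -> a
+
+-- The caller-side polymorphic argument is fine; the consumer is not.
+-- Instantiating it at a concrete env (here ()) needs an instance for
+-- the rigid, outer-quantified 'a':
+consume :: (forall env. FromRowLike env a => env -> a) -> a
+consume poly = poly ()
+-- error: Could not deduce (FromRowLike () a)
+--        arising from a use of 'poly'
+```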
+ +#### Solution: `DirectEntity` with `Typeable`-based dispatch + +The key insight is that the *entity itself* can carry the knowledge of which `env` types it supports– doing at the term level what `PersistEntityBackend` does at the type level. The mechanism is `Typeable` + `eqTypeRep`. + +**A new class: the entity advertises its decoders** + +```haskell +class DirectEntity a where + lookupDirectDecoder + :: forall env. Typeable env + => Proxy env -> Maybe (RowDecoder env a) +``` + +TH generates instances using `eqTypeRep` to test the existentially-hidden `env` at runtime. When `mpsDirectEnvTypes = [''PgRowEnv]`: + +```haskell +instance DirectEntity User where + lookupDirectDecoder :: forall env. Typeable env + => Proxy env -> Maybe (RowDecoder env User) + lookupDirectDecoder _ = + case eqTypeRep (typeRep @env) (typeRep @PgRowEnv) of + Just HRefl -> Just $ RowDecoder $ \env' onErr onOk -> do + ctr <- newCounter + prepareRow env' ctr onErr $ \decoder -> + runRowDecoder decoder env' onErr onOk + Nothing -> Nothing +``` + +When `HRefl` matches, GHC learns `env ~ PgRowEnv` in that branch and can discharge `FromRow PgRowEnv User` via the polymorphic instance (since the `FieldDecode PgRowEnv` instances are in scope from the backend import). + +For applications that target multiple backends, `mpsDirectEnvTypes` lists all of them and TH chains the `eqTypeRep` cases: + +```haskell +-- mpsDirectEnvTypes = [''PgRowEnv, ''SqliteRowEnv] +instance DirectEntity User where + lookupDirectDecoder :: forall env. 
Typeable env + => Proxy env -> Maybe (RowDecoder env User) + lookupDirectDecoder _ = + case eqTypeRep (typeRep @env) (typeRep @PgRowEnv) of + Just HRefl -> Just $ RowDecoder $ \env' onErr onOk -> do + ctr <- newCounter + prepareRow env' ctr onErr $ \d -> runRowDecoder d env' onErr onOk + Nothing -> case eqTypeRep (typeRep @env) (typeRep @SqliteRowEnv) of + Just HRefl -> Just $ RowDecoder $ \env' onErr onOk -> do + ctr <- newCounter + prepareRow env' ctr onErr $ \d -> runRowDecoder d env' onErr onOk + Nothing -> Nothing +``` + +Each branch independently resolves its `FromRow` constraint– the PostgreSQL branch needs `FieldDecode PgRowEnv Text` etc. in scope, the SQLite branch needs `FieldDecode SqliteRowEnv Text` etc. Both are available from their respective backend imports. The fallback `Nothing` is returned for any `env` type not in the list. + +At runtime, the `SqlBackend` carries a `DirectQueryCap` from whichever backend created the connection. A PostgreSQL connection stores `MkDirectQueryCap (Proxy @PgRowEnv) ...`, a SQLite connection stores `MkDirectQueryCap (Proxy @SqliteRowEnv) ...`. When `lookupDirectDecoder` runs, exactly one branch matches and produces the decoder; the others are never entered. + +**`SqlBackend` stores an existential capability** + +```haskell +data DirectQueryCap where + MkDirectQueryCap + :: forall env. Typeable env + => !(Proxy env) + -> (forall a. 
RowDecoder env a -> Text -> [PersistValue] -> IO [a]) + -> DirectQueryCap +``` + +Each backend populates this at connection setup with its own concrete `env`: + +```haskell +-- PostgreSQL backend +mkPgDirectQueryCap :: PgConnection -> DirectQueryCap +mkPgDirectQueryCap conn = MkDirectQueryCap (Proxy @PgRowEnv) $ + \decoder sql params -> do + ret <- pgExecParams conn sql params + decodeAllRows decoder ret -- builds PgRowEnv per row + +-- SQLite backend +mkSqliteDirectQueryCap :: Sqlite.Connection -> DirectQueryCap +mkSqliteDirectQueryCap conn = MkDirectQueryCap (Proxy @SqliteRowEnv) $ + \decoder sql params -> do + stmt <- Sqlite.prepare conn sql + bindParams stmt params + decodeAllRows decoder stmt -- builds SqliteRowEnv per row +``` + +The backend's lambda receives a `RowDecoder env a` with `env` already fixed to its concrete type. It doesn't need `FromRow PgRowEnv a` or `FromRow SqliteRowEnv a`– it already has the fully-resolved decoder. It just constructs `env` values and feeds them in. + +**Call site: everything lines up** + +```haskell +rawSqlDirectCompat + :: forall record m. 
(DirectEntity record, MonadIO m) + => Text -> [PersistValue] -> ReaderT SqlBackend m (Maybe [record]) +rawSqlDirectCompat sql pvs = do + backend <- ask + case connDirectQueryCap backend of + Nothing -> pure Nothing -- backend doesn't support direct path + Just (MkDirectQueryCap (proxy :: Proxy env) queryFn) -> + case lookupDirectDecoder @record proxy of + Nothing -> pure Nothing -- entity doesn't support this env + Just decoder -> liftIO $ Just <$> queryFn decoder sql pvs +``` + +Type trace: + +- Pattern-matching `MkDirectQueryCap` binds existential `env` with `Typeable env` +- `lookupDirectDecoder @record proxy` tests `env` against known envs via `eqTypeRep` +- If it matches, `decoder :: RowDecoder env record` with the same existential `env` +- `queryFn :: RowDecoder env a -> ...`– same `env` +- `queryFn decoder sql pvs` typechecks; `a` unifies with `record` + +The dictionary resolution happens inside `lookupDirectDecoder` (at the entity's definition site, where the `FieldDecode` instances are in scope), not at the backend's definition site. That asymmetry is what makes this work where the rank-2 approach fails. + +**Composability for tuples and `Entity`** + +```haskell +instance (DirectEntity a, DirectEntity b) => DirectEntity (a, b) where + lookupDirectDecoder proxy = do + da <- lookupDirectDecoder @a proxy + db <- lookupDirectDecoder @b proxy + pure $ RowDecoder $ \env onErr onOk -> + runRowDecoder da env onErr $ \va -> + runRowDecoder db env onErr $ \vb -> + onOk (va, vb) + +instance (DirectEntity (Key record), DirectEntity record) + => DirectEntity (Entity record) where + lookupDirectDecoder proxy = do + dk <- lookupDirectDecoder @(Key record) proxy + dv <- lookupDirectDecoder @record proxy + pure $ RowDecoder $ \env onErr onOk -> + runRowDecoder dk env onErr $ \k -> + runRowDecoder dv env onErr $ \v -> + onOk (Entity k v) +``` + +`Maybe` short-circuits the composition: if either inner `lookupDirectDecoder` returns `Nothing` (e.g. 
because one component doesn't support this backend's `env`), the whole tuple/entity returns `Nothing`. + +**Concrete backends bypass this entirely** + +The `Typeable` mechanism is strictly the `SqlBackend` compatibility bridge. With a concrete backend type, normal static dispatch applies: + +``` +ReaderT PgBackend m a → static FromRow (Env PgBackend) record + no Typeable, no Maybe, full specialization + +ReaderT SqlBackend m a → DirectEntity record + Typeable-based lookupDirectDecoder + graceful fallback if env doesn't match +``` + +There is no overlap: `SqlBackend` and concrete backends are different types. Library code (esqueleto, persistent's default `PersistQueryRead` impl) that works with `SqlBackend` gets the dynamic path. Application code that uses the concrete backend type gets the static path. Both share the same TH-generated `FromRow` instances and the same `FieldRunner` hot loop– the only difference is how the initial decoder is resolved. + +**Cost** + +- One `eqTypeRep` call per query setup (fingerprint comparison, negligible) +- `Typeable env` constraint on the backend's env type (trivially derivable, already holds for any concrete type in GHC) +- `lookupDirectDecoder` returns `Nothing` if the entity wasn't generated for that backend's env → graceful fallback to `PersistValue` path +- Default `lookupDirectDecoder _ = Nothing` keeps full backward compatibility + +**What TH generates** + +For `mpsDirectEnvTypes = [''PgRowEnv, ''SqliteRowEnv]`, a single `mkPersist` call emits: + +```haskell +-- Polymorphic: works for both the concrete-backend and SqlBackend paths +instance (FieldDecode env Text, FieldDecode env (Maybe Int)) + => FromRow env User where + rowReader = User <$> nextField "name" <*> nextField "age" + prepareRow env ctr onErr onOk = do + col0 <- advanceCounter ctr + prepareField env "name" col0 onErr $ \r0 -> do + col1 <- advanceCounter ctr + prepareField env "age" col1 onErr $ \r1 -> + onOk $ RowDecoder $ \env' onErr' onOk' -> + runField r0 
env' onErr' $ \v0 ->
+            runField r1 env' onErr' $ \v1 ->
+              onOk' (User v0 v1)
+  {-# SPECIALIZE instance FromRow PgRowEnv User #-}
+  {-# SPECIALIZE instance FromRow SqliteRowEnv User #-}
+
+-- SqlBackend bridge: chains eqTypeRep for each env in mpsDirectEnvTypes
+instance DirectEntity User where
+  lookupDirectDecoder :: forall env. Typeable env
+    => Proxy env -> Maybe (RowDecoder env User)
+  lookupDirectDecoder _ =
+    case eqTypeRep (typeRep @env) (typeRep @PgRowEnv) of
+      Just HRefl -> Just $ RowDecoder $ \env' onErr onOk -> do
+        ctr <- newCounter
+        prepareRow env' ctr onErr $ \d -> runRowDecoder d env' onErr onOk
+      Nothing -> case eqTypeRep (typeRep @env) (typeRep @SqliteRowEnv) of
+        Just HRefl -> Just $ RowDecoder $ \env' onErr onOk -> do
+          ctr <- newCounter
+          prepareRow env' ctr onErr $ \d -> runRowDecoder d env' onErr onOk
+        Nothing -> Nothing
+```
+
+The same entity works with both backends through `SqlBackend`. At runtime, the connection determines which branch fires.
+
+## Compound types: value-level codecs
+
+`FieldDecode` operates at the column level– it gets `env` (with a result pointer and column index) and decodes one result-set column into one Haskell value. This is sufficient for scalar types and embedded entities (which flatten into multiple columns and are handled by `FromRow`'s applicative composition).
+
+However, backends with compound wire-format types need a second layer. PostgreSQL has arrays, composites, ranges, and nested combinations thereof– all encoded as a single column whose binary payload contains structured sub-values. These sub-values are not result-set columns, so `FieldDecode` can't recurse into them.
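+
+As a concrete example of the kind of column this second layer must handle (the schema is illustrative):
+
+```sql
+CREATE TYPE address AS (street text, city text, zip text);
+CREATE TABLE person (id bigserial PRIMARY KEY, addresses address[]);
+
+-- One result-set column; its binary payload is an array whose elements
+-- are composite values. FieldDecode sees only the single column.
+SELECT addresses FROM person;
+```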
+ +### The two-layer architecture + +``` +PostgreSQL binary wire format + │ + ▼ + PgDecode / PgDecoder ← value-level: operates on raw bytes + │ composes for arrays, composites, ranges + │ + ▼ + FieldDecode PgRowEnv ← column-level: wraps PgDecode + │ adds column metadata lookup, prepare/run split + │ + ▼ + FromRow PgRowEnv record ← row-level: sequences FieldDecode across columns + TH-generated, applicative composition +``` + +The value-level layer is backend-specific. Other backends define their own if needed (MongoDB doesn't need one– BSON provides typed access; SQLite doesn't– it has no compound column types). + +### `PgDecoder`: a reader over `OidCache` + +Both encode and decode of compound types need access to an `OidCache`– a mapping from dynamically-assigned PostgreSQL OIDs (composites, enums, domains) to their type metadata, populated at connection time from `pg_type`. + +`PgDecoder` wraps `postgresql-binary`'s `PD.Value` in a reader so the cache is threaded without manual plumbing: + +```haskell +newtype PgDecoder a = PgDecoder { runPgDecoder :: OidCache -> PD.Value a } + +instance Functor PgDecoder +instance Applicative PgDecoder +instance Monad PgDecoder +``` + +`PgComposite` wraps `PD.Composite` the same way: + +```haskell +newtype PgComposite a = PgComposite (OidCache -> PD.Composite a) + +instance Functor PgComposite +instance Applicative PgComposite +``` + +### DSL combinators + +```haskell +pgValue :: PD.Value a -> PgDecoder a -- lift raw decoder +pgComposite :: PgComposite a -> PgDecoder a -- composite → value +pgField :: PgDecoder a -> PgComposite a -- non-nullable sub-field +pgFieldNullable :: PgDecoder a -> PgComposite (Maybe a) -- nullable sub-field +pgArray :: PgDecoder a -> PgDecoder [a] -- array of non-nullable +pgArrayNullable :: PgDecoder a -> PgDecoder [Maybe a] -- array of nullable +``` + +### `PgDecode` class + +```haskell +class PgDecode a where + pgDecoder :: PgDecoder a +``` + +Scalar instances are one-liners wrapping 
`postgresql-binary` decoders:
+
+```haskell
+instance PgDecode Text    where pgDecoder = pgValue PD.text_strict
+instance PgDecode Int64   where pgDecoder = pgValue PD.int
+instance PgDecode Bool    where pgDecoder = pgValue PD.bool
+instance PgDecode UTCTime where pgDecoder = pgValue PD.timestamptz_int
+-- ...etc.
+```
+
+These don't replace the existing `FieldDecode PgRowEnv Text` instances– the scalar `FieldDecode` instances still handle OID dispatch at the column level (accepting `PgText`, `PgVarchar`, `PgBpchar`, etc. for `Text`). `PgDecode` is used only when composing inside compound types.
+
+Arrays get a single generic instance:
+
+```haskell
+instance PgDecode a => PgDecode [a] where
+  pgDecoder = pgArray pgDecoder
+
+instance PgDecode a => PgDecode [Maybe a] where
+  pgDecoder = pgArrayNullable pgDecoder
+```
+
+A composite type like:
+
+```sql
+CREATE TYPE address AS (street text, city text, zip text);
+```
+
+becomes:
+
+```haskell
+instance PgDecode Address where
+  pgDecoder = pgComposite $ Address
+    <$> pgField pgDecoder
+    <*> pgField pgDecoder
+    <*> pgField pgDecoder
+```
+
+And `FieldDecode` for composite columns is a one-liner:
+
+```haskell
+instance FieldDecode PgRowEnv Address where
+  prepareField = compositeFieldDecode  -- provided helper
+```
+
+Arrays of composites, composites containing arrays, nested composites– all compose through `pgDecoder` without duplication:
+
+```haskell
+-- [Address] needs no dedicated code: the generic instance
+--   PgDecode a => FieldDecode PgRowEnv [a]
+-- resolves via the PgDecode Address instance above.
+```
+
+### `PgEncode`: symmetric encode side
+
+```haskell
+newtype PgEncoder a = PgEncoder { runPgEncoder :: OidCache -> a -> PE.Encoding }
+
+class PgEncode a where
+  pgEncoder :: PgEncoder a
+  pgTypeOid' :: OidCache -> Proxy a -> Word32
+```
+
+Scalar instances are pure (ignoring the cache); composite instances use it to look up dynamically-assigned OIDs:
+
+```haskell
+instance PgEncode Text where + pgEncoder = pgConst PE.text_strict + pgTypeOid' _ _ = scalarOidWord32 PgText + +instance PgEncode Address where + pgTypeOid' cache _ = lookupCompositeOid cache "address" + pgEncoder = pgCompositeEncoder $ \cache (Address s c z) -> + pgEncodeField cache s + <> pgEncodeField cache c + <> pgEncodeField cache z +``` + +`pgEncodeField` produces a `PE.Composite` field tagged with the value's OID: + +```haskell +pgEncodeField :: PgEncode a => OidCache -> a -> PE.Composite +``` + +Arrays compose generically: + +```haskell +instance PgEncode a => PgEncode [a] where + pgEncoder = pgArrayEncoder + pgTypeOid' cache _ = lookupArrayOid cache (pgTypeOid' cache (Proxy @a)) +``` + +### OidCache: bridging PostgreSQL's dynamic type system + +`OidCache` is populated once at connection setup from `pg_type`: + +```sql +SELECT oid, typname, typtype, typarray, typrelid +FROM pg_type +WHERE typtype IN ('c', 'e', 'd', 'r') +``` + +This maps dynamic OIDs to structured `PgType` values: + +```haskell +data PgType + = Scalar !PgScalar + | Array !PgScalar + | Composite !Text !Int -- type name, array OID + | CompositeArray !Text -- element type name + | Enum !Text !Int -- type name, array OID + | EnumArray !Text -- element type name + | Unrecognized !Int +``` + +`PgRowEnv` carries the cache so `FieldDecode` instances can validate composite/enum OIDs at prepare time rather than blindly accepting any `Unrecognized` OID: + +```haskell +data PgRowEnv = PgRowEnv + { pgResult :: !LibPQ.Result + , pgRow :: !LibPQ.Row + , pgCols :: !(V.Vector (LibPQ.Column, PgType)) + , pgRowCache :: !OidCache + } +``` + +On the encode side, `PgParam` closes over the cache as a function rather than a flat triple, so composite/enum types can resolve their OIDs at send time without changing `FieldEncode`'s pure signature: + +```haskell +newtype PgParam = PgParam + { materializeParam :: OidCache -> Maybe (LibPQ.Oid, ByteString, LibPQ.Format) } +``` + +The cache is applied once when the 
backend sends parameters to `libpq`. + +### `PersistField` compatibility + +`PersistField` instances are still required for all entity field types. The `PersistValue`-based path serves as the fallback for: + +- `SqlBackend` when `lookupDirectDecoder` returns `Nothing` +- Entities not generated with `mpsDirectEnvTypes` +- Any backend that doesn't populate `connDirectQueryCap` + +Beyond the fallback, `PersistField` is baked into `PersistEntity` itself– the existing TH-generated `fromPersistValues`/`toPersistFields` use it, and migration/schema introspection depends on it. + +For custom types, authors write both: + +- `PersistField` (required by `PersistEntity`, used by fallback and existing code) +- `FieldDecode`/`FieldEncode` (used by the direct path for backends they care about) + +For built-in types (`Text`, `Int`, `Bool`, `UTCTime`, etc.), persistent core provides `PersistField`, and the backend packages provide `FieldDecode`/`FieldEncode`. TH generates `FromRow` and `DirectEntity`. Users of standard types see no additional work. + +For custom types that don't have native `FieldDecode` instances, a bridge helper can decode via `PersistField` at the cost of a per-field `PersistValue` allocation for that field only– the rest of the row still decodes directly. diff --git a/persistent-postgresql-ng/ARCHITECTURE.md b/persistent-postgresql-ng/ARCHITECTURE.md new file mode 100644 index 000000000..515d0ad2b --- /dev/null +++ b/persistent-postgresql-ng/ARCHITECTURE.md @@ -0,0 +1,408 @@ +# persistent-postgresql-ng Architecture + +A PostgreSQL backend for the [persistent](https://hackage.haskell.org/package/persistent) library that uses the **binary wire protocol** and **libpq pipeline mode** to reduce round-trips and improve throughput. + +## Motivation + +The standard `persistent-postgresql` backend sends every operation as a synchronous request-response pair over the text protocol via `postgresql-simple`. 
For workloads that issue many small DML statements (deletes, updates, replaces) this means one network round-trip per operation.
+
+`persistent-postgresql-ng` changes two things:
+
+1. **Binary protocol**– parameters and results use PostgreSQL's binary format via `postgresql-binary`, avoiding text serialization overhead.
+2. **Automatic pipelining**– DML operations are sent eagerly without waiting for a response. Results are deferred until a read operation or transaction commit forces them to be consumed.
+
+The pipelining design is inspired by [Hedis](https://hackage.haskell.org/package/hedis), where commands are sent eagerly and replies are read lazily. Instead of lazy IO, this library uses a pending-result counter with explicit drain points.
+
+## Module Overview
+
+```
+Database.Persist.Postgresql
+├── Pipeline.hs          -- SqlBackend construction, statement execution, pipeline lifecycle
+├── Pipeline/
+│   ├── Internal.hs      -- PgConn type, libpq wrappers, pipeline mode primitives
+│   └── FFI.hs           -- C bindings for chunked-row mode detection
+├── Internal.hs          -- Escape functions, PgInterval, re-exports from Migration
+├── Internal/
+│   ├── Decoding.hs      -- Binary result column → PersistValue decoding
+│   ├── Encoding.hs      -- PersistValue → binary parameter encoding
+│   ├── DirectDecode.hs  -- PgRowEnv, FieldDecode instances, compositeFieldDecode
+│   ├── DirectEncode.hs  -- PgParam ADT, FieldEncode instances
+│   ├── PgCodec.hs       -- PgDecode/PgEncode classes, PgDecoder/PgEncoder reader types, DSL
+│   ├── PgType.hs        -- OID classification (PgType/PgScalar), OidCache
+│   ├── Migration.hs     -- DDL migration logic (adapted from persistent-postgresql)
+│   └── Placeholders.hs  -- ? → $1,$2,... placeholder rewriting
+├── JSON.hs              -- JSON column support
+└── CustomType.hs        -- Custom PostgreSQL type support
+```
+
+## Connection Lifecycle
+
+### Opening
+
+`openPgConn` (in `Connection.hs`) does four things:
+
+1. Calls `LibPQ.connectdb` to establish a TCP/Unix socket connection.
+2. 
Queries the server version (via `LibPQ.serverVersion` or `SHOW server_version` as fallback). +3. **Enables nonblocking mode** via `LibPQ.setnonblocking`– required to prevent deadlock in pipeline mode (see below). +4. **Enters pipeline mode** via `LibPQ.enterPipelineMode`– this stays on for the lifetime of the connection. + +The result is a `PgConn`: + +```haskell +data PgConn = PgConn + { pgConn :: !LibPQ.Connection + , pgVersion :: !(NonEmpty Word) + , pgPending :: !(IORef Int) + , pgFetchMode :: !FetchMode + , pgOidCache :: !(IORef OidCache) + } +``` + +- `pgPending` tracks the number of fire-and-forget query results that have been sent but not yet read. +- `pgFetchMode` controls how result rows are fetched: `FetchAll` (default), `FetchSingleRow`, or `FetchChunked n` (PG17+ libpq). +- `pgOidCache` maps dynamically-assigned OIDs (composites, enums, domains) to `PgType` values. Starts empty; populated at connection time or on first encounter. Used by `FieldDecode` instances for OID validation and by `PgEncode` for composite/enum OID lookup. + +### Closing + +`closePgConn` drains any pending results (via `pipelineSync` + drain), exits pipeline mode, and calls `LibPQ.finish`. + +### Nonblocking Mode and Buffer Management + +The connection is set to nonblocking mode before entering pipeline mode. This is critical: in blocking mode, `LibPQ.flush` blocks until the entire send buffer is written to the socket. If the server's send buffer is also full (because we haven't consumed its results), both sides block and deadlock. + +In nonblocking mode, `LibPQ.flush` returns `FlushWriting` when the socket buffer is full. Our `pgFlush` handles this by: + +1. Calling `threadWaitWrite` (GHC's I/O manager) to wait for socket writability without busy-waiting. +2. Calling `consumeInput` to read any pending server data– this prevents the server from blocking on *its* send buffer. +3. Retrying `flush` until `FlushOk`. 
+ +This cooperative flush loop ensures neither client nor server blocks indefinitely, even under heavy pipeline load. + +### Connection Pooling + +Each pooled connection has its own independent `pgPending` counter and lazy reply stream. There is no cross-connection pipelining– operations on one connection do not affect another. + +When a connection is returned to the pool, `closePgConn` drains any pending results before closing. If the pool reuses a connection (via `runSqlPool`), persistent's transaction management (`connCommit` / `connRollback`) ensures the pipeline is clean before the connection is returned. + +## Pipeline Mode + +### How libpq Pipeline Mode Works + +In normal (non-pipeline) mode, `execParams` sends a query and blocks until the server responds. In pipeline mode: + +- **`sendQueryParams`** queues a query in the client's send buffer without waiting for a response. +- **`pipelineSync`** inserts a sync point– the server processes all queued queries and sends back results in order. +- **`sendFlushRequest`** asks the server to flush its output buffer without a full sync point. +- **`getResult`** reads the next result from the connection. + +Each query's result follows a protocol: +- `getResult` → `Just result` (the actual result) +- `getResult` → `Nothing` (NULL separator marking end of that query's results) + +A `pipelineSync` result is different: +- `getResult` → `Just result` with status `PipelineSync` (no NULL separator follows) + +### Automatic Pipelining Strategy + +Pipeline mode is **always on**. Operations fall into three categories: + +**Hedis-style lazy reads** (`pipelinedGet`, `pipelinedInsert`, `pipelinedGetBy`, `pipelinedCount`, `pipelinedExists`): +Sends the query with `sendQueryParams` into the output buffer (no flush), pops a lazy reply from the reply stream (no IO forced). The result is returned as an `unsafeInterleaveIO` thunk. 
When the caller inspects the value, the thunk fires: flushes the send buffer (sending ALL queued queries in one batch), reads the result. Operations that use this path: + +- `get`, `getBy`, `count`, `exists`– return lazy results +- `insert`– sends INSERT RETURNING, returns lazy `Key` + +This means `mapM get keys` sends all 100 queries before reading any results, achieving **20-29× speedup** at realistic network latencies. + +**Fire-and-forget** (via `stmtExecute` → `execute'`): +Sends the query with `sendQueryParams`, increments `pgPending`, returns immediately. No round-trip. Operations that use this path: + +- `delete`, `update`, `updateWhere`, `deleteWhere` +- `replace`, `insertKey`, `repsert`, `putMany` +- `rawExecute`, `insert_` + +**Conduit-based reads** (via `stmtQuery` → `withStmt'`): +Drains pending fire-and-forget results, then sends the query, pops from the reply stream, and reads the result eagerly (for conduit streaming compatibility). Operations that use this path: + +- `selectList`, `selectFirst`, `selectSourceRes`, `selectKeysRes` +- `rawSql`, `rawQuery` +- `insertMany` (needs RETURNING for multiple keys) + +**Note on `rawExecuteCount`:** persistent's `rawExecuteCount` goes through `stmtExecute`, so it always returns 0 in this backend (the actual affected row count is never read). This affects esqueleto's `deleteCount`, `updateCount`, and `insertSelectWithConflictCount`. The non-count variants (`delete`, `update`, `insertSelectWithConflict`) work correctly since they discard the return value. + +### Lazy Reply Stream + +All pipeline results are read through a single lazy reply stream, inspired by [Hedis](https://www.iankduncan.com/engineering/2026-02-17-archive-redis-pipelining). + +```haskell +data PgConn = PgConn + { ... 
+ , pgReplies :: !(IORef [LibPQ.Result]) -- lazy reply stream + } +``` + +The stream is built at connection time with `unsafeInterleaveIO`: + +```haskell +mkReplyStream :: PgConn -> IO [LibPQ.Result] +mkReplyStream pc = go + where + go = unsafeInterleaveIO $ do + pgFlush pc -- flush send buffer + ret <- readResultAndSep pc -- read one result + NULL separator + rest <- go -- next element (lazy thunk) + return (ret : rest) +``` + +`pgRecvResult` pops using `head`/`tail` (not pattern matching) to keep the cons cell lazy: + +```haskell +pgRecvResult :: PgConn -> IO LibPQ.Result +pgRecvResult pc = atomicModifyIORef (pgReplies pc) (\xs -> (tail xs, head xs)) +``` + +`atomicModifyIORef` is lazy in the function result– neither `head` nor `tail` is evaluated. The IO only fires when the caller forces the returned `LibPQ.Result`. This is the key property that enables automatic pipelining: multiple `pgRecvResult` calls accumulate unevaluated thunks, and the first force triggers a flush that sends all queued queries at once. + +The ordering guarantee: each thunk N is created inside thunk N-1's `unsafeInterleaveIO` body, so results are always read in pipeline order regardless of which thunk the caller forces first. + +### Drain Points + +Results accumulate in the server's output buffer and are read at these drain points: + +1. **`withStmt'` (any read operation)**– calls `drainPending` before executing the read query. +2. **`connCommit`**– drains all pending results plus the COMMIT result, verifying none failed. +3. **`connRollback`**– drains everything to the sync point, ignoring errors. 
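+
+The ordering-and-laziness property of the reply stream can be modeled in miniature without libpq. This toy stands in for `mkReplyStream`/`pgRecvResult`; all names are illustrative, and `putStrLn` plays the role of the caller forcing a result.
+
+```haskell
+import Data.IORef (IORef, newIORef, atomicModifyIORef, modifyIORef')
+import System.IO.Unsafe (unsafeInterleaveIO)
+
+-- Each forced element "reads" the next queued query's result; popping a
+-- reply from the stream performs no IO until the value is forced.
+mkStream :: IORef [String] -> IO [String]
+mkStream buf = go
+  where
+    go = unsafeInterleaveIO $ do
+      q <- atomicModifyIORef buf (\qs -> (drop 1 qs, head qs))
+      rest <- go
+      pure (("result for " ++ q) : rest)
+
+main :: IO ()
+main = do
+  sendBuf <- newIORef ([] :: [String])
+  repliesRef <- newIORef =<< mkStream sendBuf
+  -- "send" three queries; pop three replies without forcing any IO
+  mapM_ (\q -> modifyIORef' sendBuf (++ [q])) ["q1", "q2", "q3"]
+  rs <- mapM (\_ -> atomicModifyIORef repliesRef (\xs -> (tail xs, head xs)))
+             [1 .. 3 :: Int]
+  -- forcing the replies now consumes them in pipeline order:
+  -- "result for q1", "result for q2", "result for q3"
+  mapM_ putStrLn rs
+```
+
+As in the real backend, the `atomicModifyIORef` pop is lazy in both `head` and `tail`, so the pops accumulate thunks and the first force drives the interleaved IO in order.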
+ +### Transaction Lifecycle + +``` +connBegin: sendQueryParams "BEGIN" → increment pgPending (fire-and-forget) + [user DML operations → each increments pgPending] + [user read operations → each drains all pending first] +connCommit: sendQueryParams "COMMIT" → pipelineSync + → drain (N pending + 1 COMMIT) results + → read PipelineSync marker + → throw if any query failed +``` + +BEGIN is pipelined with the first user query– zero extra round-trips for transaction setup. + +### Examples + +**100 deletes then select (fire-and-forget pipelining):** + +```haskell +forM_ keys delete -- 100x sendQueryParams, pgPending = 100 +selectList [] [] -- drainPending (reads 100 results in one pass) + -- then sends SELECT and reads its result +``` + +Without pipelining: 101 round-trips. With pipelining: ~2 round-trips. + +**100 gets (Hedis-style lazy pipelining):** + +```haskell +results <- mapM get keys -- 100x send SELECT (no flush, no read) + -- 100x pop lazy reply from stream +print results -- forces first thunk → flushes ALL 100 queries + -- reads 100 results sequentially (already buffered) +``` + +Without pipelining: 100 round-trips. With pipelining: 1 flush + 100 sequential reads. +At 1ms/direction: **14ms** (pipeline) vs **280ms** (sequential)– **20× faster**. + +**100 inserts (pipelined RETURNING):** + +```haskell +keys <- mapM insert recs -- 100x send INSERT RETURNING (no flush, no read) + -- 100x pop lazy reply +evaluate (length keys) -- forces first thunk → flushes ALL 100 queries + -- reads 100 keys sequentially +``` + +Without pipelining: 100 round-trips. With pipelining: 1 flush + 100 sequential reads. +At 5ms/direction: **41ms** (pipeline) vs **1.2s** (sequential)– **29× faster**. + +### Error Handling + +**Mid-transaction error (during `drainPending`):** +1. All N pending results are drained and the counter is reset to 0. +2. Errors are collected; the first error is thrown. +3. The exception propagates to persistent, which calls `connRollback`. +4. 
Rollback sends ROLLBACK + sync → `drainToSync` consumes everything (including `PipelineAbort` status for queries after the failed one) → connection is clean. + +**Commit-time error:** +1. All N+1 results (pending + COMMIT) are drained, sync marker is consumed. +2. Pipeline is fully clean before the error is thrown. +3. persistent calls `connRollback` → sends into an empty pipeline → clean drain. + +**Pipeline abort state:** After a query error in pipeline mode, PostgreSQL marks subsequent queued queries with `PipelineAbort` status until the next sync point. The `drainNResults` helper handles this by collecting errors and skipping aborted results. The sync in commit/rollback resets the abort state. + +## Binary Protocol + +### Encoding (`Encoding.hs`) + +`encodePersistValue` converts each `PersistValue` variant to a `(Oid, ByteString, Format)` triple using `postgresql-binary` encoders: + +| PersistValue | PostgreSQL Type | OID | +|---|---|---| +| `PersistText` | text | 25 | +| `PersistInt64` | int8 | 20 | +| `PersistDouble` | float8 | 701 | +| `PersistBool` | bool | 16 | +| `PersistDay` | date | 1082 | +| `PersistUTCTime` | timestamptz | 1184 | +| `PersistByteString` | bytea | 17 | +| `PersistRational` | numeric | 1700 | +| `PersistNull` | – | (Nothing) | +| `PersistArray` | typed array | inferred | +| `PersistList` | unknown (JSON text) | 0 | +| `PersistMap` | unknown (JSON text) | 0 | +| `DbSpecific`/`Escaped` | unknown (text format) | 0 | + +`PersistArray` (used by the IN→ANY rewrite) infers the element type from the first non-null element and encodes as a native PostgreSQL array. + +`PersistLiteral_ Unescaped` values are inlined into the SQL text before encoding (see SQL Rewriting below) and should never reach the encoder. + +### Decoding (`Decoding.hs`) + +`decodePersistValue` dispatches on the column OID to the appropriate `postgresql-binary` decoder. 
Covers scalar types, array types (bool[], int8[], text[], timestamptz[], etc.), JSON/JSONB, UUID (binary → hex text), money, interval, and more. Unknown OIDs fall back to `PersistLiteralEscaped` with raw bytes. + +## Direct Decode/Encode Path + +In addition to the `PersistValue`-based path, the backend supports a direct codec path that bypasses `PersistValue` entirely. See the [RFC](../RFC-direct-decode.md) for the full design rationale. + +### Three layers + +| Layer | Type | Scope | Purpose | +|-------|------|-------|---------| +| `FromRow` / `RowReader` | Backend-agnostic (in `persistent` core) | Row | Sequence field decoders across columns | +| `FieldDecode` / `FieldRunner` | Backend-agnostic (in `persistent` core) | Column | Prepare-once OID dispatch, per-row decode | +| `PgDecode` / `PgDecoder` | PostgreSQL-specific (`PgCodec.hs`) | Value | Compose inside arrays/composites | + +### `PgRowEnv`– the row environment + +```haskell +data PgRowEnv = PgRowEnv + { pgResult :: !LibPQ.Result + , pgRow :: !LibPQ.Row + , pgCols :: !(V.Vector (LibPQ.Column, PgType)) + , pgRowCache :: !OidCache + } +``` + +`FieldDecode PgRowEnv` instances inspect `pgCols` to select the right `postgresql-binary` decoder once per result set (via `prepareRow`), then read binary data from `pgResult`/`pgRow` on each row. + +### Prepare-once execution + +`FromRow` exposes `prepareRow` which calls `prepareField` for each column once, captures the `FieldRunner`s in a `RowDecoder` closure, and reuses them across all rows. The per-row loop calls only `runField`– no OID dispatch, no vector lookup, no branching on column types. + +### `PgParam`– encoded parameters + +```haskell +data PgParam + = PgNull + | PgValue {-# UNPACK #-} !LibPQ.Oid !ByteString !LibPQ.Format +``` + +Unpacked ADT replacing `Maybe (Oid, ByteString, Format)`. Converted to libpq's representation via `pgParamToLibPQ` at the send boundary. 
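The conversion at the send boundary is mechanical. A sketch, assuming the `PgParam` type above (libpq's `sendQueryParams` takes one `Maybe (Oid, ByteString, Format)` per parameter, with `Nothing` meaning SQL NULL):

```haskell
import Data.ByteString (ByteString)
import qualified Database.PostgreSQL.LibPQ as LibPQ

-- Unpack the strict ADT into libpq's parameter representation.
pgParamToLibPQ :: PgParam -> Maybe (LibPQ.Oid, ByteString, LibPQ.Format)
pgParamToLibPQ PgNull             = Nothing            -- SQL NULL
pgParamToLibPQ (PgValue oid bs f) = Just (oid, bs, f)  -- typed binary value
```

Keeping `PgParam` unpacked until this point means the common non-null case allocates no `Maybe`/tuple until the last moment before the FFI call.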
+ +### Value-level codecs (`PgCodec.hs`) + +For compound types (arrays, composites), `PgDecode` / `PgEncode` compose through PostgreSQL's binary wire format: + +```haskell +newtype PgDecoder a = PgDecoder { runPgDecoder :: OidCache -> PD.Value a } + +class PgDecode a where pgDecoder :: PgDecoder a +class PgEncode a where pgEncoder :: PgEncoder a; pgTypeOid' :: OidCache -> Proxy a -> Word32 +``` + +DSL: `pgValue`, `pgComposite`, `pgField`, `pgFieldNullable`, `pgArray`, `pgArrayNullable`. Generic `FieldDecode PgRowEnv [a]` instance via `PgDecode`. + +### `SqlBackend` bridge (`DirectEntity`) + +`SqlBackend` stores a `DirectQueryCap` that existentially hides `PgRowEnv`. Entities with `DirectEntity` instances use `Typeable`/`eqTypeRep` to recover the concrete env at query time. See `rawSqlDirectCompat` in `DirectRaw.hs`. + +### `HasDirectQuery`– concrete backend path + +For code that retains the concrete backend type (e.g. `WriteBackend PostgreSQLBackend`), `HasDirectQuery` provides zero-overhead static dispatch with no `Typeable` involved. The `SqlBackend` bridge is only for code that flows through the erased `SqlBackend` type. + +## SQL Rewriting + +Three transformations happen between persistent's generated SQL and what gets sent to PostgreSQL: + +### 1. Unescaped Literal Inlining (`inlineUnescaped`) + +`PersistLiteral_ Unescaped` values are raw SQL fragments (e.g., `EXCLUDED."field_name"`). These can't be sent as bind parameters– they're spliced directly into the SQL text, and removed from the parameter list. + +### 2. IN → ANY Collapsing (`collapseInClauses`) + +Rewrites: +- `IN (?,?,?)` → `= ANY(?)` with parameters collapsed into a single `PersistArray` +- `NOT IN (?,?,?)` → `<> ALL(?)` with the same collapsing + +This reduces the number of bind parameters and lets PostgreSQL use its native array comparison operators. The rewriter is SQL-aware: it skips string literals, quoted identifiers, and comments. + +### 3. 
Placeholder Rewriting (`rewritePlaceholders`) + +Converts `?` placeholders to `$1, $2, ...` numbered parameters as required by libpq's `sendQueryParams`. `??` (persistent's column-expansion escape) becomes a literal `?`. + +## Pipeline Helpers + +Six internal functions implement the libpq pipeline result protocol: + +| Function | Purpose | +|---|---| +| `drainOneResult` | Read one result + NULL separator, free it, return error if any | +| `readOneQueryResult` | Read one result + NULL separator, return it (caller frees), throw on error | +| `drainNResults` | Drain N results collecting errors, does NOT throw | +| `drainSyncResult` | Read PipelineSync result (no NULL separator after it) | +| `drainToSync` | Drain everything until PipelineSync, ignoring all errors | +| `drainPending` | Flush + drain all pending fire-and-forget results, throw if any failed | + +## API Surface + +### Drop-in replacement for `persistent-postgresql` + +```haskell +createPostgresqlPipelinePool :: ConnectionString -> Int -> m (Pool SqlBackend) +withPostgresqlPipelinePool :: ConnectionString -> Int -> (Pool SqlBackend -> m a) -> m a +withPostgresqlPipelineConn :: ConnectionString -> (SqlBackend -> m a) -> m a + +getPipelineConn :: backend -> Maybe PgConn +createRawPostgresqlPipelinePool :: ConnectionString -> Int -> m (Pool (RawPostgresqlPipeline SqlBackend)) +``` + +All standard persistent operations (`insert`, `get`, `selectList`, `delete`, `update`, `upsert`, etc.) work transparently. No user code changes required. 
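For orientation, a minimal end-to-end sketch. This is illustrative only: `User` is a hypothetical entity assumed to be defined elsewhere via the usual `mkPersist`/`persistLowerCase` quasiquote, and the connection string is a placeholder; only the pool function comes from this package:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Logger (runNoLoggingT)
import Database.Persist
import Database.Persist.Sql (runSqlPool)

main :: IO ()
main = runNoLoggingT $
  withPostgresqlPipelinePool "host=localhost dbname=app" 10 $ \pool ->
    flip runSqlPool pool $ do
      -- ordinary persistent code; the gets below pipeline automatically
      keys  <- selectKeysList ([] :: [Filter User]) [LimitTo 100]
      users <- mapM get keys          -- 100 SELECTs, one flush
      liftIO (print (length users))
```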
+ +### Direct decode/encode (zero-`PersistValue` path) + +For code with the concrete backend type: + +```haskell +rawQueryDirect :: (FromRow (Env backend) a, HasDirectQuery backend) + => Text -> ParamBuilder (Param backend) -> ReaderT backend m (Acquire (ConduitM () a m ())) +rawSqlDirect :: (FromRow (Env backend) a, HasDirectQuery backend) + => Text -> ParamBuilder (Param backend) -> ReaderT backend m [a] +``` + +For code through `SqlBackend`: + +```haskell +rawSqlDirectCompat :: (DirectEntity a) + => Text -> [PersistValue] -> ReaderT SqlBackend m (Maybe [a]) +``` + +Backend-specific codec modules: + +```haskell +-- Re-exported from Database.Persist.Postgresql.Internal.DirectDecode +PgRowEnv (..) -- row environment +compositeFieldDecode -- one-liner for composite FieldDecode instances + +-- Re-exported from Database.Persist.Postgresql.Internal.PgCodec +PgDecode (..), PgEncode (..) -- value-level codec classes +pgValue, pgComposite, pgField, pgFieldNullable, pgArray, pgArrayNullable -- decode DSL +pgConst, pgEncodeField, pgArrayEncoder, pgCompositeEncoder -- encode DSL +``` diff --git a/persistent-postgresql-ng/README.md b/persistent-postgresql-ng/README.md new file mode 100644 index 000000000..f32390eba --- /dev/null +++ b/persistent-postgresql-ng/README.md @@ -0,0 +1,158 @@ +# persistent-postgresql-ng + +A PostgreSQL backend for [persistent](https://hackage.haskell.org/package/persistent) that uses the **binary wire protocol** and **libpq pipeline mode**. + +Mostly a drop-in replacement for `persistent-postgresql`. All standard persistent operations work without code changes aside from type signatures and import changes. + +## What's different + +| Feature | persistent-postgresql | persistent-postgresql-ng | +|---------|----------------------|--------------------------| +| Wire protocol | Text (via postgresql-simple) | Binary (via postgresql-binary) | +| Automatic pipelining | No | Yes– Hedis-style lazy reply stream | +| Bulk insert | `INSERT ... 
VALUES (?,?,...), (?,?,...), ...` | `INSERT ... SELECT * FROM UNNEST($1::type[], ...)` | +| IN clauses | `IN (?,?,?,...)` | `= ANY($1)` | +| Direct decode path | No | Yes– zero `PersistValue` allocation | +| Result fetch modes | All-at-once only | All-at-once, single-row, chunked (PG17+) | + +## Benchmarks + +Measured against `persistent-postgresql` on the same PostgreSQL 16 instance. Three network conditions: localhost (0ms), 1ms added latency per direction (2ms RTT), and 5ms per direction (10ms RTT). + +Latency was introduced using a TCP delay proxy (`bench/delay-proxy.py`). + +### 0ms latency (localhost, TCP loopback) + +![Benchmark: 0ms latency](bench/bench-0ms.svg) + + +| Benchmark | pipeline | simple | speedup | +|-----------|----------|--------|---------| +| **get ×100 (pipelined reads)** | 1.7ms | 4.7ms | **2.8×** | +| **insert ×100 (pipelined RETURNING)** | 10.8ms | 12.8ms | 1.2× | +| **upsert ×100 (pipelined RETURNING)** | 8.9ms | 12.7ms | **1.4×** | +| insertMany ×1000 (UNNEST) | 5.3ms | 14.1ms | **2.7×** | +| delete ×100 then select | 4.5ms | 7.5ms | **1.7×** | +| mixed DML ×100 then select | 14.6ms | 29.9ms | **2.0×** | +| selectList ×100 | 8.6ms | 11.2ms | 1.3× | + +At zero latency, the advantage comes from the binary protocol and UNNEST-based bulk inserts. Individual `get` and `insert` are comparable because round-trip time is negligible. 
+ +### 1ms latency per direction (2ms RTT, nearby datacenter) + +![Benchmark: 1ms latency](bench/bench-1ms.svg) + + +| Benchmark | pipeline | simple | speedup | +|-----------|----------|--------|---------| +| **get ×100 (pipelined reads)** | **11ms** | 310ms | **28×** | +| **insert ×100 (pipelined RETURNING)** | **13ms** | 314ms | **24×** | +| **upsert ×100 (pipelined RETURNING)** | **13ms** | 321ms | **25×** | +| insertMany ×1000 (UNNEST) | 8.6ms | 31.0ms | **3.6×** | +| selectList ×100 | 16.6ms | 25.8ms | **1.6×** | +| select IN ×20 | 17.4ms | 24.8ms | **1.4×** | + +With even modest latency, the automatic pipelining dominates. `mapM get keys`, `mapM insert records`, and `forM_ records upsert` all send queries before reading results– one flush instead of 100 round-trips. + +### 5ms latency per direction (10ms RTT, cross-region) + +![Benchmark: 5ms latency](bench/bench-5ms.svg) + + +| Benchmark | pipeline | simple | speedup | +|-----------|----------|--------|---------| +| **get ×100 (pipelined reads)** | **50ms** | 1.19s | **24×** | +| **insert ×100 (pipelined RETURNING)** | **41ms** | 1.20s | **29×** | +| insertMany ×1000 (UNNEST) | 22.8ms | 72.6ms | **3.2×** | +| selectList ×100 | 47.9ms | 74.0ms | **1.5×** | +| select IN ×20 | 44.1ms | 70.3ms | **1.6×** | + +The speedup scales linearly with latency. At 10ms RTT, 100 sequential round-trips cost 1000ms minimum. The pipeline pays one RTT for the flush and reads all 100 results from the server's already-buffered responses. + +### Attributing the speedup: binary protocol vs pipelining + +The improvements come from three independent sources. The 0ms column isolates the binary protocol effect (pipelining has no benefit when round-trips are free). The 1ms column shows the combined effect, and the difference reveals the pipelining contribution. 
+ +| Benchmark | 0ms: pipeline / simple | 1ms: pipeline / simple | Source of speedup | +|-----------|:---:|:---:|---| +| **get ×100** | 1.7ms / 4.7ms (2.8×) | 11ms / 310ms (**28×**) | 0ms: binary decode. 1ms: **Hedis-style lazy pipelining** (100 queries in 1 flush) | +| **insert ×100** | 10.8ms / 12.8ms (1.2×) | 13ms / 314ms (**24×**) | 0ms: binary encode. 1ms: **lazy RETURNING pipelining** | +| **delete ×100** | 8.4ms / 12.9ms (1.5×) | 25ms / 592ms (**24×**) | 0ms: binary protocol. 1ms: **fire-and-forget pipelining** | +| **update ×100** | 8.3ms / 12.5ms (1.5×) | 25ms / 555ms (**22×**) | 0ms: binary protocol. 1ms: **fire-and-forget pipelining** | +| **replace ×100** | 11.1ms / 11.5ms (1.0×) | 27ms / 602ms (**22×**) | 0ms: ~neutral. 1ms: **fire-and-forget pipelining** | +| **insertMany ×1000** | 7.2ms / 16.7ms (2.3×) | 8.6ms / 31.0ms (**3.6×**) | 0ms: **UNNEST** (1 query vs N). 1ms: UNNEST + fewer round-trips | +| **selectList ×100** | 13.5ms / 15.6ms (1.2×) | 16.6ms / 25.8ms (**1.6×**) | 0ms: binary decode. 1ms: binary + pipelined setup | +| **upsert ×100** | 8.9ms / 12.7ms (1.4×) | 13ms / 321ms (**25×**) | 0ms: binary protocol. 1ms: **lazy RETURNING pipelining** | +| **deleteWhere ×100** | 90ms / 99ms (1.1×) | 119ms / 750ms (**6.3×**) | 0ms: ~neutral. 1ms: **fire-and-forget pipelining** | + +**Summary of sources:** + +| Source | Typical gain at 0ms | Typical gain at 1ms/dir | +|--------|:---:|:---:| +| Binary protocol (encode/decode) | 1.2-2.8× | 1.2-2.8× | +| UNNEST bulk insert | 2.3× | 3.6× | +| Fire-and-forget DML pipelining | 1.0× | 20-24× | +| Hedis-style lazy pipelining (get, insert, upsert) | 1.0× | 24-28× | +| Combined (best case) | 2.8× | **28×** | + +The binary protocol provides a constant-factor improvement regardless of latency. Pipelining provides a latency-proportional improvement that dominates at any non-zero network distance. 
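As a sanity check on the scaling claim, the tables fit a simple cost model: a sequential client pays one round trip per query, while the pipeline pays roughly one round trip total plus the same server-side work. Illustrative only; the `work` constants below are fitted to the tables above, not independently measured:

```haskell
-- Back-of-envelope latency model (times in ms).
sequentialMs :: Double -> Int -> Double -> Double
sequentialMs rtt n work = fromIntegral n * rtt + work

pipelinedMs :: Double -> Double -> Double
pipelinedMs rtt work = rtt + work

-- At 10ms RTT (5ms/direction), get x100:
--   sequentialMs 10 100 190 = 1190   -- matches the 1.19s measured above
--   pipelinedMs  10 40      = 50     -- matches the 50ms measured above
```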
+ +### Running benchmarks + +```bash +# Baseline (direct connection) +stack bench persistent-postgresql-ng + +# With artificial latency via TCP proxy +python3 bench/delay-proxy.py 15432 localhost 5432 1 & # 1ms per direction +PGPORT=15432 PGHOST=127.0.0.1 stack bench persistent-postgresql-ng +kill %1 + +# With system-level latency (macOS, requires root) +sudo bench/run-with-latency.sh 1 # 1ms via dummynet +``` + +## Automatic pipelining (Hedis-style) + +All read operations (`get`, `getBy`, `insert` with RETURNING, `count`, `exists`) use a [Hedis-style](https://www.iankduncan.com/engineering/2026-02-17-archive-redis-pipelining) lazy reply stream for automatic optimal pipelining. No API changes are required– standard persistent code like `mapM get keys` is automatically pipelined. + +The technique: + +1. At connection time, an infinite lazy list of server replies is created using `unsafeInterleaveIO`. Each element, when forced, flushes the send buffer and reads one result. +2. Each command **sends** eagerly (writes to the output buffer) and **receives** lazily (pops an unevaluated thunk from the reply list via `atomicModifyIORef`). +3. The actual network read happens when the caller inspects the result value. If 100 `get` calls are sequenced before any result is inspected, all 100 queries are sent in one flush and results are read sequentially from the server's response buffer. + +The ordering guarantee comes from the lazy list structure: each thunk N is created inside thunk N-1's `unsafeInterleaveIO` body, so replies are always read in pipeline order regardless of evaluation order. + +Write operations (`delete`, `update`, `replace`, `deleteWhere`, `updateWhere`) remain fire-and-forget– they send the query and don't read the result until a subsequent read operation (or transaction commit) drains them. + +## Direct decode path + +In addition to the standard `PersistValue`-based path, the backend supports a direct codec path that bypasses `PersistValue` entirely. 
See the [RFC](../RFC-direct-decode.md) for full design details.
+
+```haskell
+-- Switch one import to opt in:
+import Database.Persist.Sql.Experimental -- instead of Database.Persist.Sql
+```
+
+For code with the concrete backend type (zero overhead, full specialization):
+
+```haskell
+rawSqlDirect
+  "SELECT name, age FROM users WHERE age > $1"
+  (writeParam (18 :: Int))
+  :: ReaderT (WriteBackend PostgreSQLBackend) m [(Text, Int64)]
+```
+
+For code through `SqlBackend` (uses `DirectEntity` + `Typeable` bridge):
+
+```haskell
+rawSqlDirectCompat
+  "SELECT name, age FROM users WHERE age > $1"
+  [toPersistValue (18 :: Int)]
+  :: ReaderT SqlBackend m (Maybe [(Text, Int64)])
+```
+
+## Architecture
+
+See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed internals: pipeline mode, binary protocol, connection lifecycle, error handling, and the direct decode/encode layer.
diff --git a/persistent-postgresql-ng/bench/bench-0ms.svg b/persistent-postgresql-ng/bench/bench-0ms.svg new file mode 100644 index 000000000..636fc0446 --- /dev/null +++ b/persistent-postgresql-ng/bench/bench-0ms.svg @@ -0,0 +1,1823 @@
+[SVG benchmark chart (0ms latency), generated by Matplotlib v3.9.4 on 2026-02-19; markup omitted]
diff --git a/persistent-postgresql-ng/bench/bench-1ms.svg b/persistent-postgresql-ng/bench/bench-1ms.svg new file mode 100644 index 000000000..7ffc1d4bf --- /dev/null +++ b/persistent-postgresql-ng/bench/bench-1ms.svg @@ -0,0 +1,2173 @@
+[SVG benchmark chart (1ms latency), generated by Matplotlib v3.9.4 on 2026-02-19; markup omitted]
diff --git a/persistent-postgresql-ng/bench/bench-5ms.svg b/persistent-postgresql-ng/bench/bench-5ms.svg new file mode 100644 index 000000000..727075deb --- /dev/null +++ b/persistent-postgresql-ng/bench/bench-5ms.svg @@ -0,0 +1,1871 @@
+[SVG benchmark chart (5ms latency), generated by Matplotlib v3.9.4 on 2026-02-19; markup omitted]
diff --git 
a/persistent-postgresql-ng/bench/gen-charts.py b/persistent-postgresql-ng/bench/gen-charts.py new file mode 100644 index 000000000..2a7982513 --- /dev/null +++ b/persistent-postgresql-ng/bench/gen-charts.py @@ -0,0 +1,101 @@
+#!/usr/bin/env python3
+"""Generate benchmark comparison SVGs for the README."""
+
+import matplotlib
+matplotlib.use('Agg')
+import matplotlib.pyplot as plt
+import numpy as np
+
+# Color scheme
+COLOR_OLD = '#6c757d'  # gray for persistent-postgresql
+COLOR_NEW = '#0d6efd'  # blue for persistent-postgresql-ng
+
+def make_chart(title, benchmarks, old_times, new_times, filename, unit='ms'):
+    fig, ax = plt.subplots(figsize=(12, 6))
+
+    x = np.arange(len(benchmarks))
+    width = 0.35
+
+    bars_old = ax.bar(x - width/2, old_times, width, label='persistent-postgresql', color=COLOR_OLD)
+    bars_new = ax.bar(x + width/2, new_times, width, label='persistent-postgresql-ng', color=COLOR_NEW)
+
+    ax.set_ylabel(f'Time ({unit})', fontsize=12)
+    ax.set_title(title, fontsize=14, fontweight='bold')
+    ax.set_xticks(x)
+    ax.set_xticklabels(benchmarks, rotation=25, ha='right', fontsize=10)
+    ax.legend(fontsize=11)
+    ax.grid(axis='y', alpha=0.3)
+
+    # Add speedup labels on the new bars
+    for i, (old, new) in enumerate(zip(old_times, new_times)):
+        if old > 0 and new > 0:
+            speedup = old / new
+            if speedup >= 1.3:
+                # One decimal below 10x so e.g. a 1.4x speedup isn't rounded to "1x"
+                label = f'{speedup:.1f}x' if speedup < 10 else f'{speedup:.0f}x'
+                ax.annotate(label,
+                            xy=(x[i] + width/2, new),
+                            xytext=(0, 5), textcoords='offset points',
+                            ha='center', fontsize=9, fontweight='bold', color='#0a58ca')
+
+    fig.tight_layout()
+    fig.savefig(filename, format='svg', bbox_inches='tight')
+    plt.close(fig)
+    print(f'Wrote {filename}')
+
+
+# --- 0ms latency ---
+benchmarks_0ms = [
+    'get x100',
+    'insert x100',
+    'upsert x100',
+    'delete x100',
+    'update x100',
+    'insertMany x1000',
+    'selectList x100',
+    'mixed DML x100',
+]
+old_0ms = [4.7, 12.8, 12.7, 12.9, 12.5, 14.1, 11.2, 29.9]
+new_0ms = [1.7, 10.8, 8.9, 9.6, 9.4, 5.3, 8.6, 14.6]
+
+make_chart(
+    'Benchmark: 0ms latency (localhost)',
+    
benchmarks_0ms, old_0ms, new_0ms, + 'persistent-postgresql-ng/bench/bench-0ms.svg' +) + +# --- 1ms latency --- +benchmarks_1ms = [ + 'get x100', + 'insert x100', + 'upsert x100', + 'delete x100', + 'update x100', + 'replace x100', + 'insertMany x1000', + 'selectList x100', + 'deleteWhere x100', +] +old_1ms = [310, 314, 321, 592, 555, 602, 31, 25.8, 750] +new_1ms = [ 11, 13, 13, 25, 25, 27, 8.6, 16.6, 119] + +make_chart( + 'Benchmark: 1ms latency per direction (2ms RTT)', + benchmarks_1ms, old_1ms, new_1ms, + 'persistent-postgresql-ng/bench/bench-1ms.svg' +) + +# --- 5ms latency --- +benchmarks_5ms = [ + 'get x100', + 'insert x100', + 'insertMany x1000', + 'selectList x100', + 'select IN x20', +] +old_5ms = [1190, 1200, 72.6, 74.0, 70.3] +new_5ms = [ 50, 41, 22.8, 47.9, 44.1] + +make_chart( + 'Benchmark: 5ms latency per direction (10ms RTT)', + benchmarks_5ms, old_5ms, new_5ms, + 'persistent-postgresql-ng/bench/bench-5ms.svg' +)