Skip to content

Conversation

@sjruhland
Copy link

Summary

Adds GizmoSQL connection support to the @malloydata/db-duckdb package, following the established pattern for DuckDB variants (similar to MotherDuck and WASM).

Key changes:

  • GizmoSQLConnection class extending DuckDBCommon
  • Python ADBC bridge for Arrow Flight SQL protocol
  • SQL injection protection for catalog identifiers
  • Efficient data transfer via Apache Arrow IPC format

Architecture

GizmoSQL is DuckDB served over Arrow Flight SQL with the same SQL dialect. Following the project's convention, this implementation belongs in db-duckdb alongside other DuckDB transport mechanisms:

@malloydata/db-duckdb/
├── DuckDBConnection       (native bindings)
├── DuckDBWasmConnection  (WASM)
└── GizmoSQLConnection    (Flight SQL)

Implementation Details

Connection Layer:

  • Extends DuckDBCommon for shared dialect and schema introspection
  • Python subprocess bridge handles ADBC Flight SQL communication
  • Apache Arrow for efficient columnar data transfer

Security:

  • Catalog identifier validation prevents SQL injection
  • Credentials managed via secure keystore in VSCode extension

Dependencies:

  • Python: pyarrow, adbc-driver-flightsql
  • Node: apache-arrow (already a dependency)

Testing

Connection tested successfully with:

  • Basic connectivity
  • Query execution
  • Schema introspection
  • Error handling

Adds GizmoSQL connection support following the established pattern for
DuckDB variants (similar to DuckDBWasmConnection).

Key features:
- GizmoSQLConnection class extending DuckDBCommon
- Python ADBC bridge for Arrow Flight SQL protocol communication
- SQL injection protection for catalog identifiers
- Efficient data transfer via Apache Arrow IPC format

Architecture:
GizmoSQL is DuckDB served over Arrow Flight SQL with the same SQL dialect.
Following the project's convention, this implementation lives in db-duckdb
alongside other DuckDB transport mechanisms.

Dependencies:
- Python: pyarrow, adbc-driver-flightsql
- Node: apache-arrow (already a dependency)

Signed-off-by: Schyler Ruhland <[email protected]>
Adds integration tests following the repository's standard pattern:
- Uses describeIfDatabaseAvailable for conditional test execution
- Tests connection, queries, streaming, schema introspection
- Tests error handling and edge cases
- Automatically skipped in CI when credentials unavailable

All 8 tests pass with real GizmoSQL server.

Signed-off-by: Schyler Ruhland <[email protected]>
@sjruhland sjruhland force-pushed the sr/malloy-with-gizmo branch from dc732c4 to d71a27e Compare December 1, 2025 05:29
rowLimit ??= defaultOptions.rowLimit;
await this.setup();

const statements = sql.split('-- hack: split on this');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove the -- hack: split on this delimiter. I believe this isn’t a valid or stable way to split SQL statements and GizmoSQL doesn’t define or rely on this marker anywhere

Comment based splitting is risky

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do but fyi this was borrowed from duckdb_*.ts implementations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Then it's fine :)

const pythonPath = this.config.pythonPath || 'python3';
const bridgeScript = join(__dirname, '../python/gizmosql_bridge.py');

return new Promise<Table>((resolve, reject) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the Gizmo has no default query timeout. Can you consider adding a timeout to prevent hanging on long-running queries?

Gizmo recently added a timeout arg (https://www.linkedin.com/posts/gizmodata_release-v1129-gizmodatagizmosql-activity-7394010812920582145-EIwM/)

stdout += data.toString();
});

python.stderr.on('data', (data) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are only the errors written here? Or even the warnings too?

If there's a chance of warnings being written into stdder then what about the scenario when python may write warnings to stderr even on success?

const uri = process.env['GIZMOSQL_URI'];
const username = process.env['GIZMOSQL_USERNAME'];
const password = process.env['GIZMOSQL_PASSWORD'];
const catalog = process.env['GIZMOSQL_CATALOG'] || 'main';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can u remove 'main' from here and update the Error message below?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created an account and was able to run the follwoing query: SELECT * FROM customer LIMIT 10;

Here's what I found in the Network Tab:
{
"action": "query",
"sql": "SELECT * FROM customer LIMIT 10;",
"parameters": []
}

I didn't have any catalog specified. I believe catalog is optional

Here's another query:
{
"action": "query",
"sql": "\n SELECT column_name, data_type, is_nullable\n FROM information_schema.columns\n WHERE table_schema = ?\n AND table_name = ?\n ORDER BY ordinal_position\n ",
"parameters": [
"main",
"nation"
]
}


it('should execute query with multiple rows', async () => {
const result = await connection.runSQL(
'SELECT * FROM (VALUES (1, \'a\'), (2, \'b\'), (3, \'c\')) AS t(id, name)'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace ' with '

this.uri = options.gizmosqlUri;
this.username = options.gizmosqlUsername;
this.password = options.gizmosqlPassword;
this.catalog = options.gizmosqlCatalog || 'main';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can u remove 'main'?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants