Description
When EnableBatchedInserts=1 is set as a JDBC connection property, the Databricks JDBC driver strips backtick quoting from column names during its internal SQL reconstruction in PreparedStatementBatchExecutor.executeBatchedInsert(). This causes PARSE_SYNTAX_ERROR for column names containing dots (e.g., col.name), as the unquoted dot is interpreted as a schema/table separator.
Steps to Reproduce
- Connect to Databricks using JDBC with EnableBatchedInserts=1
- Create a table with column names containing dots (e.g., col.name)
- Use PreparedStatement.addBatch() + executeBatch() to insert data into the table
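The steps above can be sketched in plain JDBC, without jOOQ. This is a minimal reproduction sketch, not a verified test case: the table name repro_table, the STRING column types, and the DATABRICKS_JDBC_URL environment variable (assumed to hold a complete JDBC URL including authentication) are placeholders introduced for illustration. Only the two connection properties and the column names come from this report.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

final class BatchedInsertRepro {
    // Hypothetical table name; the column names mirror the ones in this report.
    static final String TABLE = "`repro_table`";
    static final String CREATE_SQL =
        "CREATE TABLE IF NOT EXISTS " + TABLE + " (`col.name` STRING, `col2` STRING)";
    static final String INSERT_SQL =
        "INSERT INTO " + TABLE + " (`col.name`, `col2`) VALUES (?, ?)";

    // Connection properties listed in the Environment section of this report.
    static Properties connectionProperties() {
        Properties props = new Properties();
        props.setProperty("EnableBatchedInserts", "1");
        props.setProperty("BatchInsertSize", "10000");
        return props;
    }

    public static void main(String[] args) throws SQLException {
        // Assumption: the full JDBC URL (host, HTTP path, credentials) is supplied externally.
        String url = System.getenv("DATABRICKS_JDBC_URL");
        if (url == null) {
            throw new IllegalStateException("Set DATABRICKS_JDBC_URL before running");
        }
        try (Connection conn = DriverManager.getConnection(url, connectionProperties());
             Statement ddl = conn.createStatement()) {
            ddl.execute(CREATE_SQL);
            try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
                for (int i = 0; i < 3; i++) {
                    ps.setString(1, "a" + i);
                    ps.setString(2, "b" + i);
                    ps.addBatch();
                }
                // Per this report, the PARSE_SYNTAX_ERROR surfaces on this call.
                ps.executeBatch();
            }
        }
    }
}
```

The INSERT statement handed to prepareStatement() is already backtick-quoted, so any unquoted col.name in the failing SQL would have to come from the driver's rewrite rather than from the caller.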
Expected Behavior
The driver should preserve backtick quoting around column names when reconstructing the batched INSERT SQL:
INSERT INTO `table` (`col.name`, `col2`) VALUES (?, ?), (?, ?), ...
Actual Behavior
The driver strips backtick quoting from column names during SQL reconstruction:
INSERT INTO `table` (col.name, col2) VALUES (?, ?), (?, ?), ...
This results in:
[PARSE_SYNTAX_ERROR] Syntax error at or near 'name'. SQLSTATE: 42601
Environment
- Databricks JDBC Driver v3.x
- Connection properties: EnableBatchedInserts=1, BatchInsertSize=10000
- jOOQ with SQLDialect.DATABRICKS (generates correct backtick-quoted SQL)
Workaround
Currently, the only workaround is to disable EnableBatchedInserts, which results in ~10x slower write performance (e.g., 141s vs 12s for 100 records).
Additional Context
The issue is in the driver's PreparedStatementBatchExecutor.executeBatchedInsert() method, which parses the original INSERT statement, extracts column names (stripping backticks), and reconstructs a multi-row INSERT without re-quoting identifiers that require it.
This affects any column name containing special characters that require quoting (dots, spaces, reserved words, etc.).
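To illustrate the expected behavior, here is a rough sketch of a multi-row INSERT builder that always re-quotes identifiers. It is not the driver's actual code: the class QuotedBatchInsertSketch and both method names are hypothetical, and the sketch assumes an unqualified table name and the usual convention of escaping an embedded backtick by doubling it.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class QuotedBatchInsertSketch {
    // Wraps a raw identifier in backticks, doubling any embedded backtick.
    static String quoteIdentifier(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Builds "INSERT INTO `t` (`c1`, `c2`) VALUES (?, ?), (?, ?), ..." for rowCount rows.
    // Assumption: 'table' is a single unqualified name, so it is quoted as one identifier.
    static String buildMultiRowInsert(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        String columnList = columns.stream()
            .map(QuotedBatchInsertSketch::quoteIdentifier)
            .collect(Collectors.joining(", "));
        // One "(?, ?, ...)" group per batched row.
        String row = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        String values = String.join(", ", Collections.nCopies(rowCount, row));
        return "INSERT INTO " + quoteIdentifier(table) + " (" + columnList + ") VALUES " + values;
    }
}
```

Quoting every column unconditionally avoids having to decide which names need it, so dots, spaces, and reserved words are all handled by the same path.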