
Appender rework #295

Merged
staticlibs merged 1 commit into duckdb:main from staticlibs:appender_datachunk
Jul 8, 2025

Conversation

@staticlibs
Collaborator

@staticlibs staticlibs commented Jun 29, 2025

This change implements access to the Appender interface from Java with the following features:

  • the C API is used to access the native Appender
  • the necessary C API calls are exposed to Java through JNI wrappers that are as thin as possible: Java calls mirror the corresponding C API calls one to one
  • the data chunk interface of the Appender API is used: vector data is exposed as a direct ByteBuffer, and all primitive appended values are written to this buffer from Java without going through JNI and the C API (which is still used for some types, with calls like duckdb_vector_assign_string_element_len)
  • the Java-side Appender/DataChunk/Vector data structures closely follow the Go Appender implementation (including nested array initialization etc.)
  • Appender usage from Java is thread-safe with respect to concurrent Appender or Connection closure; append() calls remain non-thread-safe (to minimize overhead), but their use cannot cause a JNI-side crash
  • the Java API of the new Appender is modeled after the java.lang.StringBuilder class and is intended to be used with method chaining
  • primitive arrays (one- and two-dimensional) are supported
  • NULL and DEFAULT values are supported
  • type checks between Java types and DB column types are enforced
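
To make the direct-ByteBuffer point above more concrete, here is a minimal, self-contained sketch of how primitive values and NULL flags can be written straight into off-heap memory from Java with no per-value native call. It is not the driver's code: `DirectVectorSketch`, `IntVector` and the byte-wise validity bitmask are hypothetical names and simplifications, and the buffers are allocated on the Java side here, whereas in the PR the memory belongs to the native vector.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; none of these names come from the driver.
final class DirectVectorSketch {

    /** Stand-in for one INTEGER column of a data chunk. */
    static final class IntVector {
        private final int capacity;
        private final ByteBuffer data;     // 4 bytes per row, native byte order
        private final ByteBuffer validity; // 1 bit per row: 1 = valid, 0 = NULL (assumed layout)

        IntVector(int capacity) {
            this.capacity = capacity;
            this.data = ByteBuffer.allocateDirect(capacity * Integer.BYTES)
                                  .order(ByteOrder.nativeOrder());
            this.validity = ByteBuffer.allocateDirect((capacity + 7) / 8);
            // All rows start out as valid.
            for (int i = 0; i < validity.capacity(); i++) {
                validity.put(i, (byte) 0xFF);
            }
        }

        /** Writes a value with an absolute put; no JNI call is involved. */
        void set(int row, int value) {
            checkRow(row);
            data.putInt(row * Integer.BYTES, value);
        }

        /** Clears the validity bit of the row to mark it as NULL. */
        void setNull(int row) {
            checkRow(row);
            int idx = row / 8;
            validity.put(idx, (byte) (validity.get(idx) & ~(1 << (row % 8))));
        }

        int get(int row) {
            checkRow(row);
            return data.getInt(row * Integer.BYTES);
        }

        boolean isNull(int row) {
            checkRow(row);
            return (validity.get(row / 8) & (1 << (row % 8))) == 0;
        }

        private void checkRow(int row) {
            if (row < 0 || row >= capacity) {
                throw new IndexOutOfBoundsException("row " + row);
            }
        }
    }

    public static void main(String[] args) {
        IntVector v = new IntVector(4);
        v.set(0, 42);
        v.set(1, -7);
        v.setNull(2);
        // prints "42 -7 true false"
        System.out.println(v.get(0) + " " + v.get(1) + " " + v.isNull(2) + " " + v.isNull(3));
    }
}
```

The bounds check in `checkRow` reflects the same concern as the crash-safety bullet: a bad index should surface as a Java exception rather than as a write outside the buffer.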

Note: the previous version of the Appender (which internally creates Value instances and appends them one by one) is still available for compatibility. It can be created using the Connection#createSingleValueAppender method. It is marked as deprecated and is intended to be removed in a future version.

Testing: the existing Appender test suite is extended and adapted to the new Appender API.

Fixes: #84, #100, #110, #139, #157, #163, #219, #249

@staticlibs staticlibs marked this pull request as draft June 29, 2025 23:04
@staticlibs staticlibs force-pushed the appender_datachunk branch from a02716b to 64386b7 Compare July 3, 2025 15:56
@arouel
Contributor

arouel commented Jul 7, 2025

@staticlibs would it be feasible to use the Foreign Function and Memory (FFM) API instead of the Java Native Interface (JNI)? With jextract you could easily generate the mapping to the C API. I bring this up because in a future release interoperation with native code will be disallowed by default. The FFM API is the preferred alternative to JNI.

@staticlibs
Collaborator Author

@arouel

The main target for the JDBC driver is Java 8. Access to the DuckDB engine through FFM can be introduced in the future as an optional alternative, but JNI remains the main approach for the foreseeable future. Also, the upcoming restrictions on native code are likely to be applied (by the JDK) to both JNI and FFM in the same way (because they have almost the same problems with app "integrity"), and both JNI and FFM are likely to use the same flags/manifests to lift the restrictions that are added.

Contributor

@arouel arouel left a comment


Is it the plan to keep the current method names of the new DuckDBAppender as they were before? If possible, I would prefer that we change the naming pattern so that one can understand which database type we append to, e.g. appendBlob(byte[] values), appendString(byte[] value), appendTimestamp(Instant value) etc.

If we keep the naming pattern as is, we keep the confusion about which type we map to.

@arouel
Contributor

arouel commented Jul 7, 2025

> @arouel
>
> The main target for the JDBC driver is Java 8. Access to DuckDB engine through FFM can be introduced in future as an optional alternative, but JNI remains the main approach for any foreseeable future. And upcoming restrictions to native code are likely to be applied (by the JDK) to both JNI and FFM the same way (because they have almost the same problems with app "integrity"). And both JNI and FFM are likely to use the same flags/manifests to disable the restrictions that are added.

Ok, I get it, when you target Java 8 specifically.

@staticlibs
Collaborator Author

@arouel

> Is it the plan to keep the current method names

The new Appender follows the StringBuilder approach with an overloaded append() method. When the same Java type can map to different DB types, special names such as appendDecimal() are used.
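
The naming rule described in this reply can be sketched with a small self-contained toy class. This is not the driver's `DuckDBAppender`: `ToyAppender`, `ColType`, `rowCount()` and in particular the `appendDecimal(long)` parameter shape are invented for illustration. It only shows the pattern: overloaded `append()` methods that return `this` for chaining, a specially named method where one Java type (`long` here) could target two column types, and a type check against the declared column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the driver's API.
final class ChainedAppenderSketch {

    enum ColType { BIGINT, VARCHAR, DECIMAL }

    static final class ToyAppender {
        private final ColType[] columns;
        private final List<Object[]> rows = new ArrayList<>();
        private Object[] current; // row being built, or null between rows
        private int col;          // index of the next column to fill

        ToyAppender(ColType... columns) {
            this.columns = columns.clone();
        }

        ToyAppender beginRow() {
            if (current != null) {
                throw new IllegalStateException("row already started");
            }
            current = new Object[columns.length];
            col = 0;
            return this;
        }

        // Overloads: the Java argument type selects the expected column type.
        ToyAppender append(long value) { return put(ColType.BIGINT, value); }
        ToyAppender append(String value) { return put(ColType.VARCHAR, value); }

        // A long could also carry an unscaled decimal, so the DECIMAL target
        // gets its own method name instead of another append() overload.
        ToyAppender appendDecimal(long unscaledValue) {
            return put(ColType.DECIMAL, unscaledValue);
        }

        ToyAppender endRow() {
            if (current == null || col != columns.length) {
                throw new IllegalStateException("incomplete row");
            }
            rows.add(current);
            current = null;
            return this;
        }

        int rowCount() {
            return rows.size();
        }

        // Enforces the Java-type vs. column-type check before storing the value.
        private ToyAppender put(ColType expected, Object value) {
            if (current == null) {
                throw new IllegalStateException("beginRow() was not called");
            }
            if (col >= columns.length) {
                throw new IllegalStateException("too many values in row");
            }
            if (columns[col] != expected) {
                throw new IllegalArgumentException(
                    "column " + col + " is " + columns[col] + ", got " + expected);
            }
            current[col++] = value;
            return this;
        }
    }

    public static void main(String[] args) {
        ToyAppender app = new ToyAppender(ColType.BIGINT, ColType.VARCHAR, ColType.DECIMAL);
        app.beginRow().append(1L).append("foo").appendDecimal(1234L).endRow();
        System.out.println(app.rowCount()); // prints 1
    }
}
```

With this shape, appending a `String` to the `BIGINT` column fails with an `IllegalArgumentException` at the call site, which is the kind of feedback the type-check bullet in the PR description refers to.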

@staticlibs staticlibs force-pushed the appender_datachunk branch from 64386b7 to 2329252 Compare July 8, 2025 15:50
@staticlibs staticlibs force-pushed the appender_datachunk branch from 2329252 to ac024e5 Compare July 8, 2025 18:39
@staticlibs staticlibs marked this pull request as ready for review July 8, 2025 18:42
@staticlibs staticlibs closed this Jul 8, 2025
@staticlibs staticlibs reopened this Jul 8, 2025
@staticlibs staticlibs merged commit 935f58b into duckdb:main Jul 8, 2025
10 checks passed
@staticlibs staticlibs deleted the appender_datachunk branch July 8, 2025 20:23
staticlibs added a commit to staticlibs/duckdb-java that referenced this pull request Jul 8, 2025
This is a backport of the PR duckdb#295 to `v1.3-ossivalis` stable branch.

@staticlibs staticlibs mentioned this pull request Jul 8, 2025
staticlibs added a commit that referenced this pull request Jul 8, 2025
This is a backport of the PR #295 to `v1.3-ossivalis` stable branch.

Development

Successfully merging this pull request may close these issues.

[Appender] Add AppendDefault
