DuckLake slowness with embedded DuckDB server. #289

We have a DuckLake server connected to PostgreSQL and Parquet files on S3, all in AWS (RDS + S3). After tuning the workload, the first execution of a TPC-H query takes about 20 seconds when DuckDB is started directly on the machine. When the same DuckDB process runs inside a JVM, the first execution of the same TPC-H query takes 60 to 70 seconds. Subsequent executions are faster in both cases, with a response time of approximately 3 seconds.
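To make the first-run versus repeat-run comparison reproducible, here is a minimal timing sketch. It is not our actual harness: `ColdWarmTimer` and `timeRuns` are hypothetical names, and the `Thread.sleep` task is only a placeholder for executing the TPC-H query through the embedded DuckDB connection.

```java
import java.util.concurrent.Callable;

public class ColdWarmTimer {

    /**
     * Runs the task `runs` times in sequence and returns the wall-clock
     * duration of each run in milliseconds. Index 0 is the first (cold) run.
     */
    static long[] timeRuns(Callable<?> task, int runs) throws Exception {
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder task (assumption): replace the body with the code that
        // executes the query and fully consumes its result set.
        Callable<Void> query = () -> {
            Thread.sleep(50);
            return null;
        };

        long[] durations = timeRuns(query, 3);
        for (int i = 0; i < durations.length; i++) {
            System.out.println("run " + (i + 1) + ": " + durations[i] + " ms");
        }
    }
}
```

Timing each run separately keeps the first-execution cost visible instead of averaging it away with the later, faster runs.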

Is there a specific setting we need to tune so that DuckDB with DuckLake and Parquet files performs well when running inside a JVM?

<img width="1102" alt="Image" src="https://github.com/user-attachments/assets/cf7812c3-fc7d-4c86-9618-461e336a156e" />

<img width="624" alt="Image" src="https://github.com/user-attachments/assets/76cb5408-16d8-46b7-a511-d1cd596bc987" />

Thanks,
Prakash
