What happens?
Hi Team,
First of all, thanks for all the hard work on DuckDB, it's an amazing product.
From my testing, it seems that DuckDB slows down significantly when querying Parquet data and serving the result as record batches. I'm not sure of the exact cause, but it's usually 2x/3x slower than Polars/PyArrow.
You can reproduce it below with `uv run xx.py`.
To Reproduce
```python
# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "duckdb==1.4.4",
#     "polars==1.38.1",
#     "pyarrow==23.0.1",
# ]
# ///
import time
from contextlib import contextmanager
from itertools import permutations

import duckdb
import polars as pl
import pyarrow as pa
import pyarrow.dataset
import pyarrow.parquet as pq

pa.show_info()

# Create a parquet file with all permutations of 'abcdefghijk'
perms = list(permutations('abcdefghijk'))
print(len(perms))
data = {'permutation': [''.join(p) for p in perms]}
table = pa.table(data)
pq.write_table(table, 'alphabet_test.parquet')

time_taken = {}

@contextmanager
def timer(name):
    start = time.time()
    try:
        yield
    finally:
        elapsed = time.time() - start
        print(f"{name}: {elapsed:.6f}s")
        time_taken[name] = elapsed

# Test DuckDB fetch_arrow_table
with timer("DuckDB fetch_arrow_table"):
    table_duckdb = duckdb.read_parquet('alphabet_test.parquet').fetch_arrow_table(batch_size=100_000)

# Test DuckDB fetch_arrow_reader (record batches)
with timer("DuckDB fetch_record_batch"):
    reader = duckdb.read_parquet('alphabet_test.parquet').fetch_arrow_reader(batch_size=100_000)
    batches_duckdb = []
    for batch in reader:
        batches_duckdb.append(batch)

# Test PyArrow RecordBatchReader
with timer("PyArrow RecordBatchReader"):
    ds = pa.dataset.dataset('alphabet_test.parquet')
    reader = pa.dataset.Scanner.from_dataset(ds, batch_size=100_000).to_reader()
    batches_pyarrow = []
    for batch in reader:
        batches_pyarrow.append(batch)

# Test Polars + PyArrow RecordBatchReader
with timer("Polars + PyArrow RecordBatchReader"):
    polars_batches = pl.scan_parquet('alphabet_test.parquet').collect_batches(chunk_size=100_000)
    reader = pa.RecordBatchReader.from_stream(polars_batches)
    batches_polars = []
    for batch in reader:
        batches_polars.append(batch)

print("\nSummary of time taken:")
for name, elapsed in time_taken.items():
    print(f"{name}: {elapsed:.6f}s")
print(f"DuckDB fetch_record_batch is {time_taken['DuckDB fetch_record_batch'] / time_taken['PyArrow RecordBatchReader']:.2f}x slower than PyArrow RecordBatchReader")
```

OS:
Darwin arm64
DuckDB Version:
1.4.4
DuckDB Client:
python
Hardware:
Apple M4
Full Name:
Valentino Chen
Affiliation:
Personal
Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?
- Yes, I have
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant data sets for reproducing the issue?
No - Other reason (please specify in the issue body)