Optimize translate() #20302

@neilconway

Is your feature request related to a problem or challenge?

translate() in DataFusion is ~1.8x slower than DuckDB's implementation (per datafusion-benchmarks).

Describe the solution you'd like

The DataFusion implementation of translate() could be optimized as follows:

• If the second and third arguments are both constants (the common case), we can build the lookup map once per batch of inputs, rather than clearing and rebuilding it for every input string in the batch.
• If all arguments are ASCII-only, we can build a fixed-size (128-entry) lookup table that maps ASCII characters directly, rather than using a hash table.
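
To make the first idea concrete, here is a rough sketch of hoisting the map construction out of the per-row loop. This is not the DataFusion code: `build_map`, `translate_with_map`, and `translate_batch` are hypothetical names, the batch is modeled as a plain slice of `&str` rather than an Arrow array, and matching is done per `char` for simplicity (the real implementation may segment strings differently).

```rust
use std::collections::HashMap;

/// Hypothetical helper: build the lookup map from the constant `from`/`to`
/// arguments. `None` means "delete this character", which happens when
/// `from` is longer than `to`.
fn build_map(from: &str, to: &str) -> HashMap<char, Option<char>> {
    let mut map = HashMap::new();
    let mut to_chars = to.chars();
    for f in from.chars() {
        // Replacement is positional, so advance `to` even for duplicates.
        let t = to_chars.next();
        // The first occurrence of a character in `from` wins.
        map.entry(f).or_insert(t);
    }
    map
}

/// Apply an already-built map to one input string.
fn translate_with_map(input: &str, map: &HashMap<char, Option<char>>) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match map.get(&c) {
            Some(Some(r)) => out.push(*r), // mapped to a replacement
            Some(None) => {}               // mapped to "delete"
            None => out.push(c),           // not in `from`: keep as-is
        }
    }
    out
}

/// Hypothetical batch entry point: the map is built once, then reused
/// for every row instead of being cleared and rebuilt per input string.
fn translate_batch(inputs: &[&str], from: &str, to: &str) -> Vec<String> {
    let map = build_map(from, to);
    inputs.iter().map(|s| translate_with_map(s, &map)).collect()
}
```

For example, under these assumptions `translate_batch(&["12345", "abc"], "143", "ax")` builds the map a single time and applies it to both rows.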

Likely other optimizations are possible as well, but implementing those two ideas should be a substantial win.
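
The ASCII fast path from the list above could look roughly like the following. Again, this is only a sketch under assumptions: `build_ascii_table`, `translate_ascii`, and the `DELETE` sentinel are made-up names, and non-ASCII input characters are simply passed through here, whereas the proposal itself only covers the case where all arguments are ASCII.

```rust
/// Sentinel meaning "delete this character". 0xFF is not a valid ASCII
/// byte, so it cannot collide with a real replacement.
const DELETE: u8 = 0xFF;

/// Hypothetical helper: build a 128-entry table indexed by ASCII byte.
/// Returns `None` if `from` or `to` contains non-ASCII characters, in
/// which case a caller would fall back to the general (hash map) path.
fn build_ascii_table(from: &str, to: &str) -> Option<[u8; 128]> {
    if !from.is_ascii() || !to.is_ascii() {
        return None;
    }
    // Start with the identity mapping: every byte maps to itself.
    let mut table = [0u8; 128];
    for (i, slot) in table.iter_mut().enumerate() {
        *slot = i as u8;
    }
    let mut seen = [false; 128];
    let to = to.as_bytes();
    for (i, &f) in from.as_bytes().iter().enumerate() {
        let idx = f as usize;
        if seen[idx] {
            continue; // the first occurrence in `from` wins
        }
        seen[idx] = true;
        // Characters in `from` with no counterpart in `to` are deleted.
        table[idx] = to.get(i).copied().unwrap_or(DELETE);
    }
    Some(table)
}

/// Apply the table to one input string with a direct array index
/// instead of a hash lookup.
fn translate_ascii(input: &str, table: &[u8; 128]) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        if c.is_ascii() {
            let r = table[c as usize];
            if r != DELETE {
                out.push(r as char);
            }
        } else {
            // Cannot appear in an ASCII-only `from`, so keep unchanged.
            out.push(c);
        }
    }
    out
}
```

The table is 128 bytes, so it fits in a couple of cache lines, and it can be combined with the constant-argument case by building it once per batch.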

Describe alternatives you've considered

No response

Additional context

No response

Labels

enhancement (New feature or request)
