Adding support for Row Id mapping for composite merge#20608
Adding support for Row Id mapping for composite merge#20608darjisagar7 wants to merge 1 commit intoopensearch-project:feature/datafusionfrom
Conversation
Signed-off-by: Sagar Darji <darjisagar7@gmail.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing touches🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
❌ Gradle check result for f2560c5: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
| } | ||
|
|
||
| // Create mappings for this file | ||
| for old_row_id in 0..file_rows as i64 { |
There was a problem hiding this comment.
Here we are creating RowIdMappingData struct for every row, for a merge of files totaling N rows, this will be 3N Java heap allocations triggered from Rust through JNI + 4N JNI calls from create_row_id_mapping_object
Instead, since the mapping is just new_row_id = file_start_offset + old_row_id, we can pass only per-file segment metadata (file ID, start offset, row count) across JNI and materialize the full mapping in Java.
Description
This PR will create the RowIdMapping which will be used for the composite merges
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.