Automatic duplicate removal #16

Open
patrickbr wants to merge 5 commits into master from duplicate-removal

patrickbr commented Feb 13, 2026

With this PR, spatialjoin automatically detects duplicate geometry parts whose size (number of anchor points) is above a threshold. Detection also works across multi geometries. Each duplicate is replaced by a reference to the original geometry. Duplicate removal takes a single pass over the event list (sorted by left x coordinate) and checks for duplicates within blocks of equal x coordinates.
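The block-wise check can be illustrated with a minimal Python sketch. It is not the spatialjoin implementation (which is C++): the function name `find_duplicates`, the `min_points` parameter, and the use of exact tuple equality as the duplicate test are all assumptions made for illustration. The idea it shows is that two identical parts necessarily share the same left x coordinate, so only events within one block of equal x need to be compared.

```python
from itertools import groupby
from typing import Dict, Tuple

Point = Tuple[float, float]
Part = Tuple[Point, ...]


def find_duplicates(parts: Dict[str, Part], min_points: int = 3) -> Dict[str, str]:
    """Map each duplicate part id to the id of the first identical part.

    Hypothetical helper: `parts` maps an id to a non-empty tuple of anchor
    points; parts with fewer than `min_points` points are never deduplicated.
    """
    # One event per part, keyed by its left-most x coordinate, then sorted.
    events = sorted((min(x for x, _ in pts), pid) for pid, pts in parts.items())

    refs: Dict[str, str] = {}
    # Single pass over the sorted events, one block of equal x at a time.
    for _, block in groupby(events, key=lambda e: e[0]):
        seen: Dict[Part, str] = {}  # geometry -> id of its first occurrence
        for _, pid in block:
            pts = parts[pid]
            if len(pts) < min_points:
                continue  # below the size threshold, keep as-is
            if pts in seen:
                refs[pid] = seen[pts]  # replace by reference to the original
            else:
                seen[pts] = pid
    return refs
```

In this sketch, a part that appears twice under ids `"a"` and `"b"` yields `{"b": "a"}`, while small parts are left untouched even if they repeat.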

Missing: since references will now be added automatically, reference support for --within-dist and --de9im is needed before this can be merged.

Also changed along the way: previously, if at least one reference geometry was present, every geometry was first compared to itself as a duplicate to resolve potential references. This added a small but measurable time overhead. It is now replaced by special "self-check" events in the event list, which are added for geometries that are referenced somewhere and which trigger such a self-check.
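A rough Python sketch of the self-check idea follows. The event layout `(x, kind, id)`, the kind labels `"in"` and `"self"`, and the function name `build_events` are hypothetical and not taken from the spatialjoin code base. It only shows the shape of the change: self-check events are emitted for referenced geometries rather than comparing every geometry to itself.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str, str]  # (left x, kind, geometry id), assumed layout


def build_events(left_x: Dict[str, float], refs: Dict[str, str]) -> List[Event]:
    """Build a sorted event list with self-check events for referenced ids.

    `refs` maps a referencing geometry id to the id it points at.
    """
    referenced = set(refs.values())
    events: List[Event] = []
    for gid, x in left_x.items():
        events.append((x, "in", gid))
        if gid in referenced:
            # Only geometries that are referenced somewhere get a self-check.
            events.append((x, "self", gid))
    events.sort()
    return events
```

With three geometries and a single reference from `"b"` to `"a"`, this produces four events, exactly one of which is a self-check (for `"a"`), instead of three unconditional self-comparisons.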
