feat: impl jaccardSimilarity for improved cross-post detection #15

Merged
hmd-ali merged 1 commit into r-webdev:main from LankyMoose:feat/improved-crosspost-detection
Feb 4, 2026

Conversation

@LankyMoose
Contributor

Improves detection of cross-posted messages that differ only slightly from one another. The sensitivity is adjustable via MESSAGE_SIMILARITY_THRESHOLD.
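For readers unfamiliar with the metric, here is a minimal sketch of word-level Jaccard similarity (size of the token intersection divided by size of the token union). It is not the code merged in this PR: the `tokenize` and `isLikelyCrossPost` helpers, the ASCII-only tokenizer, the handling of empty messages, and the `0.8` default for `MESSAGE_SIMILARITY_THRESHOLD` are all assumptions made for illustration.

```typescript
// Illustrative sketch only; the PR's actual implementation may differ.

// Assumed default; the real value of MESSAGE_SIMILARITY_THRESHOLD is not stated in the PR.
const MESSAGE_SIMILARITY_THRESHOLD = 0.8;

// Hypothetical helper: lowercase the message and keep runs of ASCII letters/digits.
// Non-ASCII text is dropped by this simple tokenizer.
function tokenize(message: string): Set<string> {
  const words = message.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set<string>(words);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, a value between 0 and 1.
function jaccardSimilarity(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);

  // The ratio is undefined when both sets are empty; this sketch returns 0
  // so that token-less messages are never treated as duplicates.
  if (setA.size === 0 && setB.size === 0) return 0;

  let intersection = 0;
  setA.forEach((token) => {
    if (setB.has(token)) intersection++;
  });

  const union = setA.size + setB.size - intersection;
  return intersection / union;
}

// Hypothetical helper: flag two messages as a likely cross-post.
function isLikelyCrossPost(
  a: string,
  b: string,
  threshold: number = MESSAGE_SIMILARITY_THRESHOLD,
): boolean {
  return jaccardSimilarity(a, b) >= threshold;
}
```

With this sketch, `"a b c d"` and `"a b c e"` share 3 of 5 distinct tokens, giving a similarity of 0.6. Raising the threshold makes detection stricter, and lowering it catches looser rewordings at the cost of more false positives.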

@LankyMoose
Contributor Author

LankyMoose commented Feb 4, 2026

A follow-up on this: I was originally going to use Levenshtein distance, but figured it might be too perf-heavy for large messages. If the current implementation works well but misses near-duplicates in shorter messages, we have ways to improve it.
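To illustrate the performance concern, here is a minimal sketch of the standard dynamic-programming Levenshtein distance. It is not part of this PR, and the function name is an assumption. The nested loops do work proportional to `a.length * b.length`, so comparing two 2,000-character messages costs roughly four million cell updates, whereas a token-set comparison scales roughly linearly with message length.

```typescript
// Illustrative sketch only; not code from this PR.
// Two-row dynamic programming: O(a.length * b.length) time, O(b.length) memory.
function levenshteinDistance(a: string, b: string): number {
  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    // Transforming the first i chars of a into an empty string takes i deletions.
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1, // deletion
          curr[j - 1] + 1, // insertion
          prev[j - 1] + cost, // substitution (or match)
        ),
      );
    }
    prev = curr;
  }
  return prev[b.length];
}
```

One possible refinement, if short messages turn out to be missed, would be to apply an edit-distance check like this only below some length cutoff, where the quadratic cost stays small.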

@hmd-ali
Contributor

hmd-ali left a comment

Looks good

@hmd-ali hmd-ali merged commit ef5cacf into r-webdev:main Feb 4, 2026
1 check passed

2 participants