@gaurav gaurav commented Feb 28, 2025

We don't currently handle curly (start/end) quotes properly (i.e. “ ” and ‘ ’), because in the database these are usually encoded as plain quotes (" and '). This PR rewrites the query string to use plain quotes in place of curly ones.
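As a rough illustration of the idea (not the code in this PR), the substitution could look something like the sketch below. The names `SMART_QUOTES` and `normalize_quotes` are hypothetical, and the sketch assumes only the four curly-quote characters mentioned above need mapping.

```python
# Hypothetical sketch: map curly (start/end) quotes to the plain quotes
# that the database is assumed to store.
SMART_QUOTES = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def normalize_quotes(query: str) -> str:
    """Return the query with curly quotes replaced by plain quotes."""
    return query.translate(SMART_QUOTES)
```

With this, a query such as `don’t` (typed with a curly apostrophe) would be searched as `don't`.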

To test this, I had to add clique_identifier_count to the test data, which is how I found that an entry with clique_identifier_count=1 always got a zero score (because log(1) = 0). We now add one to clique_identifier_count to fix this, so the logarithm is no longer zero for such entries.
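The arithmetic behind this can be sketched as follows. This is only an illustration: the function names are hypothetical, the natural logarithm is an assumption, and how the real scoring combines this term with the rest of the query score is not shown here.

```python
import math


def count_term_before(clique_identifier_count: int) -> float:
    """Hypothetical old term: log of the raw count, which is 0 when the count is 1."""
    return math.log(clique_identifier_count)


def count_term_after(clique_identifier_count: int) -> float:
    """Hypothetical new term: log of (count + 1), which is positive for any count >= 1."""
    return math.log(clique_identifier_count + 1)
```

Under these assumptions, an entry with clique_identifier_count=1 goes from a term of log(1) = 0 to log(2), so it is no longer zeroed out if the term is used as a multiplier.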

Closes #176.

@gaurav gaurav changed the base branch from master to prioritize-exact-matches July 24, 2025 23:24
Base automatically changed from prioritize-exact-matches to master August 25, 2025 21:29
@gaurav gaurav changed the title from "Fix start-end quote bug" to "Fix start-end quotes bug and filtering out clique_identifier_count=1 records" Aug 25, 2025
@gaurav gaurav merged commit 007b659 into master Aug 25, 2025
1 check passed
@gaurav gaurav deleted the fix-windows-smart-quote branch August 25, 2025 22:40
Development

Successfully merging this pull request may close these issues.

Queries containing don't work correctly
