Conversation


@Meow Meow commented Sep 25, 2025

Original PR: #401

This PR replaces the old "image intensities" reverse image search, and has come about due to the confluence of several key factors within the past year:

  • Computer vision utilities like those in PyTorch have become more accessible than ever, with native language bindings like tch-rs removing the need for a Python server
  • The self-distillation vision transformers DINOv2 and DINOv2 with registers have been released, which come with pretrained weights that extract semantic features from images without the need for a fine-tuned head. The authors claim these models extract robust features for any type of downstream task as-is. I believe they are underselling how good they are, and found the recall to be excellent during model selection.
  • The OpenSearch project has released the k-NN plugin, which enables nearest neighbor search over dense vectors, like the kind representing the CLS token of a ViT.

Together, these factors are used to implement a reverse image search system that uses semantic meaning in the images to identify them, rather than their overall appearance. To establish what is meant by this, here are some examples of an original image and matches found when executing on Derpibooru:

| Demo | Result |
| --- | --- |
| Line art | 2025-01-12 18-41-24 |
| Hamburger | 2025-01-12 18-41-48 |
| Trixie | 2025-01-12 18-42-25 |
| Scenery | 2025-01-12 18-44-24 |

That DINOv2 extracts semantic features can be seen in attention maps generated for these images. The code to generate these attention maps can be found in this repository. They have been reprocessed at a higher scale for visibility:

| Scaled original | Attention map |
| --- | --- |
| 442297 | 442297_attention |
| 1110529 | 1110529_attention |
| 1188964 | 1188964_attention |
| 3515313 | 3515313_attention |

The system works as follows:

  1. The image or video is previewed into a raw RGB bitmap
  2. The bitmap is resampled to the model's target dimensions
  3. The classification (CLS) vector is retrieved from the model
  4. The classification vector is normalized, which converts the k-NN search into one ordered by cosine similarity, and delivered back to the application
  5. For indexing, the normalized vector is stored as a nested field in the image search index; for search, the nearest neighbors are retrieved using an HNSW index
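The normalization in step 4 can be illustrated with a minimal sketch (plain Python with made-up 4-dimensional vectors; real DINOv2 embeddings are hundreds of dimensions): after L2-normalization, a plain inner product between vectors equals the cosine similarity of the originals, so a k-NN index ordered by inner product (or by Euclidean distance, which is monotonic in it for unit vectors) returns neighbors in cosine order.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical stand-ins for CLS vectors of a query image and an indexed image.
query = [0.5, -1.0, 2.0, 0.25]
doc = [0.4, -0.9, 1.8, 0.3]

qn, dn = l2_normalize(query), l2_normalize(doc)
inner = sum(x * y for x, y in zip(qn, dn))

# The inner product of the normalized vectors matches the cosine similarity
# of the originals, since cosine similarity is invariant under scaling.
assert abs(inner - cosine_similarity(query, doc)) < 1e-9
```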

Indexing the classification vector using a nested field allows for the possibility of extracting multiple vectors from each image, and the database table has been set up to allow this should it be desired in the future.
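As a sketch of what such a mapping could look like (field names, dimension, and engine choice here are illustrative assumptions, not the actual mapping from this PR; since the stored vectors are unit-normalized, L2-distance ordering coincides with cosine-similarity ordering):

```json
{
  "mappings": {
    "properties": {
      "image_vectors": {
        "type": "nested",
        "properties": {
          "type": { "type": "keyword" },
          "vector": {
            "type": "knn_vector",
            "dimension": 768,
            "method": {
              "name": "hnsw",
              "engine": "faiss",
              "space_type": "l2"
            }
          }
        }
      }
    }
  }
}
```

The extra `type` keyword field is one hypothetical way to distinguish multiple vectors per image (e.g. full image vs. crops) should that be desired later.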

I have pre-computed the DINOv2 with registers features for ~3.5M images on Derpibooru, ~400K images on Furbooru, and ~35K images on Tantabus. Batch inference was run on a 3060 Ti using code from this repository, with the entire process heavily bottlenecked by memory copy bandwidth and image decode performance rather than the GPU execution itself. However, the inference code is efficient enough to run on a CPU in less than 0.5 seconds per image, and this is what is implemented in the repository (with the expectation that there will be no GPU requirement on the server).

This PR must not be merged until OpenSearch releases version 2.19, as 2.18 contains a critical bug that prevents the system from working in all cases. Other bugs relating to filtering may or may not also be fixed in the 2.19 release, but have been worked around for now.

Meow: we're on OpenSearch 3.2.0 now

This PR must also not be merged until its dependents #389 and #400 are merged.

Meow: these are merged now

Fixes #331 (method outdated)


Meow commented Sep 25, 2025

It appears the reverse search no longer functions correctly when trying to search for an image using its thumbnail as the reverse-search image:

  • image vectors are not created upon image creation
  • it's possible to create them via `Philomena.ImageVectors.BatchProcessor.all_missing("full", batch_size: 32)`
  • but even if they're created, reverse-search does not work, and even providing the original image to the reverse searcher again doesn't appear to produce any results

@liamwhite liamwhite marked this pull request as draft September 25, 2025 14:21
@liamwhite

I think this should not be merged until opensearch-project/k-NN#2222 is properly addressed


liamwhite commented Sep 25, 2025

Also, it'd be good to implement a rudimentary form of batching (merge together up to 8 requests or on a 100ms timer?) to improve the efficiency of performing evaluations
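A minimal sketch of such a size-or-deadline micro-batcher (plain Python with placeholder names; the "up to 8 requests or 100 ms" thresholds come from the comment above, and nothing here is actual Philomena code):

```python
import time

class MicroBatcher:
    """Flush when a batch reaches max_size items, or when max_wait seconds
    have passed since the first pending item arrived (checked via poll())."""

    def __init__(self, max_size=8, max_wait=0.1, clock=time.monotonic):
        self.max_size = max_size
        self.max_wait = max_wait
        self.clock = clock       # injectable for testing
        self.pending = []
        self.first_at = None

    def submit(self, request):
        """Queue a request; return a full batch if the size threshold is hit."""
        if not self.pending:
            self.first_at = self.clock()
        self.pending.append(request)
        if len(self.pending) >= self.max_size:
            return self.flush()
        return None

    def poll(self):
        """Called from a timer; return the batch if the deadline has elapsed."""
        if self.pending and self.clock() - self.first_at >= self.max_wait:
            return self.flush()
        return None

    def flush(self):
        batch, self.pending, self.first_at = self.pending, [], None
        return batch
```

In the real system, each flushed batch would be fed to the model in a single forward pass, which amortizes per-call overhead when requests arrive close together.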


Meow commented Sep 25, 2025

> I think this should not be merged until opensearch-project/k-NN#2222 is properly addressed

Right, I kind of assumed this would have been fixed by them by now. Wow, they're slow.

Are you sure it's wise to wait and not just do it? This issue seems to have almost zero traction or movement; I'm not convinced they'll fix it in any reasonably timely manner, and it'd be a shame to potentially hold a feature for months or even years because some other maintainers have other priorities.

> Also, it'd be good to implement a rudimentary form of batching (merge together up to 8 requests or on a 100ms timer?) to improve the efficiency of performing evaluations

Maybe. I'm not convinced that condition would ever be met, though, save for some sort of attack. But in that case, I'd argue a 1-second window is wiser. People already wait several seconds for image upload; what's one second more?

@liamwhite

> Are you sure it's wise to wait and not just do it?

Yes because the reverse search feature will straight up not work properly without it being fixed


Meow commented Sep 25, 2025

> > Are you sure it's wise to wait and not just do it?
>
> Yes because the reverse search feature will straight up not work properly without it being fixed

Did you not have a workaround? You mentioned it in the original PR.

@liamwhite

The workaround causes incomplete results when a separate filter is applied. A previous bug (which I am not yet sure has been fixed) caused the search engine to segfault when rebalancing vector documents between shards.


Development

Successfully merging this pull request may close these issues.

Reverse search improvement: store non-transparent intensities of transparent images