test: add encode_vision tests for Embedder module #589

Open
stanley1208 wants to merge 1 commit into google-deepmind:main from stanley1208:test/encode-vision-test


Summary

Resolves TODO(mblondel): Add tests for encode_vision here in _modules_test.py.

The Embedder.encode_vision() method projects SigLIP vision embeddings into the text embedding space via RMSNorm followed by an Einsum projection. This method was previously untested.
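
For readers unfamiliar with the projection, here is a minimal NumPy sketch of the RMSNorm + Einsum pattern described above. It is not the Embedder implementation: the function names, the `(1 + scale)` convention, the epsilon value, and the einsum string are all assumptions made for illustration.

```python
import numpy as np


def rms_norm(x, scale, eps=1e-6):
    # Divide each vector by its root-mean-square over the last axis.
    # The (1 + scale) factor and eps value are assumptions, not taken from the PR.
    var = np.mean(np.square(x), axis=-1, keepdims=True)
    return x / np.sqrt(var + eps) * (1.0 + scale)


def encode_vision_sketch(x, scale, w):
    # Hypothetical stand-in for Embedder.encode_vision().
    # x: [B, num_patches, vision_proj_dim]
    # w: [vision_proj_dim, embed_dim]
    normed = rms_norm(x, scale)
    # Contract the vision feature axis (m) against the projection weights.
    return np.einsum("...tm,md->...td", normed, w)


rng = np.random.default_rng(0)
batch, num_patches, vision_proj_dim, embed_dim = 2, 4, 8, 16  # illustrative sizes
x = rng.normal(size=(batch, num_patches, vision_proj_dim))
scale = np.zeros(vision_proj_dim)
w = rng.normal(size=(vision_proj_dim, embed_dim))

out = encode_vision_sketch(x, scale, w)
print(out.shape)  # (2, 4, 16)
```

Only the last axis changes, from vision_proj_dim to embed_dim, which is the property the new tests check.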

Tests added (2 tests in _modules_test.py)

  • test_encode_vision_output_shape: verifies projection from [B, num_patches, vision_proj_dim] to [B, num_patches, embed_dim]
  • test_encode_vision_different_batch_shapes: verifies correct output across batch sizes, with single and multiple patches
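
The shape checks listed above could be sketched roughly as follows. This runs against a NumPy stand-in rather than the real Embedder module, so `project`, `output_shapes`, and the dimension constants are hypothetical names and values chosen for illustration.

```python
import numpy as np

VISION_PROJ_DIM = 8  # hypothetical size, not taken from the PR
EMBED_DIM = 16       # hypothetical size, not taken from the PR


def project(x, w, eps=1e-6):
    # Stand-in for the vision projection: RMS-normalize, then project.
    var = np.mean(np.square(x), axis=-1, keepdims=True)
    return np.einsum("...tm,md->...td", x / np.sqrt(var + eps), w)


def output_shapes(cases):
    # Run the stand-in on each (batch, num_patches) pair and collect shapes.
    w = np.ones((VISION_PROJ_DIM, EMBED_DIM))
    shapes = []
    for batch, num_patches in cases:
        x = np.ones((batch, num_patches, VISION_PROJ_DIM))
        shapes.append(project(x, w).shape)
    return shapes


# Single and multiple patches, with batch sizes of 1 and 3.
cases = [(1, 1), (1, 4), (3, 1), (3, 4)]
shapes = output_shapes(cases)
for (batch, num_patches), shape in zip(cases, shapes):
    assert shape == (batch, num_patches, EMBED_DIM)
```

Looping over explicit (batch, num_patches) pairs keeps the single-patch edge case visible next to the multi-patch cases.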

Test plan

  • Both new tests pass
  • All existing _modules_test.py tests unaffected

Resolves TODO(mblondel) requesting tests for encode_vision. Adds 2 tests verifying the vision projection from vision_proj_dim to embed_dim with various batch and patch dimensions.