Skip to content

Conversation

@ajcvickers
Copy link
Contributor

Note that I have not been able to test this against a server that fully supports auto-embedding. The version I have access to throws for a variety of cases, has no available embedding models, and index creation never completes.

Note that I have not been able to test this against a server that fully supports auto-embedding. The version I have access to throws for a variety of cases, has no available embedding models, and index creation never completes.
@ajcvickers ajcvickers requested review from BorisDog and Copilot January 6, 2026 14:04
@ajcvickers ajcvickers requested a review from a team as a code owner January 6, 2026 14:04
@ajcvickers ajcvickers added the feature Adds new user-facing functionality. label Jan 6, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for auto-embedding in vector search indexes, allowing MongoDB Atlas to automatically create vector embeddings from text fields using specified embedding models (e.g., "voyage-4"). The implementation includes new API constructors, query vector handling for text input, and comprehensive test coverage.

Key changes:

  • New VectorEmbeddingModality enum to specify the type of data being embedded
  • QueryVector now supports text strings for auto-embedding indexes
  • CreateVectorSearchIndexModel has new constructors for auto-embedding indexes with support for compression profiles and explicit compression settings

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
VectorEmbeddingModality.cs New enum defining modality types for auto-embedding (currently only Text)
QueryVector.cs Added string constructor and implicit operator for text-based queries
PipelineStageDefinitionBuilder.cs Modified to use "query" field for text vs "queryVector" for arrays
CreateVectorSearchIndexModel.cs Added auto-embedding constructors, compression profile support, and updated rendering logic
VectorSearchTests.cs Added test for auto-embedding vector search with new Movie class
AtlasSearchIndexManagmentTests.cs Added comprehensive tests for auto-embedding index creation with various options and renamed existing tests for consistency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@BorisDog BorisDog requested a review from adelinowona January 6, 2026 16:58
…elease.

- Changing QueryVector to take string instead of BsonString
- Added model to vector query options
- Fixed lookup for index status on community server
- Added more query tests
- Changed query tests to run on small set of documents that can be queried in a reasonable amount of time.
@ajcvickers ajcvickers requested a review from BorisDog January 8, 2026 16:21
Copy link
Contributor

@BorisDog BorisDog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor comments

@ajcvickers ajcvickers requested a review from BorisDog January 9, 2026 09:47
Copy link
Contributor

@BorisDog BorisDog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some failing unittests.

}

/// <inheritdoc/>
internal override BsonDocument Render(RenderArgs<TDocument> renderArgs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion for CreateSearchIndexModel:
Should we introduce
BsonDocument Render(RenderArgs<TDocument> renderArgs) => Definition
for more uniform handling in CreateCreateIndexesOperation?

Also now that Render is internal, the exception in CreateSearchIndexModel.Definition is not accurate. Is it possible to override that property and throw there? I think making a property is acceptable as breaking change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you refer back to the comments that Robert had on the previous PR, and see if you are okay with the breaking changes that were rejected there?
#1769
#1795

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. Is it possible to override that property and throw there?

From: #1795 (review)

If making this property virtual is a breaking change, we could just let it return null in the base class.

Copy link
Contributor Author

@ajcvickers ajcvickers Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, anything that breaks what the property currently does--like throwing and expecting people (almost certainly just us) to call Render instead for current cases that work with the old type.

My preference historically would be to take small binary breaks to things like this that are primarily called by the driver, in order to keep internal quality high, and to create a coherent design and experience. But I was under the impression that we didn't want to take binary breaks because of the impact of third-party libraries that have not been rebuilt against latest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property virtual is a breaking change
I think this is something we can tolerate, I don't see how this can be breaking in practice.

Also my other suggestion was having Render method in CreateSearchIndexModel
BsonDocument Render(RenderArgs<TDocument> renderArgs) => Definition, for consistency.

Both suggestions for your consideration.

]);

_collection.SearchIndexes.CreateOne(new CreateAutoEmbeddingVectorSearchIndexModel<Movie>(
e => e.Plot, _autoEmbedIndexName, "voyage-4", filterFields: [e => e.Runtime, e => e.Year]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we introduce a temporary expectation for a failure here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean use an ifdef to disable the tests when auto-embedding is not available? Or something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the auto-embedding is not available yet and the tests fail:
I thought to implement your strategy of checking for the test failure instead of disabling the test.

Copy link
Contributor Author

@ajcvickers ajcvickers Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That strategy doesn't work well when the tests are actually passing against a compatible server. Because it means the tests checked in no longer test that the working server works, they just test that the failing server fails. I think if we're going to implement and release without having a C.I. server to test against, then the C.I. should just fail, and keep failing until such a server is available. This reflects the reality of the testing situation in which we are releasing, and means if we don't want the C.I. to keep failing, then we must do something to properly fix it, rather than excluding more tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to see it fixed properly and have the CI green. But if the time doesn't allow we can't introduce constantly failing tests in this PR, so I'd prefer the disabling option.
In any case this discrepancy in server behavior is temporary.

Copy link
Contributor

@BorisDog BorisDog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM.
Need to finalize the tests strategy.

]);

_collection.SearchIndexes.CreateOne(new CreateAutoEmbeddingVectorSearchIndexModel<Movie>(
e => e.Plot, _autoEmbedIndexName, "voyage-4", filterFields: [e => e.Runtime, e => e.Year]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the auto-embedding is not available yet and the tests fail:
I thought to implement your strategy of checking for the test failure instead of disabling the test.

Copy link
Contributor

@BorisDog BorisDog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please notice that there are also failing unittests.

}

/// <inheritdoc/>
internal override BsonDocument Render(RenderArgs<TDocument> renderArgs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property virtual is a breaking change
I think this is something we can tolerate, I don't see how this can be breaking in practice.

Also my other suggestion was having Render method in CreateSearchIndexModel
BsonDocument Render(RenderArgs<TDocument> renderArgs) => Definition, for consistency.

Both suggestions for your consideration.

]);

_collection.SearchIndexes.CreateOne(new CreateAutoEmbeddingVectorSearchIndexModel<Movie>(
e => e.Plot, _autoEmbedIndexName, "voyage-4", filterFields: [e => e.Runtime, e => e.Year]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to see it fixed properly and have the CI green. But if the time doesn't allow we can't introduce constantly failing tests in this PR, so I'd prefer the disabling option.
In any case this discrepancy in server behavior is temporary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature Adds new user-facing functionality.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants