Python: Adding MongoDB Atlas Vector Search Connector (#2818)
### Motivation and Context

Resolves: #2591

### Description

* Added the `MongoDBAtlasMemoryStore`, a memory store abstraction that connects the Microsoft Semantic Kernel to a MongoDB Atlas cluster so it can run [Atlas Vector Search](https://www.mongodb.com/products/platform/atlas-vector-search).
* Leverages our async Python driver [motor](https://motor.readthedocs.io/en/stable/) to keep with the convention of asynchronous code.

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

### Callout

* When writing my test, I noticed that other tests habitually wrap their imports in a `try`. Grepping through those implementations, I see some of them [encapsulate their respective libraries in function calls rather than doing direct imports in the module.](https://github.com/microsoft/semantic-kernel/blob/main/python/semantic_kernel/connectors/memory/chroma/chroma_memory_store.py#L55-L63) Is this the general code pattern we should follow, or is it fine to have our imports at the top of the module?
---------

Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
Co-authored-by: Abby Harrison <54643756+awharrison-28@users.noreply.github.com>
Co-authored-by: Abby Harrison <abby.harrison@microsoft.com>
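
The import question raised in the callout can be made concrete with a generic sketch of the deferred-import pattern: the optional dependency is imported inside a helper on first use, so a missing package only fails when the connector is actually needed. The names `require_optional` and `LazyStore` are hypothetical and exist only for illustration; this is not the Semantic Kernel implementation.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency on first use instead of at module import time."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message; the original error stays chained.
        raise ImportError(
            f"'{module_name}' is required for this connector. {install_hint}"
        ) from exc


class LazyStore:
    """Hypothetical store that loads its driver only when asked to."""

    def __init__(self, driver_module: str = "motor.motor_asyncio") -> None:
        # Only the module name is stored here; nothing is imported yet.
        self._driver_module = driver_module

    def load_driver(self) -> ModuleType:
        return require_optional(self._driver_module, "Try: pip install motor")
```

The trade-off is that top-level imports fail fast and are easier for tooling to follow, while deferred imports keep the package importable when an optional driver is not installed.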
1 parent 320e72b, commit 8ce66fe

Showing 12 changed files with 866 additions and 4 deletions.
52 changes: 52 additions & 0 deletions
python/semantic_kernel/connectors/memory/mongodb_atlas/README.md
@@ -0,0 +1,52 @@
# microsoft.semantic_kernel.connectors.memory.mongodb_atlas

This connector uses [MongoDB Atlas Vector Search](https://www.mongodb.com/products/platform/atlas-vector-search) to implement Semantic Memory.

## Quick Start

1. Create an [Atlas cluster](https://www.mongodb.com/docs/atlas/getting-started/)

2. Create a collection

3. Create a [Vector Search Index](https://www.mongodb.com/docs/atlas/atlas-search/field-types/knn-vector/) for the collection.
The index has to be defined on a field called ```embedding```. For example:
```
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "dimensions": 1024,
        "similarity": "cosine",
        "type": "knnVector"
      }
    }
  }
}
```

4. Create the MongoDB memory store
```python
import semantic_kernel as sk
import semantic_kernel.connectors.ai.open_ai
from semantic_kernel.connectors.memory.mongodb_atlas import (
    MongoDBAtlasMemoryStore
)

kernel = sk.Kernel()

...

kernel.register_memory_store(memory_store=MongoDBAtlasMemoryStore(
    # connection_string: if not provided, it is pulled from .env
))
...

```
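
The fallback described in the comment above (use an explicit connection string, otherwise read it from configuration) can be sketched as a small helper. This is an illustration only: the variable name `MONGODB_ATLAS_CONNECTION_STRING` and the function `resolve_connection_string` are assumptions, and the sketch reads from process environment variables rather than parsing a `.env` file.

```python
import os
from typing import Mapping, Optional

# Assumed variable name for illustration; check the connector's settings for the real one.
ENV_VAR = "MONGODB_ATLAS_CONNECTION_STRING"


def resolve_connection_string(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Prefer an explicitly passed connection string, else fall back to the environment."""
    if explicit:
        return explicit
    value = env.get(ENV_VAR)
    if not value:
        raise ValueError(
            f"No connection string given and {ENV_VAR} is not set in the environment."
        )
    return value
```

The resolved value could then be passed as `MongoDBAtlasMemoryStore(connection_string=...)`, assuming the constructor accepts that keyword as the comment in the snippet above suggests.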

## Important Notes

### Vector search indexes
In this version, vector search index management is outside the scope of ```MongoDBAtlasMemoryStore```.
Creating and maintaining the indexes is the user's responsibility. Please note that deleting a collection
(```memory_store.delete_collection_async```) will delete the index as well.
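
Because index creation is left to the user, it can help to generate the index definition programmatically and paste the resulting JSON into Atlas. The sketch below is a minimal example under stated assumptions: `build_knn_index_definition` is a hypothetical helper, and it uses the `dimensions` key and the `euclidean`, `cosine`, and `dotProduct` similarity values as documented for the `knnVector` field type. It only builds the JSON and does not talk to a cluster.

```python
import json
from typing import Any, Dict

# Similarity functions documented for the knnVector field type.
VALID_SIMILARITIES = {"euclidean", "cosine", "dotProduct"}


def build_knn_index_definition(
    dimensions: int,
    similarity: str = "cosine",
    field: str = "embedding",
) -> Dict[str, Any]:
    """Return an Atlas Search index definition for a single knnVector field."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    if similarity not in VALID_SIMILARITIES:
        raise ValueError(f"similarity must be one of {sorted(VALID_SIMILARITIES)}")
    return {
        "mappings": {
            "dynamic": True,
            "fields": {
                field: {
                    "dimensions": dimensions,
                    "similarity": similarity,
                    "type": "knnVector",
                }
            },
        }
    }


if __name__ == "__main__":
    # Print a definition for 1024-dimensional embeddings.
    print(json.dumps(build_knn_index_definition(1024), indent=2))
```

The `dimensions` value has to match the output size of the embedding model in use, and the index has to be recreated after `delete_collection_async`, since deleting the collection removes it.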
5 changes: 5 additions & 0 deletions
python/semantic_kernel/connectors/memory/mongodb_atlas/__init__.py
@@ -0,0 +1,5 @@
from semantic_kernel.connectors.memory.mongodb_atlas.mongodb_atlas_memory_store import (
    MongoDBAtlasMemoryStore,
)

__all__ = ["MongoDBAtlasMemoryStore"]