Data source - Mongo DB Atlas

The following tables describe the configurable options for Mongo DB Atlas when using Azure OpenAI On Your Data. This data source is supported starting in API version 2024-08-01.

Name Type Required Description
parameters Parameters True The parameters to use when configuring Mongo DB Atlas.
type string True Must be mongo_db.

Parameters

Name Type Required Description
authentication object True The authentication options for Azure OpenAI On Your Data when using a username and a password.
app_name string True The name of the Mongo DB Atlas Application.
collection_name string True The name of the Mongo DB Atlas Collection.
database_name string True The name of the Mongo DB Atlas database.
endpoint string True The name of the Mongo DB Atlas cluster endpoint.
embedding_dependency One of DeploymentNameVectorizationSource, EndpointVectorizationSource True The embedding dependency for vector search.
fields_mapping object True Settings to control how fields are processed when using a configured Mongo DB Atlas resource.
index_name string True The name of the Mongo DB Atlas index.
top_n_documents integer False The configured number of top documents to feature for the query.
max_search_queries integer False The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send.
allow_partial_result boolean False If specified as true, the system allows partial search results to be used, and the request fails only if all the queries fail. If not specified, or specified as false, the request fails if any search query fails.
in_scope boolean False Whether queries should be restricted to the use of indexed data.
strictness integer False The configured strictness of the search relevance filtering, from 1 to 5. The higher the strictness, the higher the precision but the lower the recall of the answer.
include_contexts array False The included properties of the output context. If not specified, the default value is citations and intent. Valid properties are all_retrieved_documents, citations and intent.
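
The optional parameters in the table above carry a few constraints: strictness ranges from 1 to 5, and include_contexts accepts a fixed set of values. The following is a minimal sketch of checking those constraints on the client before sending a request. The helper name build_optional_parameters is hypothetical and not part of any SDK, and the service remains the source of truth for validation.

```python
from typing import Any, Dict, List, Optional

# Valid values for include_contexts, as listed in the table above.
VALID_CONTEXTS = {"all_retrieved_documents", "citations", "intent"}


def build_optional_parameters(
    top_n_documents: Optional[int] = None,
    max_search_queries: Optional[int] = None,
    allow_partial_result: Optional[bool] = None,
    in_scope: Optional[bool] = None,
    strictness: Optional[int] = None,
    include_contexts: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return only the optional parameters that were set."""
    if strictness is not None and not 1 <= strictness <= 5:
        raise ValueError("strictness must be between 1 and 5")
    if include_contexts is not None:
        unknown = set(include_contexts) - VALID_CONTEXTS
        if unknown:
            raise ValueError(f"unknown include_contexts values: {sorted(unknown)}")
    candidates = {
        "top_n_documents": top_n_documents,
        "max_search_queries": max_search_queries,
        "allow_partial_result": allow_partial_result,
        "in_scope": in_scope,
        "strictness": strictness,
        "include_contexts": include_contexts,
    }
    # Omit unset values so that the service applies its own defaults.
    return {name: value for name, value in candidates.items() if value is not None}


optional_parameters = build_optional_parameters(
    strictness=3,
    allow_partial_result=True,
    include_contexts=["citations", "intent"],
)
```

The resulting dictionary can be merged into the parameters object alongside the required properties.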

Authentication

The authentication options for Azure OpenAI On Your Data when using a username and a password.

Name Type Required Description
type string True Must be username_and_password.
username string True The username to use for authentication.
password string True The password to use for authentication.
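
As an illustration of the authentication object described above, the following sketch builds it from environment variables rather than hard-coded credentials. The helper name mongo_db_authentication and the environment variable names MongoDBUsername and MongoDBPassword are assumptions chosen for this example.

```python
import os
from typing import Dict, Mapping


def mongo_db_authentication(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Hypothetical helper: build the authentication object from environment variables."""
    # Fail early with a clear message if a credential is missing or empty.
    missing = [name for name in ("MongoDBUsername", "MongoDBPassword") if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "type": "username_and_password",
        "username": env["MongoDBUsername"],
        "password": env["MongoDBPassword"],
    }
```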

Deployment name vectorization source

The details of the vectorization source, used by Azure OpenAI On Your Data when applying vector search. This vectorization source is based on an internal embeddings model deployment name in the same Azure OpenAI resource. It enables you to use vector search without an Azure OpenAI api-key and without Azure OpenAI public network access.

Name Type Required Description
deployment_name string True The embedding model deployment name within the same Azure OpenAI resource.
type string True Must be deployment_name.
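
A minimal sketch of the embedding_dependency object for this source, built from the two properties in the table above. The helper name and the deployment name my-embedding-deployment are placeholders for illustration only.

```python
from typing import Dict


def deployment_name_vectorization_source(deployment_name: str) -> Dict[str, str]:
    """Hypothetical helper: build a deployment name vectorization source."""
    if not deployment_name:
        raise ValueError("deployment_name is required")
    return {
        "type": "deployment_name",
        "deployment_name": deployment_name,
    }


# Placeholder deployment name; use an embedding deployment in the same resource.
embedding_dependency = deployment_name_vectorization_source("my-embedding-deployment")
```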

Endpoint vectorization source

The details of the vectorization source, used by Azure OpenAI On Your Data when applying vector search. This vectorization source is based on the Azure OpenAI embedding API endpoint.

Name Type Required Description
endpoint string True Specifies the resource endpoint URL from which embeddings should be retrieved. It should be in the format https://{YOUR_RESOURCE_NAME}.openai.azure.com/openai/deployments/{YOUR_DEPLOYMENT_NAME}/embeddings. The api-version query parameter isn't allowed.
authentication ApiKeyAuthenticationOptions True Specifies the authentication options to use when retrieving embeddings from the specified endpoint.
type string True Must be endpoint.
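
A minimal sketch of the embedding_dependency object for this source. It rejects an endpoint that carries the api-version query parameter, since the table above states that it isn't allowed. The helper name is hypothetical, and the shape of the authentication object (type api_key with a key property) is an assumption about ApiKeyAuthenticationOptions, which this section does not define.

```python
from typing import Any, Dict
from urllib.parse import parse_qs, urlparse


def endpoint_vectorization_source(endpoint: str, api_key: str) -> Dict[str, Any]:
    """Hypothetical helper: build an endpoint vectorization source."""
    parsed = urlparse(endpoint)
    if parsed.scheme != "https":
        raise ValueError("endpoint must be an https URL")
    # The api-version query parameter isn't allowed on this endpoint.
    if "api-version" in parse_qs(parsed.query):
        raise ValueError("endpoint must not include the api-version query parameter")
    return {
        "type": "endpoint",
        "endpoint": endpoint,
        # Assumed shape of ApiKeyAuthenticationOptions.
        "authentication": {"type": "api_key", "key": api_key},
    }
```

For example, calling the helper with https://{YOUR_RESOURCE_NAME}.openai.azure.com/openai/deployments/{YOUR_DEPLOYMENT_NAME}/embeddings (with the placeholders filled in) returns a dictionary whose type is endpoint.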

Field mapping options

Settings to control how fields are processed when using a configured Mongo DB Atlas resource. The content_fields and vector_fields properties are required; the remaining properties are optional.

Name Type Required Description
content_fields string[] True The names of index fields that should be treated as content.
vector_fields string[] True The names of fields that represent vector data.
title_field string False The name of the index field to use as a title.
url_field string False The name of the index field to use as a URL.
filepath_field string False The name of the index field to use as a filepath.
content_fields_separator string False The separator pattern that content fields should use.
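
A minimal sketch of a fields_mapping object that sets the two required properties and some of the optional ones. The helper name build_fields_mapping and the field names content, contentvector, and title are placeholders; substitute the field names used in your own collection.

```python
from typing import Any, Dict, List, Optional


def build_fields_mapping(
    content_fields: List[str],
    vector_fields: List[str],
    title_field: Optional[str] = None,
    url_field: Optional[str] = None,
    filepath_field: Optional[str] = None,
    content_fields_separator: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the fields_mapping object."""
    if not content_fields:
        raise ValueError("content_fields must contain at least one field name")
    if not vector_fields:
        raise ValueError("vector_fields must contain at least one field name")
    mapping: Dict[str, Any] = {
        "content_fields": list(content_fields),
        "vector_fields": list(vector_fields),
    }
    optional = {
        "title_field": title_field,
        "url_field": url_field,
        "filepath_field": filepath_field,
        "content_fields_separator": content_fields_separator,
    }
    # Include only the optional properties that were provided.
    mapping.update({name: value for name, value in optional.items() if value is not None})
    return mapping


fields_mapping = build_fields_mapping(
    content_fields=["content"],
    vector_fields=["contentvector"],
    title_field="title",
    content_fields_separator="\n",
)
```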

Examples

Install the latest versions of the pip packages openai and azure-identity.

import os
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Azure OpenAI settings, read from environment variables
endpoint = os.environ.get("AzureOpenAIEndpoint")
deployment = os.environ.get("ChatCompletionsDeploymentName")
embedding_name = os.environ.get("EmbeddingName")
embedding_type = os.environ.get("EmbeddingType")

# Additional variables for Mongo DB Atlas
mongo_db_username = os.environ.get("MongoDBUsername")
mongo_db_password = os.environ.get("MongoDBPassword")
mongo_db_endpoint = os.environ.get("MongoDBEndpoint")
mongo_db_app_name = os.environ.get("MongoDBAppName")
mongo_db_database_name = os.environ.get("MongoDBName")
mongo_db_collection = os.environ.get("MongoDBCollection")
mongo_db_index = os.environ.get("MongoDBIndex")

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_endpoint=endpoint,
    azure_ad_token_provider=token_provider,
    api_version="2024-08-01-preview",  # the Mongo DB Atlas data source needs API version 2024-08-01 or later
)

completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "Who is DRI?",
        },
    ],
    extra_body={
        "data_sources": [
            {
                "type": "mongo_db",
                "parameters": {
                    "authentication": {
                        "type": "username_and_password",
                        "username": mongo_db_username,
                        "password": mongo_db_password,
                    },
                    "endpoint": mongo_db_endpoint,
                    "app_name": mongo_db_app_name,
                    "database_name": mongo_db_database_name,
                    "collection_name": mongo_db_collection,
                    "index_name": mongo_db_index,
                    # The deployment_name key assumes EmbeddingType is "deployment_name".
                    "embedding_dependency": {
                        "type": embedding_type,
                        "deployment_name": embedding_name,
                    },
                    "fields_mapping": {
                        "content_fields": ["content"],
                        "vector_fields": ["contentvector"],
                    },
                },
            }
        ],
    },
)

print(completion.model_dump_json(indent=2))
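
By default the output context includes citations and intent (see include_contexts). The following sketch pulls the citations out of a response that has been converted to a plain dictionary. It assumes the context object appears under choices[].message.context in the dumped JSON; the helper name extract_citations and the sample payload are illustrative, not actual service output.

```python
import json
from typing import Any, Dict, List


def extract_citations(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: collect citations from every choice in a response."""
    citations: List[Dict[str, Any]] = []
    for choice in response.get("choices", []):
        message = choice.get("message") or {}
        # Assumed location of the On Your Data context object.
        context = message.get("context") or {}
        citations.extend(context.get("citations") or [])
    return citations


# Illustrative payload only; a real response carries more properties.
sample = json.loads(
    '{"choices": [{"message": {"role": "assistant", "content": "An answer.",'
    ' "context": {"citations": [{"title": "doc1", "content": "text"}],'
    ' "intent": "[]"}}}]}'
)

for citation in extract_citations(sample):
    print(citation.get("title"), "-", citation.get("content"))
```

With the completion object from the example, the same helper could be called as extract_citations(json.loads(completion.model_dump_json())), assuming the dumped JSON retains the context object.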