This article provides details on the inference REST API endpoints for Azure OpenAI.
Managing and interacting with Azure OpenAI models and resources is divided across three primary API surfaces:
Each API surface/specification encapsulates a different set of Azure OpenAI capabilities. Each API has its own unique set of preview and stable/generally available (GA) API releases. Preview releases currently tend to follow a monthly cadence.
API | Latest preview release | Latest GA release | Specifications | Description |
---|---|---|---|---|
Control plane | 2024-06-01-preview | 2024-10-01 | Spec files | Azure OpenAI shares a common control plane with all other Azure AI Services. The control plane API is used for things like creating Azure OpenAI resources, model deployment, and other higher-level resource management tasks. The control plane also governs what is possible to do with capabilities like Azure Resource Manager, Bicep, Terraform, and Azure CLI. |
Data plane - authoring | 2025-03-01-preview | 2024-10-21 | Spec files | The data plane authoring API controls fine-tuning, file upload, ingestion jobs, batch, and certain model-level queries. |
Data plane - inference | 2025-03-01-preview | 2024-10-21 | Spec files | The data plane inference API provides the inference capabilities/endpoints for features like completions, chat completions, embeddings, speech/Whisper, on your data, DALL-E, assistants, and more. |
Azure OpenAI provides two methods for authentication. You can use either API keys or Microsoft Entra ID.

API key authentication: For this type of authentication, all API requests must include the API key in the api-key HTTP header. The Quickstart provides guidance for how to make calls with this type of authentication.

Microsoft Entra ID authentication: You can authenticate an API call using a Microsoft Entra token. Authentication tokens are included in a request as the Authorization header. The token provided must be preceded by Bearer, for example Bearer YOUR_AUTH_TOKEN. You can read our how-to guide on authenticating with Microsoft Entra ID.
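The two header shapes above can be sketched as a small helper. Everything here (the function name, the placeholder credential values) is illustrative, not part of the API surface:

```python
def build_headers(use_entra_id: bool, credential: str) -> dict:
    """Return the HTTP headers for an Azure OpenAI request using either auth method."""
    if use_entra_id:
        # Microsoft Entra ID: the token goes in the Authorization header,
        # preceded by "Bearer".
        return {"Authorization": f"Bearer {credential}",
                "Content-Type": "application/json"}
    # API key authentication: the key goes in the api-key header.
    return {"api-key": credential, "Content-Type": "application/json"}

print(build_headers(True, "YOUR_AUTH_TOKEN"))
print(build_headers(False, "YOUR_API_KEY"))
```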
The service APIs are versioned using the api-version query parameter. All versions follow the YYYY-MM-DD date structure. For example:
POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01
The rest of the article covers the latest preview release of the Azure OpenAI data plane inference specification, 2025-03-01-preview. This article includes documentation for the latest preview capabilities like assistants, threads, and vector stores. If you're looking for documentation on the latest GA API release, refer to the latest GA data plane inference API.
POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-03-01-preview
Creates a completion for the provided prompt, parameters, and chosen model.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
deployment-id | path | Yes | string | |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
best_of | integer | Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed. When used with n, best_of controls the number of candidate completions and n specifies how many to return; best_of must be greater than n. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop. | No | 1 |
echo | boolean | Echo back the prompt in addition to the completion. | No | False |
frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | No | 0 |
logit_bias | object | Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect varies per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated. | No | None |
logprobs | integer | Include the log probabilities on the logprobs most likely output tokens, as well as the chosen tokens. For example, if logprobs is 5, the API returns a list of the 5 most likely tokens. The API always returns the logprob of the sampled token, so there may be up to logprobs+1 elements in the response. The maximum value for logprobs is 5. | No | None |
max_tokens | integer | The maximum number of tokens that can be generated in the completion. The token count of your prompt plus max_tokens can't exceed the model's context length. | No | 16 |
n | integer | How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop. | No | 1 |
presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | No | 0 |
prompt | string or array | The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt isn't specified the model generates as if from the beginning of a new document. | Yes | |
seed | integer | If specified, our system makes a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism isn't guaranteed; refer to the system_fingerprint response parameter to monitor changes in the backend. | No | |
stop | string or array | Up to 4 sequences where the API stops generating further tokens. The returned text doesn't contain the stop sequence. | No | |
stream | boolean | Whether to stream back partial progress. If set, tokens are sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. | No | False |
suffix | string | The suffix that comes after a completion of inserted text. This parameter is only supported for gpt-3.5-turbo-instruct. | No | None |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
user | string | A unique identifier representing your end user, which can help to monitor and detect abuse. | No | |
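When stream is set to true, tokens arrive as data-only server-sent events terminated by a data: [DONE] message. A minimal sketch of consuming such a stream; the event payloads below are illustrative, not captured service output:

```python
import json

def collect_stream_text(lines):
    """Concatenate choice text from data-only server-sent event lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data: "):]
        if payload == "[DONE]":  # terminator defined by the API
            break
        chunk = json.loads(payload)
        text.append(chunk["choices"][0]["text"])
    return "".join(text)

events = [
    'data: {"choices": [{"index": 0, "text": "Hello"}]}',
    'data: {"choices": [{"index": 0, "text": " world"}]}',
    "data: [DONE]",
]
print(collect_stream_text(events))  # → Hello world
```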
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | createCompletionResponse |
Status Code: default
Description: Service unavailable
Content-Type | Type | Description |
---|---|---|
application/json | errorResponse |
Creates a completion for the provided prompt, parameters, and chosen model.
POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-03-01-preview
{
"prompt": [
"tell me a joke about mango"
],
"max_tokens": 32,
"temperature": 1.0,
"n": 1
}
Responses: Status Code: 200
{
"body": {
"id": "cmpl-7QmVI15qgYVllxK0FtxVGG6ywfzaq",
"created": 1686617332,
"choices": [
{
"text": "es\n\nWhat do you call a mango who's in charge?\n\nThe head mango.",
"index": 0,
"finish_reason": "stop",
"logprobs": null
}
],
"usage": {
"completion_tokens": 20,
"prompt_tokens": 6,
"total_tokens": 26
}
}
}
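The request/response pair above can be reproduced with only the Python standard library. The resource name ("aoairesource") and deployment name ("gpt-35-turbo-instruct") below are placeholder assumptions; substitute your own values:

```python
import json
import urllib.request

# Placeholder values; replace with your own resource and deployment names.
ENDPOINT = "https://aoairesource.openai.azure.com"
DEPLOYMENT = "gpt-35-turbo-instruct"
API_VERSION = "2025-03-01-preview"

url = (f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}"
       f"/completions?api-version={API_VERSION}")
body = {"prompt": ["tell me a joke about mango"],
        "max_tokens": 32,
        "temperature": 1.0,
        "n": 1}

req = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={"api-key": "YOUR_API_KEY", "Content-Type": "application/json"},
    method="POST",
)
# Sending requires a live resource and a valid key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["text"])
print(url)
```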
POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-03-01-preview
Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
deployment-id | path | Yes | string | The deployment ID of the model that was deployed. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
dimensions | integer | The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. | No | |
encoding_format | string | The format to return the embeddings in. Can be either float or base64. Defaults to float. | No | |
input | string or array | Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (8192 tokens for text-embedding-ada-002), cannot be an empty string, and any array must be 2048 dimensions or less. | Yes | |
input_type | string | Input type of embedding search to use. | No | |
user | string | A unique identifier representing your end user, which can help to monitor and detect abuse. | No | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | object |
Return the embeddings for a given prompt.
POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-03-01-preview
{
"input": [
"this is a test"
]
}
Responses: Status Code: 200
{
"body": {
"data": [
{
"index": 0,
"embedding": [
-0.012838088,
-0.007421397,
-0.017617522,
-0.028278312,
-0.018666342,
0.01737855,
-0.01821495,
-0.006950092,
-0.009937238,
-0.038580645,
0.010674067,
0.02412286,
-0.013647936,
0.013189907,
0.0021125758,
0.012406612,
0.020790534,
0.00074595667,
0.008397198,
-0.00535031,
0.008968075,
0.014351576,
-0.014086051,
0.015055214,
-0.022211088,
-0.025198232,
0.0065186154,
-0.036350243,
0.009180495,
-0.009698266,
0.009446018,
-0.008463579,
-0.0040426035,
-0.03443847,
-0.00091273896,
-0.0019217303,
0.002349888,
-0.021560553,
0.016515596,
-0.015572986,
0.0038666942,
-8.432463e-05,
0.0032178196,
-0.020365695
]
}
],
"usage": {
"prompt_tokens": 4,
"total_tokens": 4
}
}
}
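A request like the one above can be assembled with the standard library alone. The resource name ("aoairesource") and deployment name ("text-embedding-3-small") are placeholder assumptions; the vectors come back under body["data"][i]["embedding"]:

```python
import json
import urllib.request

ENDPOINT = "https://aoairesource.openai.azure.com"  # placeholder
DEPLOYMENT = "text-embedding-3-small"               # placeholder
url = (f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}"
       f"/embeddings?api-version=2025-03-01-preview")

# Pass a list of strings to embed several inputs in one request.
body = {"input": ["this is a test"]}
req = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={"api-key": "YOUR_API_KEY", "Content-Type": "application/json"},
    method="POST",
)
# Sending requires a live resource and a valid key:
# with urllib.request.urlopen(req) as resp:
#     vectors = [d["embedding"] for d in json.load(resp)["data"]]
print(url)
```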
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
Creates a completion for the provided chat messages.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
deployment-id | path | Yes | string | |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
audio | object | Parameters for audio output. Required when audio output is requested with modalities: ["audio"]. | No | |
└─ format | enum | Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16. Possible values: wav, mp3, flac, opus, pcm16 | No | |
└─ voice | enum | Specifies the voice type. Supported voices are alloy, echo, fable, onyx, nova, and shimmer. Possible values: alloy, echo, fable, onyx, nova, shimmer | No | |
data_sources | array | The configuration entries for Azure OpenAI chat extensions that use them. This additional specification is only compatible with Azure OpenAI. | No | |
frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | No | 0 |
function_call | string or chatCompletionFunctionCallOption | Deprecated in favor of tool_choice. Controls which (if any) function is called by the model. none means the model will not call a function and instead generates a message. auto means the model can pick between generating a message or calling a function. Specifying a particular function via {"name": "my_function"} forces the model to call that function. none is the default when no functions are present. auto is the default if functions are present. | No | |
functions | array | Deprecated in favor of tools. A list of functions the model may generate JSON inputs for. | No | |
logit_bias | object | Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect varies per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. | No | None |
logprobs | boolean | Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. | No | False |
max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will be expanded to other models in a future API release. | No | |
max_tokens | integer | The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. | No | |
messages | array | A list of messages comprising the conversation so far. | Yes | |
metadata | object | Developer-defined tags and values used for filtering completions in the stored completions dashboard. | No | |
modalities | ChatCompletionModalities | Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default: ["text"]. The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use: ["text", "audio"]. | No | |
n | integer | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. | No | 1 |
parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
prediction | PredictionContent | Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. | No | |
presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | No | 0 |
reasoning_effort | enum | o1 models only. Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Possible values: low, medium, high | No | |
response_format | ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema | An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo, and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which guarantees the model will match your supplied JSON schema. Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
seed | integer | This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. | No | |
stop | string or array | Up to 4 sequences where the API will stop generating further tokens. | No | |
store | boolean | Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. | No | |
stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. | No | False |
stream_options | chatCompletionStreamOptions | Options for streaming response. Only set this when you set stream: true. | No | None |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
tool_choice | chatCompletionToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present. | No | |
tools | array | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported. | No | |
top_logprobs | integer | An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used. | No | |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
user | string | A unique identifier representing your end user, which can help to monitor and detect abuse. | No | |
user_security_context | userSecurityContext | User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud. | No | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | object |
Status Code: default
Description: Service unavailable
Content-Type | Type | Description |
---|---|---|
application/json | errorResponse |
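As the response_format parameter's description notes, JSON mode requires that the prompt itself also ask for JSON. A request-body sketch showing that pairing; the message contents are illustrative:

```python
# Request body for JSON mode: response_format enables it, and the system
# message instructs the model to produce JSON (required, per the parameter
# description, to avoid a "stuck" whitespace-only generation).
body = {
    "messages": [
        {"role": "system",
         "content": "You are a helpful assistant. Reply only with valid JSON."},
        {"role": "user", "content": "List three parrot species as JSON."},
    ],
    "response_format": {"type": "json_object"},  # JSON mode
    "max_tokens": 200,
}
print(body["response_format"]["type"])  # → json_object
```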
Creates a chat completion for the provided messages, parameters, and chosen model.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "system",
"content": "you are a helpful assistant that talks like a pirate"
},
{
"role": "user",
"content": "can you tell me how to care for a parrot?"
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Ahoy matey! So ye be wantin' to care for a fine squawkin' parrot, eh? Well, shiver me timbers, let ol' Cap'n Assistant share some wisdom with ye! Here be the steps to keepin' yer parrot happy 'n healthy:\n\n1. Secure a sturdy cage: Yer parrot be needin' a comfortable place to lay anchor! Be sure ye get a sturdy cage, at least double the size of the bird's wingspan, with enough space to spread their wings, yarrrr!\n\n2. Perches 'n toys: Aye, parrots need perches of different sizes, shapes, 'n textures to keep their feet healthy. Also, a few toys be helpin' to keep them entertained 'n their minds stimulated, arrrh!\n\n3. Proper grub: Feed yer feathered friend a balanced diet of high-quality pellets, fruits, 'n veggies to keep 'em strong 'n healthy. Give 'em fresh water every day, or ye\u00e2\u20ac\u2122ll have a scurvy bird on yer hands!\n\n4. Cleanliness: Swab their cage deck! Clean their cage on a regular basis: fresh water 'n food daily, the floor every couple of days, 'n a thorough scrubbing ev'ry few weeks, so the bird be livin' in a tidy haven, arrhh!\n\n5. Socialize 'n train: Parrots be a sociable lot, arrr! Exercise 'n interact with 'em daily to create a bond 'n maintain their mental 'n physical health. Train 'em with positive reinforcement, treat 'em kindly, yarrr!\n\n6. Proper rest: Yer parrot be needin' \u00e2\u20ac\u2122bout 10-12 hours o' sleep each night. Cover their cage 'n let them slumber in a dim, quiet quarter for a proper night's rest, ye scallywag!\n\n7. Keep a weather eye open for illness: Birds be hidin' their ailments, arrr! Be watchful for signs of sickness, such as lethargy, loss of appetite, puffin' up, or change in droppings, and make haste to a vet if need be.\n\n8. Provide fresh air 'n avoid toxins: Parrots be sensitive to draft and pollutants. Keep yer quarters well ventilated, but no drafts, arrr! Be mindful of toxins like Teflon fumes, candles, or air fresheners.\n\nSo there ye have it, me hearty! 
With proper care 'n commitment, yer parrot will be squawkin' \"Yo-ho-ho\" for many years to come! Good luck, sailor, and may the wind be at yer back!"
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
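A minimal way to send the request above and pull the assistant's reply out of the response. The resource ("aoairesource") and deployment ("gpt-4o-mini") names are placeholder assumptions:

```python
import json
import urllib.request

ENDPOINT = "https://aoairesource.openai.azure.com"  # placeholder
DEPLOYMENT = "gpt-4o-mini"                          # placeholder
url = (f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}"
       f"/chat/completions?api-version=2025-03-01-preview")

messages = [
    {"role": "system",
     "content": "you are a helpful assistant that talks like a pirate"},
    {"role": "user", "content": "can you tell me how to care for a parrot?"},
]
req = urllib.request.Request(
    url,
    data=json.dumps({"messages": messages}).encode("utf-8"),
    headers={"api-key": "YOUR_API_KEY", "Content-Type": "application/json"},
    method="POST",
)
# with urllib.request.urlopen(req) as resp:  # needs real credentials
#     reply = json.load(resp)["choices"][0]["message"]["content"]

# Offline: the reply sits at choices[0].message.content, as in the sample above.
sample = {"choices": [{"index": 0, "finish_reason": "stop",
                       "message": {"role": "assistant",
                                   "content": "Ahoy matey!"}}]}
print(sample["choices"][0]["message"]["content"])  # → Ahoy matey!
```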
Creates a completion based on Azure Search data and system-assigned managed identity.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "user",
"content": "can you tell me how to care for a dog?"
}
],
"data_sources": [
{
"type": "azure_search",
"parameters": {
"endpoint": "https://your-search-endpoint.search.windows.net/",
"index_name": "{index name}",
"authentication": {
"type": "system_assigned_managed_identity"
}
}
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Content of the completion [doc1].",
"context": {
"citations": [
{
"content": "Citation content.",
"title": "Citation Title",
"filepath": "contoso.txt",
"url": "https://contoso.blob.windows.net/container/contoso.txt",
"chunk_id": "0"
}
],
"intent": "dog care"
}
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
Creates a completion based on Azure Search image vector data.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "user",
"content": "can you tell me how to care for a dog?"
}
],
"data_sources": [
{
"type": "azure_search",
"parameters": {
"endpoint": "https://your-search-endpoint.search.windows.net/",
"index_name": "{index name}",
"query_type": "vector",
"fields_mapping": {
"image_vector_fields": [
"image_vector"
]
},
"authentication": {
"type": "api_key",
"key": "{api key}"
}
}
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Content of the completion."
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
Creates a completion based on Azure Search vector data, previous assistant message and user-assigned managed identity.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "user",
"content": "can you tell me how to care for a cat?"
},
{
"role": "assistant",
"content": "Content of the completion [doc1].",
"context": {
"intent": "cat care"
}
},
{
"role": "user",
"content": "how about dog?"
}
],
"data_sources": [
{
"type": "azure_search",
"parameters": {
"endpoint": "https://your-search-endpoint.search.windows.net/",
"authentication": {
"type": "user_assigned_managed_identity",
"managed_identity_resource_id": "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{resource-name}"
},
"index_name": "{index name}",
"query_type": "vector",
"embedding_dependency": {
"type": "deployment_name",
"deployment_name": "{embedding deployment name}"
},
"in_scope": true,
"top_n_documents": 5,
"strictness": 3,
"role_information": "You are an AI assistant that helps people find information.",
"fields_mapping": {
"content_fields_separator": "\\n",
"content_fields": [
"content"
],
"filepath_field": "filepath",
"title_field": "title",
"url_field": "url",
"vector_fields": [
"contentvector"
]
}
}
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Content of the completion [doc1].",
"context": {
"citations": [
{
"content": "Citation content 2.",
"title": "Citation Title 2",
"filepath": "contoso2.txt",
"url": "https://contoso.blob.windows.net/container/contoso2.txt",
"chunk_id": "0"
}
],
"intent": "dog care"
}
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
Creates a completion using the provided Azure Cosmos DB data source.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "user",
"content": "can you tell me how to care for a dog?"
}
],
"data_sources": [
{
"type": "azure_cosmos_db",
"parameters": {
"authentication": {
"type": "connection_string",
"connection_string": "mongodb+srv://rawantest:{password}$@{cluster-name}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
},
"database_name": "vectordb",
"container_name": "azuredocs",
"index_name": "azuredocindex",
"embedding_dependency": {
"type": "deployment_name",
"deployment_name": "{embedding deployment name}"
},
"fields_mapping": {
"content_fields": [
"content"
],
"vector_fields": [
"contentvector"
]
}
}
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Content of the completion [doc1].",
"context": {
"citations": [
{
"content": "Citation content.",
"title": "Citation Title",
"filepath": "contoso.txt",
"url": "https://contoso.blob.windows.net/container/contoso.txt",
"chunk_id": "0"
}
],
"intent": "dog care"
}
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
Creates a completion using the provided MongoDB data source.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "user",
"content": "can you tell me how to care for a dog?"
}
],
"data_sources": [
{
"type": "mongo_db",
"parameters": {
"authentication": {
"type": "username_and_password",
"username": "<username>",
"password": "<password>"
},
"endpoint": "<endpoint_name>",
"app_name": "<application name>",
"database_name": "sampledb",
"collection_name": "samplecollection",
"index_name": "sampleindex",
"embedding_dependency": {
"type": "deployment_name",
"deployment_name": "{embedding deployment name}"
},
"fields_mapping": {
"content_fields": [
"content"
],
"vector_fields": [
"contentvector"
]
}
}
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Content of the completion [doc1].",
"context": {
"citations": [
{
"content": "Citation content.",
"title": "Citation Title",
"filepath": "contoso.txt",
"url": "https://contoso.blob.windows.net/container/contoso.txt",
"chunk_id": "0"
}
],
"intent": "dog care"
}
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
Creates a chat completion grounded in the provided Elasticsearch index.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "user",
"content": "can you tell me how to care for a dog?"
}
],
"data_sources": [
{
"type": "elasticsearch",
"parameters": {
"endpoint": "https://your-elasticsearch-endpoint.eastus.azurecontainer.io",
"index_name": "{index name}",
"authentication": {
"type": "key_and_key_id",
"key": "{key}",
"key_id": "{key id}"
}
}
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Content of the completion [doc1].",
"context": {
"citations": [
{
"content": "Citation content.",
"title": "Citation Title",
"filepath": "contoso.txt",
"url": "https://contoso.blob.windows.net/container/contoso.txt",
"chunk_id": "0"
}
],
"intent": "dog care"
}
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
Creates a chat completion grounded in the provided Pinecone resource.
POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview
{
"messages": [
{
"role": "user",
"content": "can you tell me how to care for a dog?"
}
],
"data_sources": [
{
"type": "pinecone",
"parameters": {
"authentication": {
"type": "api_key",
"key": "{api key}"
},
"environment": "{environment name}",
"index_name": "{index name}",
"embedding_dependency": {
"type": "deployment_name",
"deployment_name": "{embedding deployment name}"
},
"fields_mapping": {
"title_field": "title",
"url_field": "url",
"filepath_field": "filepath",
"content_fields": [
"content"
],
"content_fields_separator": "\n"
}
}
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
"created": 1686676106,
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Content of the completion [doc1].",
"context": {
"citations": [
{
"content": "Citation content.",
"title": "Citation Title",
"filepath": "contoso.txt",
"url": "https://contoso.blob.windows.net/container/contoso.txt",
"chunk_id": "0"
}
],
"intent": "dog care"
}
}
}
],
"usage": {
"completion_tokens": 557,
"prompt_tokens": 33,
"total_tokens": 590
}
}
}
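In all of these "on your data" responses, the assistant message embeds markers such as `[doc1]` that index (1-based) into the `citations` array of `message.context`, as the examples above show. A small sketch of resolving those markers back to citation objects (helper name is hypothetical):

```python
import re

def resolve_citations(message):
    """Map [docN] markers in the assistant's content to entries in
    message.context.citations (docN is a 1-based index into the list)."""
    citations = message.get("context", {}).get("citations", [])
    resolved = []
    for marker in re.findall(r"\[doc(\d+)\]", message["content"]):
        idx = int(marker) - 1
        if 0 <= idx < len(citations):
            resolved.append(citations[idx])
    return resolved

# Trimmed-down message shaped like the example responses above.
message = {
    "role": "assistant",
    "content": "Content of the completion [doc1].",
    "context": {"citations": [{
        "title": "Citation Title",
        "url": "https://contoso.blob.windows.net/container/contoso.txt",
    }]},
}
refs = resolve_citations(message)
```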
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-03-01-preview
Transcribes audio into the input language.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
deployment-id | path | Yes | string | |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: multipart/form-data
Name | Type | Description | Required | Default |
---|---|---|---|---|
file | string | The audio file object to transcribe. | Yes | |
language | string | The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. | No | |
prompt | string | An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. | No | |
response_format | audioResponseFormat | Defines the format of the output. | No | |
temperature | number | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. | No | 0 |
timestamp_granularities[] | array | The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word or segment. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency. | No | ['segment'] |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | object | |
text/plain | string | Transcribed text in the output format (when response_format was one of text, vtt or srt). |
Gets transcribed text and associated metadata from provided spoken audio data.
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"text": "A structured object when requesting json or verbose_json"
}
}
Gets transcribed text and associated metadata from provided spoken audio data.
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-03-01-preview
"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"
Responses: Status Code: 200
{
"type": "string",
"example": "plain text when requesting text, srt, or vtt"
}
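The request body for this endpoint is multipart/form-data, as the raw example above illustrates. A sketch of assembling such a body with only the standard library (the helper name is hypothetical; real audio bytes replace the `RIFF..` placeholder):

```python
def build_multipart_body(boundary, filename, audio_bytes, fields=None):
    """Assemble a multipart/form-data body for an audio request.

    `fields` carries the optional form parts (language, prompt,
    response_format, ...); the audio bytes go in the required "file" part.
    """
    parts = []
    for name, value in (fields or {}).items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode())
    parts.append(
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         f'Content-Type: application/octet-stream\r\n\r\n').encode()
        + audio_bytes + b"\r\n")
    parts.append(f'--{boundary}--\r\n'.encode())
    return b"".join(parts)

body = build_multipart_body("multipart-boundary", "file.wav", b"RIFF..audio..",
                            {"response_format": "verbose_json", "language": "en"})
```

The request's Content-Type header must then declare the same boundary, e.g. `multipart/form-data; boundary=multipart-boundary`.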
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-03-01-preview
Transcribes and translates input audio into English text.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
deployment-id | path | Yes | string | |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: multipart/form-data
Name | Type | Description | Required | Default |
---|---|---|---|---|
file | string | The audio file to translate. | Yes | |
prompt | string | An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English. | No | |
response_format | audioResponseFormat | Defines the format of the output. | No | |
temperature | number | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. | No | 0 |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | object | |
text/plain | string | Transcribed text in the output format (when response_format was one of text, vtt or srt). |
Gets English language transcribed text and associated metadata from provided spoken audio data.
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-03-01-preview
"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"
Responses: Status Code: 200
{
"body": {
"text": "A structured object when requesting json or verbose_json"
}
}
Gets English language transcribed text and associated metadata from provided spoken audio data.
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-03-01-preview
"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"
Responses: Status Code: 200
{
"type": "string",
"example": "plain text when requesting text, srt, or vtt"
}
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-03-01-preview
Generates audio from the input text.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
deployment-id | path | Yes | string | |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: multipart/form-data
Name | Type | Description | Required | Default |
---|---|---|---|---|
input | string | The text to synthesize audio for. The maximum length is 4096 characters. | Yes | |
response_format | enum | The format to synthesize the audio in. Possible values: mp3, opus, aac, flac, wav, pcm | No | |
speed | number | The speed of the synthesized audio. Select a value from 0.25 to 4.0. 1.0 is the default. | No | 1.0 |
voice | enum | The voice to use for speech synthesis. Possible values: alloy, echo, fable, onyx, nova, shimmer | Yes | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/octet-stream | string |
Synthesizes audio from the provided text.
POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-03-01-preview
{
"input": "Hi! What are you going to make?",
"voice": "fable",
"response_format": "mp3"
}
Responses: Status Code: 200
{
"body": "101010101"
}
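The constraints in the request-body table above (required input and voice, the voice and format enums, the 0.25–4.0 speed range) can be checked client-side before calling the endpoint. A sketch of such a validator (the function name is hypothetical; the allowed values are taken from the table):

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}

def validate_speech_request(body):
    """Return a list of constraint violations for an /audio/speech body;
    an empty list means the body satisfies the documented constraints."""
    errors = []
    if not body.get("input"):
        errors.append("input is required")
    elif len(body["input"]) > 4096:
        errors.append("input exceeds 4096 characters")
    if body.get("voice") not in VOICES:
        errors.append("voice must be one of: " + ", ".join(sorted(VOICES)))
    if "response_format" in body and body["response_format"] not in FORMATS:
        errors.append("unsupported response_format")
    if not 0.25 <= body.get("speed", 1.0) <= 4.0:
        errors.append("speed must be between 0.25 and 4.0")
    return errors

# The example request above passes all checks.
assert validate_speech_request(
    {"input": "Hi! What are you going to make?",
     "voice": "fable", "response_format": "mp3"}) == []
```

A successful response is the raw audio bytes (application/octet-stream), not JSON.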
POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-03-01-preview
Generates a batch of images from a text caption on a given DALL-E model deployment.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
deployment-id | path | Yes | string | |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
n | integer | The number of images to generate. | No | 1 |
prompt | string | A text description of the desired image(s). The maximum length is 4000 characters. | Yes | |
quality | imageQuality | The quality of the image that will be generated. | No | standard |
response_format | imagesResponseFormat | The format in which the generated images are returned. | No | url |
size | imageSize | The size of the generated images. | No | 1024x1024 |
style | imageStyle | The style of the generated images. | No | vivid |
user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse. | No |
Status Code: 200
Description: Ok
Content-Type | Type | Description |
---|---|---|
application/json | generateImagesResponse |
Status Code: default
Description: An error occurred.
Content-Type | Type | Description |
---|---|---|
application/json | dalleErrorResponse |
Creates images given a prompt.
POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-03-01-preview
{
"prompt": "In the style of WordArt, Microsoft Clippy wearing a cowboy hat.",
"n": 1,
"style": "natural",
"quality": "standard"
}
Responses: Status Code: 200
{
"body": {
"created": 1698342300,
"data": [
{
"revised_prompt": "A vivid, natural representation of Microsoft Clippy wearing a cowboy hat.",
"prompt_filter_results": {
"sexual": {
"severity": "safe",
"filtered": false
},
"violence": {
"severity": "safe",
"filtered": false
},
"hate": {
"severity": "safe",
"filtered": false
},
"self_harm": {
"severity": "safe",
"filtered": false
},
"profanity": {
"detected": false,
"filtered": false
},
"custom_blocklists": {
"filtered": false,
"details": []
}
},
"url": "https://dalletipusw2.blob.core.windows.net/private/images/e5451cc6-b1ad-4747-bd46-b89a3a3b8bc3/generated_00.png?se=2023-10-27T17%3A45%3A09Z&...",
"content_filter_results": {
"sexual": {
"severity": "safe",
"filtered": false
},
"violence": {
"severity": "safe",
"filtered": false
},
"hate": {
"severity": "safe",
"filtered": false
},
"self_harm": {
"severity": "safe",
"filtered": false
}
}
}
]
}
}
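Each generated image in the response carries `prompt_filter_results` and `content_filter_results` objects like those above, one verdict per harm category. A sketch of checking them before using the image URL (helper name is hypothetical):

```python
def blocked_categories(filter_results):
    """Return the harm categories whose verdict has filtered=true in a
    prompt_filter_results / content_filter_results object."""
    return sorted(name for name, verdict in filter_results.items()
                  if verdict.get("filtered"))

# Trimmed-down entry shaped like response.data[0] in the example above.
image = {
    "url": "https://dalletipusw2.blob.core.windows.net/private/images/generated_00.png",
    "content_filter_results": {
        "sexual": {"severity": "safe", "filtered": False},
        "violence": {"severity": "safe", "filtered": False},
        "hate": {"severity": "safe", "filtered": False},
        "self_harm": {"severity": "safe", "filtered": False},
    },
}
safe_to_use = not blocked_categories(image["content_filter_results"])
```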
GET https://{endpoint}/openai/assistants?api-version=2025-03-01-preview
Returns a list of assistants.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string (possible values: asc, desc) | Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order. |
after | query | No | string | A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list. |
before | query | No | string | A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | listAssistantsResponse |
Returns a list of assistants.
GET https://{endpoint}/openai/assistants?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"object": "list",
"data": [
{
"id": "asst_abc123",
"object": "assistant",
"created_at": 1707257477,
"name": "Stock Analyst",
"description": null,
"model": "gpt-4-1106-preview",
"instructions": "You are a financial analyst that analyzes stock market prices and other financial data present on user uploaded files or by calling external APIs.",
"tools": [
{
"type": "code_interpreter"
}
],
"tool_resources": {},
"metadata": {},
"top_p": 1.0,
"temperature": 1.0,
"response_format": "auto"
},
{
"id": "asst_abc456",
"object": "assistant",
"created_at": 1698982718,
"name": "My Assistant",
"description": null,
"model": "gpt-4-turbo",
"instructions": "You are a helpful assistant designed to make me better at coding!",
"tools": [],
"tool_resources": {},
"metadata": {},
"top_p": 1.0,
"temperature": 1.0,
"response_format": "auto"
},
{
"id": "asst_abc789",
"object": "assistant",
"created_at": 1698982643,
"name": null,
"description": null,
"model": "gpt-4-turbo",
"instructions": null,
"tools": [],
"tool_resources": {},
"metadata": {},
"top_p": 1.0,
"temperature": 1.0,
"response_format": "auto"
}
],
"first_id": "asst_abc123",
"last_id": "asst_abc789",
"has_more": false
}
}
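The after/before cursors described above drive pagination: pass the page's last_id as after on the next call until has_more is false. A sketch of that loop (the fetch function here is a stand-in for the HTTP GET against /openai/assistants, so the loop can be shown without network access):

```python
def list_all(fetch_page, limit=20):
    """Collect every item from a cursor-paginated list endpoint by
    following the `after` cursor until has_more is false."""
    items, after = [], None
    while True:
        page = fetch_page(limit=limit, after=after)
        items.extend(page["data"])
        if not page["has_more"]:
            return items
        after = page["last_id"]

# Simulated two-page listing (IDs from the example response above)
# to exercise the loop offline.
pages = {
    None: {"data": [{"id": "asst_abc123"}, {"id": "asst_abc456"}],
           "last_id": "asst_abc456", "has_more": True},
    "asst_abc456": {"data": [{"id": "asst_abc789"}],
                    "last_id": "asst_abc789", "has_more": False},
}
assistants = list_all(lambda limit, after: pages[after], limit=2)
```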
POST https://{endpoint}/openai/assistants?api-version=2025-03-01-preview
Create an assistant with a model and instructions.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | The description of the assistant. The maximum length is 512 characters. | No | |
instructions | string | The system instructions that the assistant uses. The maximum length is 256,000 characters. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | | Yes | |
name | string | The name of the assistant. The maximum length is 256 characters. | No | |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
└─ vector_stores | array | A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
tools | array | A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function. | No | [] |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | assistantObject |
Create an assistant with a model and instructions.
POST https://{endpoint}/openai/assistants?api-version=2025-03-01-preview
{
"name": "Math Tutor",
"instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
"tools": [
{
"type": "code_interpreter"
}
],
"model": "gpt-4-1106-preview"
}
Responses: Status Code: 200
{
"body": {
"id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
"object": "assistant",
"created_at": 1707295707,
"name": "Math Tutor",
"description": null,
"model": "gpt-4-1106-preview",
"instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
"tools": [
{
"type": "code_interpreter"
}
],
"metadata": {},
"top_p": 1.0,
"temperature": 1.0,
"response_format": "auto"
}
}
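As the response_format description above warns, JSON mode requires that the instructions themselves ask for JSON output. A sketch of a create-assistant payload that honors this (the assistant name, key names, and deployment placeholder are all hypothetical):

```python
payload = {
    "name": "Report Bot",  # hypothetical assistant
    "model": "<deployment name>",
    # JSON mode: the instructions must themselves request JSON output,
    # otherwise the model may emit whitespace until the token limit.
    "instructions": ("Answer every question with a JSON object containing "
                     "the keys 'answer' and 'confidence'."),
    "response_format": {"type": "json_object"},
}
```

With `{ "type": "json_schema", "json_schema": {...} }` instead, Structured Outputs constrains the reply to a supplied schema and this instruction requirement does not apply.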
GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview
Retrieves an assistant.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
assistant_id | path | Yes | string | The ID of the assistant to retrieve. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | assistantObject |
Retrieves an assistant.
GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "asst_abc123",
"object": "assistant",
"created_at": 1699009709,
"name": "HR Helper",
"description": null,
"model": "gpt-4-turbo",
"instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.",
"tools": [
{
"type": "file_search"
}
],
"metadata": {},
"top_p": 1.0,
"temperature": 1.0,
"response_format": "auto"
}
}
POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview
Modifies an assistant.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
assistant_id | path | Yes | string | The ID of the assistant to modify. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | The description of the assistant. The maximum length is 512 characters. | No | |
instructions | string | The system instructions that the assistant uses. The maximum length is 32,768 characters. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | | No | |
name | string | The name of the assistant. The maximum length is 256 characters. | No | |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | Overrides the list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | Overrides the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
tools | array | A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function. | No | [] |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | assistantObject |
Modifies an assistant.
POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview
{
"instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always response with info from either of the files.",
"tools": [
{
"type": "file_search"
}
],
"model": "gpt-4-turbo"
}
Responses: Status Code: 200
{
"body": {
"id": "asst_123",
"object": "assistant",
"created_at": 1699009709,
"name": "HR Helper",
"description": null,
"model": "gpt-4-turbo",
"instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always response with info from either of the files.",
"tools": [
{
"type": "file_search"
}
],
"tool_resources": {
"file_search": {
"vector_store_ids": []
}
},
"metadata": {},
"top_p": 1.0,
"temperature": 1.0,
"response_format": "auto"
}
}
DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview
Delete an assistant.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com; replace "aoairesource" with your Azure OpenAI resource name): https://{your-resource-name}.openai.azure.com |
assistant_id | path | Yes | string | The ID of the assistant to delete. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended as the more secure option.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | deleteAssistantResponse |
Deletes an assistant.
DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
"object": "assistant.deleted",
"deleted": true
}
}
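As a rough illustration of this endpoint, the sketch below assembles the DELETE request with Python's standard library. The helper name and placeholder key are hypothetical, and the request is built but not sent:

```python
import urllib.request

# Hypothetical values; substitute your resource name, key, and assistant ID.
ENDPOINT = "https://aoairesource.openai.azure.com"
API_VERSION = "2025-03-01-preview"

def delete_assistant_request(endpoint, assistant_id, api_version, api_key):
    """Build (but do not send) the DELETE request for an assistant."""
    url = f"{endpoint}/openai/assistants/{assistant_id}?api-version={api_version}"
    return urllib.request.Request(url, headers={"api-key": api_key}, method="DELETE")

req = delete_assistant_request(
    ENDPOINT, "asst_4nsG2qgNzimRPE7MazXTXbU7", API_VERSION, "YOUR_KEY")
# urllib.request.urlopen(req) would return the deleteAssistantResponse body.
```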
POST https://{endpoint}/openai/threads?api-version=2025-03-01-preview
Create a thread.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
messages | array | A list of messages to start the thread with. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
tool_resources | object | A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread. | No | |
└─ vector_stores | array | A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of 1 vector store attached to the thread. | No | |
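The metadata limits documented above (at most 16 pairs, 64-character keys, 512-character values) can be checked client-side before sending a request. This is a hypothetical validator, not part of the API:

```python
# Hypothetical client-side check for the documented metadata limits:
# at most 16 key-value pairs, keys up to 64 chars, values up to 512 chars.
def validate_metadata(metadata):
    if len(metadata) > 16:
        raise ValueError("metadata allows at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key exceeds 64 characters: {key!r}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value exceeds 512 characters for {key!r}")
    return True

ok = validate_metadata({"modified": "true", "user": "abc123"})
```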
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | threadObject |
Creates a thread.
POST https://{endpoint}/openai/threads?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "thread_v7V4csrNOxtNmgcwGg496Smx",
"object": "thread",
"created_at": 1707297136,
"metadata": {}
}
}
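A minimal sketch of assembling the create-thread request, assuming `api-key` authentication; the helper name `build_create_thread` is hypothetical:

```python
import json

# Hypothetical values; replace with your resource name and key.
ENDPOINT = "https://aoairesource.openai.azure.com"
API_VERSION = "2025-03-01-preview"

def build_create_thread(endpoint, api_version, api_key, messages=None):
    """Assemble url, headers, and JSON body for POST /openai/threads."""
    url = f"{endpoint}/openai/threads?api-version={api_version}"
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({"messages": messages} if messages else {})
    return url, headers, body

url, headers, body = build_create_thread(
    ENDPOINT, API_VERSION, "YOUR_KEY",
    messages=[{"role": "user", "content": "Hello, what is AI?"}])
```

An empty body (`{}`) is also valid, since `messages` is optional for this endpoint.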
GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview
Retrieves a thread.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to retrieve. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | threadObject |
Retrieves a thread.
GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "thread_v7V4csrNOxtNmgcwGg496Smx",
"object": "thread",
"created_at": 1707297136,
"metadata": {},
"tool_resources": {
"code_interpreter": {
"file_ids": []
}
}
}
}
POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview
Modifies a thread.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to modify. Only the metadata can be modified. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
tool_resources | object | A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of File IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread. | No | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | threadObject |
Modifies a thread.
POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview
{
"metadata": {
"modified": "true",
"user": "abc123"
}
}
Responses: Status Code: 200
{
"body": {
"id": "thread_v7V4csrNOxtNmgcwGg496Smx",
"object": "thread",
"created_at": 1707297136,
"metadata": {
"modified": "true",
"user": "abc123"
},
"tool_resources": {}
}
}
DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview
Delete a thread.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to delete. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | deleteThreadResponse |
Deletes a thread.
DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "thread_v7V4csrNOxtNmgcwGg496Smx",
"object": "thread.deleted",
"deleted": true
}
}
GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview
Returns a list of messages for a given thread.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread the messages belong to. |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string Possible values: asc , desc | Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order. |
after | query | No | string | A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list. |
before | query | No | string | A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. |
run_id | query | No | string | Filter messages by the run ID that generated them. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | listMessagesResponse |
List Messages
GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"object": "list",
"data": [
{
"id": "msg_abc123",
"object": "thread.message",
"created_at": 1699016383,
"assistant_id": null,
"thread_id": "thread_abc123",
"run_id": null,
"role": "user",
"content": [
{
"type": "text",
"text": {
"value": "How does AI work? Explain it in simple terms.",
"annotations": []
}
}
],
"attachments": [],
"metadata": {}
},
{
"id": "msg_abc456",
"object": "thread.message",
"created_at": 1699016383,
"assistant_id": null,
"thread_id": "thread_abc123",
"run_id": null,
"role": "user",
"content": [
{
"type": "text",
"text": {
"value": "Hello, what is AI?",
"annotations": []
}
}
],
"attachments": [],
"metadata": {}
}
],
"first_id": "msg_abc123",
"last_id": "msg_abc456",
"has_more": false
}
}
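The `after` cursor pagination described above can be sketched as a loop that follows `last_id` until `has_more` is false. `fetch_page` is a hypothetical callable standing in for the actual HTTP call, stubbed here with the example response:

```python
# Sketch of cursor pagination over GET /threads/{thread_id}/messages.
def list_all_messages(endpoint, thread_id, api_version, fetch_page, limit=20):
    base = (f"{endpoint}/openai/threads/{thread_id}/messages"
            f"?api-version={api_version}&limit={limit}")
    messages, after = [], None
    while True:
        url = base + (f"&after={after}" if after else "")
        page = fetch_page(url)          # returns the parsed JSON body
        messages.extend(page["data"])
        if not page["has_more"]:
            return messages
        after = page["last_id"]         # cursor for the next page

# Stub standing in for the HTTP call, returning a single page like the
# example response above:
def fake_fetch(url):
    return {"data": [{"id": "msg_abc123"}, {"id": "msg_abc456"}],
            "last_id": "msg_abc456", "has_more": False}

msgs = list_all_messages("https://aoairesource.openai.azure.com",
                         "thread_abc123", "2025-03-01-preview", fake_fetch)
```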
POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview
Create a message.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to create a message for. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
attachments | array | A list of files attached to the message, and the tools they should be added to. | No | |
content | string | The content of the message. | Yes | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
role | string | The role of the entity that is creating the message. Allowed values include: - user : Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. - assistant : Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation. | Yes | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | messageObject |
Create a message.
POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview
{
"role": "user",
"content": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces."
}
Responses: Status Code: 200
{
"body": {
"id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
"object": "thread.message",
"created_at": 1707298421,
"assistant_id": null,
"thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
"run_id": null,
"role": "user",
"content": [
{
"type": "text",
"text": {
"value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
"annotations": []
}
}
],
"attachments": [],
"metadata": {}
}
}
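Assembling the create-message request can be sketched like this; the helper name and placeholder values are hypothetical, and the request is built rather than sent:

```python
import json

# Hypothetical values; replace with your own.
ENDPOINT = "https://aoairesource.openai.azure.com"
API_VERSION = "2025-03-01-preview"

def build_create_message(endpoint, thread_id, api_version, api_key, role, content):
    """Assemble the POST /threads/{thread_id}/messages request pieces."""
    url = f"{endpoint}/openai/threads/{thread_id}/messages?api-version={api_version}"
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({"role": role, "content": content})
    return url, headers, body

msg_url, msg_headers, msg_body = build_create_message(
    ENDPOINT, "thread_v7V4csrNOxtNmgcwGg496Smx", API_VERSION,
    "YOUR_KEY", "user", "What is the cube root of 27?")
```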
GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview
Retrieve a message.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to which this message belongs. |
message_id | path | Yes | string | The ID of the message to retrieve. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | messageObject |
Retrieve a message.
GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
"object": "thread.message",
"created_at": 1707298421,
"thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
"role": "user",
"content": [
{
"type": "text",
"text": {
"value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
"annotations": []
}
}
],
"file_ids": [],
"assistant_id": null,
"run_id": null,
"metadata": {}
}
}
POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview
Modifies a message.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to which this message belongs. |
message_id | path | Yes | string | The ID of the message to modify. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | messageObject |
Modify a message.
POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview
{
"metadata": {
"modified": "true",
"user": "abc123"
}
}
Responses: Status Code: 200
{
"body": {
"id": "msg_abc123",
"object": "thread.message",
"created_at": 1699017614,
"assistant_id": null,
"thread_id": "thread_abc123",
"run_id": null,
"role": "user",
"content": [
{
"type": "text",
"text": {
"value": "How does AI work? Explain it in simple terms.",
"annotations": []
}
}
],
"file_ids": [],
"metadata": {
"modified": "true",
"user": "abc123"
}
}
}
POST https://{endpoint}/openai/threads/runs?api-version=2025-03-01-preview
Create a thread and run it in one request.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
assistant_id | string | The ID of the assistant to use to execute this run. | Yes | |
instructions | string | Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. | No | |
max_completion_tokens | integer | The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete . See incomplete_details for more info. | No | |
max_prompt_tokens | integer | The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete . See incomplete_details for more info. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106 . Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length" , which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
stream | boolean | If true , returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. | No | |
stream_options | chatCompletionStreamOptions | Options for streaming response. Only set this when you set stream: true . | No | None |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
thread | createThreadRequest | | No | |
tool_choice | assistantsApiToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. | No | |
tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
tools | array | Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. | No | |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
truncation_strategy | truncationObject | Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. | No | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | runObject |
Create a thread and run it in one request.
POST https://{endpoint}/openai/threads/runs?api-version=2025-03-01-preview
{
"assistant_id": "asst_abc123",
"thread": {
"messages": [
{
"role": "user",
"content": "Explain deep learning to a 5 year old."
}
]
}
}
Responses: Status Code: 200
{
"body": {
"id": "run_abc123",
"object": "thread.run",
"created_at": 1699076792,
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"status": "queued",
"started_at": null,
"expires_at": 1699077392,
"cancelled_at": null,
"failed_at": null,
"completed_at": null,
"required_action": null,
"last_error": null,
"model": "gpt-4-turbo",
"instructions": "You are a helpful assistant.",
"tools": [],
"tool_resources": {},
"metadata": {},
"temperature": 1.0,
"top_p": 1.0,
"max_completion_tokens": null,
"max_prompt_tokens": null,
"truncation_strategy": {
"type": "auto",
"last_messages": null
},
"incomplete_details": null,
"usage": null,
"response_format": "auto",
"tool_choice": "auto"
}
}
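A sketch of building the combined thread-and-run request body; the helper name and key are hypothetical, and the run returned by the service starts in the `queued` status shown above:

```python
import json

# Hypothetical values; replace with your resource name and key.
ENDPOINT = "https://aoairesource.openai.azure.com"
API_VERSION = "2025-03-01-preview"

def build_thread_and_run(endpoint, api_version, api_key, assistant_id, user_content):
    """Assemble the POST /openai/threads/runs request pieces."""
    url = f"{endpoint}/openai/threads/runs?api-version={api_version}"
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({
        "assistant_id": assistant_id,
        "thread": {"messages": [{"role": "user", "content": user_content}]},
    })
    return url, headers, body

run_url, run_headers, run_body = build_thread_and_run(
    ENDPOINT, API_VERSION, "YOUR_KEY", "asst_abc123",
    "Explain deep learning to a 5 year old.")
# After sending, poll the run (or set "stream": true in the body) until it
# reaches a terminal state such as "completed".
```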
GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview
Returns a list of runs belonging to a thread.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread the run belongs to. |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string Possible values: asc , desc | Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order. |
after | query | No | string | A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list. |
before | query | No | string | A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | listRunsResponse |
Returns a list of runs belonging to a thread.
GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"object": "list",
"data": [
{
"id": "run_abc123",
"object": "thread.run",
"created_at": 1699075072,
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"status": "completed",
"started_at": 1699075072,
"expires_at": null,
"cancelled_at": null,
"failed_at": null,
"completed_at": 1699075073,
"last_error": null,
"model": "gpt-4-turbo",
"instructions": null,
"incomplete_details": null,
"tools": [
{
"type": "code_interpreter"
}
],
"tool_resources": {
"code_interpreter": {
"file_ids": [
"file-abc123",
"file-abc456"
]
}
},
"metadata": {},
"usage": {
"prompt_tokens": 123,
"completion_tokens": 456,
"total_tokens": 579
},
"temperature": 1.0,
"top_p": 1.0,
"max_prompt_tokens": 1000,
"max_completion_tokens": 1000,
"truncation_strategy": {
"type": "auto",
"last_messages": null
},
"response_format": "auto",
"tool_choice": "auto"
},
{
"id": "run_abc456",
"object": "thread.run",
"created_at": 1699063290,
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"status": "completed",
"started_at": 1699063290,
"expires_at": null,
"cancelled_at": null,
"failed_at": null,
"completed_at": 1699063291,
"last_error": null,
"model": "gpt-4-turbo",
"instructions": null,
"incomplete_details": null,
"tools": [
{
"type": "code_interpreter"
}
],
"tool_resources": {
"code_interpreter": {
"file_ids": [
"file-abc123",
"file-abc456"
]
}
},
"metadata": {},
"usage": {
"prompt_tokens": 123,
"completion_tokens": 456,
"total_tokens": 579
},
"temperature": 1.0,
"top_p": 1.0,
"max_prompt_tokens": 1000,
"max_completion_tokens": 1000,
"truncation_strategy": {
"type": "auto",
"last_messages": null
},
"response_format": "auto",
"tool_choice": "auto"
}
],
"first_id": "run_abc123",
"last_id": "run_abc456",
"has_more": false
}
}
POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview
Create a run.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to run. |
include[] | query | No | array | A list of additional fields to include in the response. Currently the only supported value is step_details.tool_calls[*].file_search.results[*].content to fetch the file search result content. |
api-version | query | Yes | string | |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
additional_instructions | string | Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions. | No | |
additional_messages | array | Adds additional messages to the thread before creating the run. | No | |
assistant_id | string | The ID of the assistant to use to execute this run. | Yes | |
instructions | string | Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. | No | |
max_completion_tokens | integer | The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete . See incomplete_details for more info. | No | |
max_prompt_tokens | integer | The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete . See incomplete_details for more info. | No | |
metadata | object | Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106 . Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length" , which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
stream | boolean | If true , returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
tool_choice | assistantsApiToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. | No | |
tools | array | Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. | No | |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
truncation_strategy | truncationObject | Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. | No |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | runObject |
Create a run.
POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview
{
"assistant_id": "asst_abc123"
}
Responses: Status Code: 200
{
"body": {
"id": "run_abc123",
"object": "thread.run",
"created_at": 1699063290,
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"status": "queued",
"started_at": 1699063290,
"expires_at": null,
"cancelled_at": null,
"failed_at": null,
"completed_at": 1699063291,
"last_error": null,
"model": "gpt-4-turbo",
"instructions": null,
"incomplete_details": null,
"tools": [
{
"type": "code_interpreter"
}
],
"metadata": {},
"usage": null,
"temperature": 1.0,
"top_p": 1.0,
"max_prompt_tokens": 1000,
"max_completion_tokens": 1000,
"truncation_strategy": {
"type": "auto",
"last_messages": null
},
"response_format": "auto",
"tool_choice": "auto"
}
}
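Putting the create and retrieve calls above together, the sketch below (Python, stdlib only; the helper names and the polling interval are illustrative, not part of the API or any SDK) creates a run and polls it until it leaves the queued and in-progress states:

```python
import json
import time
import urllib.request

API_VERSION = "2025-03-01-preview"

def run_url(endpoint, thread_id, run_id=None):
    # Build the runs URL for a thread (or for one run in that thread).
    base = f"{endpoint}/openai/threads/{thread_id}/runs"
    if run_id:
        base += f"/{run_id}"
    return f"{base}?api-version={API_VERSION}"

def auth_headers(api_key):
    # API key authentication; for Microsoft Entra ID, send an
    # "Authorization: Bearer <token>" header instead of "api-key".
    return {"api-key": api_key, "Content-Type": "application/json"}

def call(url, headers, body=None):
    # POST when a JSON body is given, otherwise GET.
    data = json.dumps(body).encode() if body is not None else None
    method = "POST" if body is not None else "GET"
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def create_and_poll(endpoint, api_key, thread_id, assistant_id, interval=1.0):
    # Create the run, then poll until it is no longer queued or running.
    headers = auth_headers(api_key)
    run = call(run_url(endpoint, thread_id), headers, {"assistant_id": assistant_id})
    while run.get("status") in {"queued", "in_progress", "cancelling"}:
        time.sleep(interval)
        run = call(run_url(endpoint, thread_id, run["id"]), headers)
    return run
```

A run that stops with status requires_action needs its tool outputs submitted via the submit_tool_outputs endpoint before it can continue.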
GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview
Retrieves a run.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread that was run. |
run_id | path | Yes | string | The ID of the run to retrieve. |
api-version | query | Yes | string |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | runObject |
Gets a run.
GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "run_HsO8tYM4K5AAMAHgK0J3om8Q",
"object": "thread.run",
"created_at": 1707303196,
"assistant_id": "asst_JtTwHk28cIocgFXZPCBxhOzl",
"thread_id": "thread_eRNwflE3ncDYak1np6MdMHJh",
"status": "completed",
"started_at": 1707303197,
"expires_at": null,
"cancelled_at": null,
"failed_at": null,
"completed_at": 1707303201,
"last_error": null,
"model": "gpt-4-1106-preview",
"instructions": "You are an AI model that empowers every person and every organization on the planet to achieve more.",
"tools": [],
"file_ids": [],
"metadata": {}
}
}
POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview
Modifies a run.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread that was run. |
run_id | path | Yes | string | The ID of the run to modify. |
api-version | query | Yes | string |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | object | Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | runObject |
Modifies a run.
POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview
{
"metadata": {
"user_id": "user_abc123"
}
}
Responses: Status Code: 200
{
"body": {
"id": "run_abc123",
"object": "thread.run",
"created_at": 1699075072,
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"status": "completed",
"started_at": 1699075072,
"expires_at": null,
"cancelled_at": null,
"failed_at": null,
"completed_at": 1699075073,
"last_error": null,
"model": "gpt-4-turbo",
"instructions": null,
"incomplete_details": null,
"tools": [
{
"type": "code_interpreter"
}
],
"tool_resources": {
"code_interpreter": {
"file_ids": [
"file-abc123",
"file-abc456"
]
}
},
"metadata": {
"user_id": "user_abc123"
},
"usage": {
"prompt_tokens": 123,
"completion_tokens": 456,
"total_tokens": 579
},
"temperature": 1.0,
"top_p": 1.0,
"max_prompt_tokens": 1000,
"max_completion_tokens": 1000,
"truncation_strategy": {
"type": "auto",
"last_messages": null
},
"response_format": "auto",
"tool_choice": "auto"
}
}
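Because the metadata limits above are enforced server-side, a small client-side check can fail fast before the request is sent. This helper is purely illustrative and not part of any SDK:

```python
def validate_metadata(metadata):
    # Enforce the documented limits before sending: at most 16 pairs,
    # keys up to 64 characters, values up to 512 characters.
    if len(metadata) > 16:
        raise ValueError("metadata allows at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long: {key!r}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value too long for key {key!r}")
    return metadata
```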
POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-03-01-preview
When a run has the status "requires_action" and required_action.type is submit_tool_outputs, this endpoint can be used to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to which this run belongs. |
run_id | path | Yes | string | The ID of the run that requires the tool output submission. |
api-version | query | Yes | string |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
stream | boolean | If true , returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. | No | |
tool_outputs | array | A list of tools for which the outputs are being submitted. | Yes |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | runObject |
When a run has the status "requires_action" and required_action.type is submit_tool_outputs, this endpoint can be used to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.
POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-03-01-preview
{
"tool_outputs": [
{
"tool_call_id": "call_001",
"output": "70 degrees and sunny."
}
]
}
Responses: Status Code: 200
{
"body": {
"id": "run_123",
"object": "thread.run",
"created_at": 1699075592,
"assistant_id": "asst_123",
"thread_id": "thread_123",
"status": "queued",
"started_at": 1699075592,
"expires_at": 1699076192,
"cancelled_at": null,
"failed_at": null,
"completed_at": null,
"last_error": null,
"model": "gpt-4-turbo",
"instructions": null,
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
],
"metadata": {},
"usage": null,
"temperature": 1.0,
"top_p": 1.0,
"max_prompt_tokens": 1000,
"max_completion_tokens": 1000,
"truncation_strategy": {
"type": "auto",
"last_messages": null
},
"response_format": "auto",
"tool_choice": "auto"
}
}
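A typical client loop reads the pending tool calls from the run's required_action field, executes each function locally, and sends every output back in one request. The sketch below builds that single payload; build_tool_outputs and the handlers mapping (function name to local Python callable) are hypothetical helpers, not part of the API:

```python
import json

def build_tool_outputs(run, handlers):
    # Map each pending function tool call to its output string.
    # All outputs must be submitted together in a single request.
    tool_calls = run["required_action"]["submit_tool_outputs"]["tool_calls"]
    outputs = []
    for tool_call in tool_calls:
        fn = tool_call["function"]
        # Arguments arrive as a JSON-encoded string of keyword arguments.
        result = handlers[fn["name"]](**json.loads(fn["arguments"]))
        outputs.append({"tool_call_id": tool_call["id"], "output": str(result)})
    return {"tool_outputs": outputs}
```

The returned dictionary is the request body for the submit_tool_outputs POST shown above.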
POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-03-01-preview
Cancels a run that is in_progress .
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to which this run belongs. |
run_id | path | Yes | string | The ID of the run to cancel. |
api-version | query | Yes | string |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | runObject |
Cancels a run that is in_progress .
POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "run_abc123",
"object": "thread.run",
"created_at": 1699076126,
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"status": "cancelling",
"started_at": 1699076126,
"expires_at": 1699076726,
"cancelled_at": null,
"failed_at": null,
"completed_at": null,
"last_error": null,
"model": "gpt-4-turbo",
"instructions": "You summarize books.",
"tools": [
{
"type": "file_search"
}
],
"tool_resources": {
"file_search": {
"vector_store_ids": [
"vs_123"
]
}
},
"metadata": {},
"usage": null,
"temperature": 1.0,
"top_p": 1.0,
"response_format": "auto"
}
}
GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-03-01-preview
Returns a list of run steps belonging to a run.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread the run and run steps belong to. |
run_id | path | Yes | string | The ID of the run the run steps belong to. |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string Possible values: asc , desc | Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order. |
after | query | No | string | A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list. |
before | query | No | string | A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. |
api-version | query | Yes | string | |
include[] | query | No | array | A list of additional fields to include in the response. Currently the only supported value is step_details.tool_calls[*].file_search.results[*].content to fetch the file search result content. |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | listRunStepsResponse |
Returns a list of run steps belonging to a run.
GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"object": "list",
"data": [
{
"id": "step_abc123",
"object": "thread.run.step",
"created_at": 1699063291,
"run_id": "run_abc123",
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"type": "message_creation",
"status": "completed",
"cancelled_at": null,
"completed_at": 1699063291,
"expired_at": null,
"failed_at": null,
"last_error": null,
"step_details": {
"type": "message_creation",
"message_creation": {
"message_id": "msg_abc123"
}
},
"usage": {
"prompt_tokens": 123,
"completion_tokens": 456,
"total_tokens": 579
}
}
],
"first_id": "step_abc123",
"last_id": "step_abc456",
"has_more": false
}
}
GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-03-01-preview
Retrieves a run step.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
thread_id | path | Yes | string | The ID of the thread to which the run and run step belong. |
run_id | path | Yes | string | The ID of the run to which the run step belongs. |
step_id | path | Yes | string | The ID of the run step to retrieve. |
include[] | query | No | array | A list of additional fields to include in the response. Currently the only supported value is step_details.tool_calls[*].file_search.results[*].content to fetch the file search result content. |
api-version | query | Yes | string |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | runStepObject |
Retrieves a run step.
GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "step_abc123",
"object": "thread.run.step",
"created_at": 1699063291,
"run_id": "run_abc123",
"assistant_id": "asst_abc123",
"thread_id": "thread_abc123",
"type": "message_creation",
"status": "completed",
"cancelled_at": null,
"completed_at": 1699063291,
"expired_at": null,
"failed_at": null,
"last_error": null,
"step_details": {
"type": "message_creation",
"message_creation": {
"message_id": "msg_abc123"
}
},
"usage": {
"prompt_tokens": 123,
"completion_tokens": 456,
"total_tokens": 579
}
}
}
GET https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview
Returns a list of vector stores.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string Possible values: asc , desc | Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order. |
after | query | No | string | A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list. |
before | query | No | string | A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. |
api-version | query | Yes | string | api version |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | listVectorStoresResponse |
Returns a list of vector stores.
GET https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"object": "list",
"data": [
{
"id": "vs_abc123",
"object": "vector_store",
"created_at": 1699061776,
"name": "Support FAQ",
"bytes": 139920,
"file_counts": {
"in_progress": 0,
"completed": 3,
"failed": 0,
"cancelled": 0,
"total": 3
}
},
{
"id": "vs_abc456",
"object": "vector_store",
"created_at": 1699061776,
"name": "Support FAQ v2",
"bytes": 139920,
"file_counts": {
"in_progress": 0,
"completed": 3,
"failed": 0,
"cancelled": 0,
"total": 3
}
}
],
"first_id": "vs_abc123",
"last_id": "vs_abc456",
"has_more": false
}
}
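The after, last_id, and has_more fields support cursor-based pagination across the list endpoints in this reference. A generic sketch of the cursor loop, where fetch_page is an assumed callable standing in for any authenticated GET on a list endpoint (for example, GET /openai/vector_stores with the after and limit query parameters set):

```python
def paginate(fetch_page):
    # Yield every item across pages, following the `after` cursor
    # until the service reports has_more = false.
    after = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        if not page["has_more"]:
            return
        after = page["last_id"]
```

The same loop works for runs, run steps, vector stores, and vector store files, since all list responses share this shape.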
POST https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview
Create a vector store.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
api-version | query | Yes | string | api version |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunking_strategy | autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam | The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. Only applicable if file_ids is non-empty. | No | |
expires_after | vectorStoreExpirationAfter | The expiration policy for a vector store. | No | |
file_ids | array | A list of file IDs that the vector store should use. Useful for tools like file_search that can access files. | No | |
metadata | object | Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
name | string | The name of the vector store. | No |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreObject |
Creates a vector store.
POST https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "vs_abc123",
"object": "vector_store",
"created_at": 1699061776,
"name": "Support FAQ",
"bytes": 139920,
"file_counts": {
"in_progress": 0,
"completed": 3,
"failed": 0,
"cancelled": 0,
"total": 3
}
}
}
GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview
Retrieves a vector store.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store to retrieve. |
api-version | query | Yes | string | api version |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreObject |
Retrieves a vector store.
GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "vs_abc123",
"object": "vector_store",
"created_at": 1699061776
}
}
POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview
Modifies a vector store.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store to modify. |
api-version | query | Yes | string | api version |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
expires_after | vectorStoreExpirationAfter | The expiration policy for a vector store. | No | |
metadata | object | Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
name | string | The name of the vector store. | No |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreObject |
Modifies a vector store.
POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview
{
"name": "Support FAQ"
}
Responses: Status Code: 200
{
"body": {
"id": "vs_abc123",
"object": "vector_store",
"created_at": 1699061776,
"name": "Support FAQ",
"bytes": 139920,
"file_counts": {
"in_progress": 0,
"completed": 3,
"failed": 0,
"cancelled": 0,
"total": 3
}
}
}
DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview
Delete a vector store.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store to delete. |
api-version | query | Yes | string | api version |
Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | deleteVectorStoreResponse |
Deletes a vector store.
DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "vs_abc123",
"object": "vector_store.deleted",
"deleted": true
}
}
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview
Returns a list of vector store files.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store that the files belong to. |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string Possible values: asc , desc | Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order. |
after | query | No | string | A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list. |
before | query | No | string | A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. |
filter | query | No | string Possible values: in_progress , completed , failed , cancelled | Filter by file status. One of in_progress , completed , failed , cancelled . |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | listVectorStoreFilesResponse |
Returns a list of vector store files.
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"object": "list",
"data": [
{
"id": "file-abc123",
"object": "vector_store.file",
"created_at": 1699061776,
"vector_store_id": "vs_abc123"
},
{
"id": "file-abc456",
"object": "vector_store.file",
"created_at": 1699061776,
"vector_store_id": "vs_abc123"
}
],
"first_id": "file-abc123",
"last_id": "file-abc456",
"has_more": false
}
}
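The `after` cursor and `has_more` flag described above drive pagination. A sketch of collecting every page, where `fetch_page` stands in for whatever HTTP call returns the parsed `listVectorStoreFilesResponse` body:

```python
def list_all_files(fetch_page, vector_store_id, limit=20):
    """Follow the `after` cursor (seeded from last_id) until has_more is false."""
    files, after = [], None
    while True:
        page = fetch_page(vector_store_id, limit=limit, after=after)
        files.extend(page["data"])
        if not page.get("has_more"):
            return files
        after = page["last_id"]  # resume after the last object in this page
```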
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview
Create a vector store file by attaching a file to a vector store.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store for which to create a File. |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunking_strategy | chunkingStrategyRequestParam | The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. | No | |
file_id | string | A File ID that the vector store should use. Useful for tools like file_search that can access files. | Yes | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreFileObject |
Create a vector store file by attaching a File to a vector store.
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview
{
"file_id": "file-abc123"
}
Responses: Status Code: 200
{
"body": {
"id": "file-abc123",
"object": "vector_store.file",
"created_at": 1699061776,
"usage_bytes": 1234,
"vector_store_id": "vs_abcd",
"status": "completed",
"last_error": null
}
}
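The request body above takes a required `file_id` and an optional `chunking_strategy`; when the strategy is omitted, the service falls back to the auto strategy. A sketch of assembling that body (the static chunking numbers are illustrative only, not recommendations):

```python
import json

def attach_file_body(file_id, chunking_strategy=None):
    """JSON body for POST /vector_stores/{id}/files."""
    body = {"file_id": file_id}
    if chunking_strategy is not None:
        # Omitting this field lets the service pick the auto strategy.
        body["chunking_strategy"] = chunking_strategy
    return json.dumps(body)

# Minimal body, matching the example request above.
default_body = attach_file_body("file-abc123")
```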
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview
Retrieves a vector store file.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store that the file belongs to. |
file_id | path | Yes | string | The ID of the file being retrieved. |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreFileObject |
Retrieves a vector store file.
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "file-abc123",
"object": "vector_store.file",
"created_at": 1699061776,
"vector_store_id": "vs_abcd",
"status": "completed",
"last_error": null
}
}
DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview
Delete a vector store file. This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store that the file belongs to. |
file_id | path | Yes | string | The ID of the file to delete. |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | deleteVectorStoreFileResponse |
Delete a vector store file. This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.
DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "file_abc123",
"object": "vector_store.file.deleted",
"deleted": true
}
}
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview
Update attributes on a vector store file.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store the file belongs to. |
file_id | path | Yes | string | The ID of the file whose attributes are to be updated. |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
attributes | VectorStoreFileAttributes | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers. | Yes | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreFileObject |
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}/content?api-version=2025-03-01-preview
Retrieve the parsed contents of a vector store file.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store. |
file_id | path | Yes | string | The ID of the file within the vector store. |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | VectorStoreFileContentResponse |
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/search?api-version=2025-03-01-preview
Search a vector store for relevant chunks based on a query and file attributes filter.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store to search. |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
filters | ComparisonFilter or CompoundFilter | A filter to apply based on file attributes. | No | |
max_num_results | integer | The maximum number of results to return. This number should be between 1 and 50 inclusive. | No | 10 |
query | string or array | A query string for a search | Yes | |
ranking_options | object | Ranking options for search. | No | |
└─ ranker | enum | Possible values: auto , default-2024-11-15 | No | |
└─ score_threshold | number | | No | 0 |
rewrite_query | boolean | Whether to rewrite the natural language query for vector search. | No | False |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | VectorStoreSearchResultsPage |
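The search parameters above can be assembled into a request body as follows. This is a sketch; the attribute filter shown in the usage note is a hypothetical example, and the 1-50 bound on `max_num_results` comes from the table.

```python
import json

def search_body(query, max_num_results=10, filters=None, rewrite_query=False):
    """JSON body for POST /vector_stores/{id}/search."""
    if not 1 <= max_num_results <= 50:
        raise ValueError("max_num_results must be between 1 and 50 inclusive")
    body = {"query": query,
            "max_num_results": max_num_results,
            "rewrite_query": rewrite_query}
    if filters is not None:
        # A ComparisonFilter or CompoundFilter over file attributes.
        body["filters"] = filters
    return json.dumps(body)
```

For example, `search_body("How do I reset my password?", filters={"type": "eq", "key": "category", "value": "faq"})` would restrict results to files whose hypothetical `category` attribute equals `faq`.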
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-03-01-preview
Create a vector store file batch.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store for which to create a File Batch. |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunking_strategy | chunkingStrategyRequestParam | The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. | No | |
file_ids | array | A list of File IDs that the vector store should use. Useful for tools like file_search that can access files. | Yes | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreFileBatchObject |
Create a vector store file batch.
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-03-01-preview
{
"file_ids": [
"file-abc123",
"file-abc456"
]
}
Responses: Status Code: 200
{
"id": "vsfb_abc123",
"object": "vector_store.file_batch",
"created_at": 1699061776,
"vector_store_id": "vs_abc123",
"status": "in_progress",
"file_counts": {
"in_progress": 1,
"completed": 1,
"failed": 0,
"cancelled": 0,
"total": 0
}
}
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-03-01-preview
Retrieves a vector store file batch.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store that the file batch belongs to. |
batch_id | path | Yes | string | The ID of the file batch being retrieved. |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreFileBatchObject |
Retrieves a vector store file batch.
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "vsfb_abc123",
"object": "vector_store.file_batch",
"created_at": 1699061776,
"vector_store_id": "vs_abc123",
"status": "in_progress",
"file_counts": {
"in_progress": 1,
"completed": 1,
"failed": 0,
"cancelled": 0,
"total": 0
}
}
}
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-03-01-preview
Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store that the file batch belongs to. |
batch_id | path | Yes | string | The ID of the file batch to cancel. |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | vectorStoreFileBatchObject |
Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.
POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"id": "vsfb_abc123",
"object": "vector_store.file_batch",
"created_at": 1699061776,
"vector_store_id": "vs_abc123",
"status": "cancelling",
"file_counts": {
"in_progress": 12,
"completed": 3,
"failed": 0,
"cancelled": 0,
"total": 15
}
}
}
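Because cancellation is asynchronous (the example response reports `cancelling`, not `cancelled`), a client typically polls the retrieve endpoint until the batch reaches a terminal status. A sketch, where `fetch_batch` stands in for the GET `file_batches/{batch_id}` call:

```python
import time

TERMINAL_STATUSES = {"completed", "cancelled", "failed"}

def wait_for_batch(fetch_batch, batch_id, interval=1.0, max_polls=60):
    """Poll until the batch is terminal; returns the last batch object seen."""
    batch = fetch_batch(batch_id)
    for _ in range(max_polls):
        if batch["status"] in TERMINAL_STATUSES:
            break
        time.sleep(interval)
        batch = fetch_batch(batch_id)
    return batch
```

The poll interval and cap are illustrative; tune them to your workload.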
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-03-01-preview
Returns a list of vector store files in a batch.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
vector_store_id | path | Yes | string | The ID of the vector store that the files belong to. |
batch_id | path | Yes | string | The ID of the file batch that the files belong to. |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string Possible values: asc , desc | Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order. |
after | query | No | string | A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list. |
before | query | No | string | A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. |
filter | query | No | string Possible values: in_progress , completed , failed , cancelled | Filter by file status. One of in_progress , completed , failed , cancelled . |
api-version | query | Yes | string | api version |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | listVectorStoreFilesResponse |
Returns a list of vector store files in a batch.
GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-03-01-preview
Responses: Status Code: 200
{
"body": {
"object": "list",
"data": [
{
"id": "file-abc123",
"object": "vector_store.file",
"created_at": 1699061776,
"vector_store_id": "vs_abc123"
},
{
"id": "file-abc456",
"object": "vector_store.file",
"created_at": 1699061776,
"vector_store_id": "vs_abc123"
}
],
"first_id": "file-abc123",
"last_id": "file-abc456",
"has_more": false
}
}
POST https://{endpoint}/openai/responses?api-version=2025-03-01-preview
Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code or use built-in tools like file search to use your own data as input for the model's response.
Content-Type: application/json
Name | Type | Description | Required | Default |
---|---|---|---|---|
include | array | | No | |
input | string or array | Text, image, or file inputs to the model, used to generate a response. Learn more: - Text inputs and outputs - Image inputs - File inputs - Conversation state - Function calling | Yes | |
instructions | string | Inserts a system (or developer) message as the first item in the model's context. When used along with previous_response_id , the instructions from a previous response will not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses. | No | |
max_output_tokens | integer | An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens. | No | |
parallel_tool_calls | boolean | Whether to allow the model to run tool calls in parallel. | No | True |
previous_response_id | string | The unique ID of the previous response to the model. Use this to create multi-turn conversations. Learn more about Conversation state. | No | |
reasoning | Reasoning | o-series models only. Configuration options for reasoning models. | No | |
store | boolean | Whether to store the generated model response for later retrieval via API. | No | True |
stream | boolean | If set to true, the model response data is streamed to the client as it's generated, using server-sent events. | No | False |
text | object | Configuration options for a text response from the model. Can be plain text or structured JSON data. Learn more: - Text inputs and outputs - Structured Outputs | No | |
└─ format | TextResponseFormatConfiguration | An object specifying the format that the model must output. Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is { "type": "text" } with no additional options. Not recommended for gpt-4o and newer models: setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it. | No | |
tool_choice | ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction | How the model should select which tool (or tools) to use when generating a response. See the tools parameter to see how to specify which tools the model can call. | No | |
tools | array | An array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter. The two categories of tools you can provide the model are: - Built-in tools: tools provided by OpenAI that extend the model's capabilities, such as file search. - Function calls (custom tools): functions that are defined by you, enabling the model to call your own code. Learn more about function calling. | No | |
truncation | enum | The truncation strategy to use for the model response. - auto : if the context of this response and previous ones exceeds the model's context window size, the model truncates the response to fit the context window by dropping input items in the middle of the conversation. - disabled (default): if a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: auto , disabled | No | |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | response | |
text/event-stream | responseStreamEvent |
Status Code: default
Description: Service unavailable
Content-Type | Type | Description |
---|---|---|
application/json | errorResponse |
GET https://{endpoint}/openai/responses/{response_id}?api-version=2025-03-01-preview
Retrieves a model response with the given ID.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
response_id | path | Yes | string | The ID of the response to retrieve. |
include | query | No | array | Additional fields to include in the response. See the include parameter for Response creation above for more information. |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | response |
Status Code: default
Description: Service unavailable
Content-Type | Type | Description |
---|---|---|
application/json | errorResponse |
DELETE https://{endpoint}/openai/responses/{response_id}?api-version=2025-03-01-preview
Deletes a model response with the given ID.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
response_id | path | Yes | string | The ID of the response to delete. |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Status Code: 404
Description: Not Found
Content-Type | Type | Description |
---|---|---|
application/json | error |
Status Code: default
Description: Service unavailable
Content-Type | Type | Description |
---|---|---|
application/json | errorResponse |
GET https://{endpoint}/openai/responses/{response_id}/input_items?api-version=2025-03-01-preview
Returns a list of input items for a given response.
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | Yes | string url |
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com . Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
response_id | path | Yes | string | The ID of the response to retrieve input items for. |
limit | query | No | integer | A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. |
order | query | No | string Possible values: asc , desc | The order to return the input items in. Default is asc . - asc : return the input items in ascending order. - desc : return the input items in descending order. |
after | query | No | string | An item ID to list items after, used in pagination. |
before | query | No | string | An item ID to list items before, used in pagination. |
Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.
Name | Required | Type | Description |
---|---|---|---|
Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token} To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com Type: oauth2 Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize scope: https://cognitiveservices.azure.com/.default |
api-key | True | string | Provide Azure OpenAI API key here |
Status Code: 200
Description: OK
Content-Type | Type | Description |
---|---|---|
application/json | responseItemList |
Name | Type | Description | Required | Default |
---|---|---|---|---|
error | error | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | string | No | ||
message | string | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
inner_error | innerError | Inner error with additional details. | No | |
param | string | No | ||
type | string | No |
Inner error with additional details.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | innerErrorCode | Error codes for the inner error object. | No | |
content_filter_results | contentFilterPromptResults | Information about the content filtering category (hate, sexual, violence, self_harm): whether it has been detected, the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content), and whether it has been filtered. Also includes information about jailbreak content and profanity (whether detected and whether filtered) and about the customer block list (whether it was filtered and its ID). | No |
Error codes for the inner error object.
Property | Value |
---|---|
Description | Error codes for the inner error object. |
Type | string |
Values | ResponsibleAIPolicyViolation |
Name | Type | Description | Required | Default |
---|---|---|---|---|
error | dalleError | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
inner_error | dalleInnerError | Inner error with additional details. | No | |
param | string | No | ||
type | string | No |
Inner error with additional details.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | innerErrorCode | Error codes for the inner error object. | No | |
content_filter_results | dalleFilterResults | Information about the content filtering category (hate, sexual, violence, self_harm): whether it has been detected, the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content), and whether it has been filtered. Also includes information about jailbreak content and profanity (whether detected and whether filtered) and about the customer block list (whether it was filtered and its ID). | No |
revised_prompt | string | The prompt that was used to generate the image, if there was any revision to the prompt. | No |
Describes a span within generated completion text. Offset 0 is the first UTF32 code point of the completion text.
Name | Type | Description | Required | Default |
---|---|---|---|---|
completion_end_offset | integer | Offset of the first UTF32 code point which is excluded from the span. This field is always equal to completion_start_offset for empty spans. This field is always larger than completion_start_offset for non-empty spans. | Yes | |
completion_start_offset | integer | Offset of the UTF32 code point which begins the span. | Yes |
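Because Python string indices also count Unicode code points, a span maps directly to a slice; a small sketch of interpreting the two offsets above (the sample text and span are illustrative):

```python
# Extract the text a completion span refers to. Offsets count UTF-32 code
# points; completion_end_offset is exclusive, so [start:end] slicing applies.

def extract_span(completion_text: str, span: dict) -> str:
    return completion_text[span["completion_start_offset"]:
                           span["completion_end_offset"]]

text = "The capital of France is Paris."
span = {"completion_start_offset": 25, "completion_end_offset": 30}
extract_span(text, span)  # the five code points "Paris"
```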
Name | Type | Description | Required | Default |
---|---|---|---|---|
filtered | boolean | Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
filtered | boolean | Yes | ||
severity | string | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
detected | boolean | No | ||
filtered | boolean | Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
citation | object | No | ||
└─ URL | string | No | ||
└─ license | string | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
details | array | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
filtered | boolean | Yes | ||
id | string | No |
Information about the content filtering results.
Name | Type | Description | Required | Default |
---|---|---|---|---|
custom_blocklists | contentFilterDetailedResults | Content filtering results with a detail of content filter ids for the filtered segments. | No | |
error | errorBase | No | ||
hate | contentFilterSeverityResult | No | ||
profanity | contentFilterDetectedResult | No | ||
self_harm | contentFilterSeverityResult | No | ||
sexual | contentFilterSeverityResult | No | ||
violence | contentFilterSeverityResult | No |
Information about the content filtering category (hate, sexual, violence, self_harm): whether it has been detected, the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content), and whether it has been filtered. Also includes information about jailbreak content and profanity (whether detected and whether filtered) and about the customer block list (whether it was filtered and its ID).
Name | Type | Description | Required | Default |
---|---|---|---|---|
custom_blocklists | contentFilterDetailedResults | Content filtering results with a detail of content filter ids for the filtered segments. | No | |
error | errorBase | No | ||
hate | contentFilterSeverityResult | No | ||
indirect_attack | contentFilterDetectedResult | No | ||
jailbreak | contentFilterDetectedResult | No | ||
profanity | contentFilterDetectedResult | No | ||
self_harm | contentFilterSeverityResult | No | ||
sexual | contentFilterSeverityResult | No | ||
violence | contentFilterSeverityResult | No |
Information about the content filtering category (hate, sexual, violence, self_harm): whether it has been detected, the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content), and whether it has been filtered. Also includes information about third-party text and profanity (whether detected and whether filtered) and about the customer block list (whether it was filtered and its ID).
Name | Type | Description | Required | Default |
---|---|---|---|---|
custom_blocklists | contentFilterDetailedResults | Content filtering results with a detail of content filter ids for the filtered segments. | No | |
error | errorBase | No | ||
hate | contentFilterSeverityResult | No | ||
profanity | contentFilterDetectedResult | No | ||
protected_material_code | contentFilterDetectedWithCitationResult | No | ||
protected_material_text | contentFilterDetectedResult | No | ||
self_harm | contentFilterSeverityResult | No | ||
sexual | contentFilterSeverityResult | No | ||
ungrounded_material | contentFilterDetectedWithCompletionTextSpansResult | No | ||
violence | contentFilterSeverityResult | No |
Content filtering results with a detail of content filter ids for the filtered segments.
Name | Type | Description | Required | Default |
---|---|---|---|---|
details | array | No | ||
filtered | boolean | Yes |
Content filtering results for a single prompt in the request.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_filter_results | contentFilterPromptResults | Information about the content filtering category (hate, sexual, violence, self_harm): whether it has been detected, the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content), and whether it has been filtered. Also includes information about jailbreak content and profanity (whether detected and whether filtered) and about the customer block list (whether it was filtered and its ID). | No |
prompt_index | integer | No |
Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.
No properties defined for this component.
Information about the content filtering results.
Name | Type | Description | Required | Default |
---|---|---|---|---|
hate | contentFilterSeverityResult | No | ||
self_harm | contentFilterSeverityResult | No | ||
sexual | contentFilterSeverityResult | No | ||
violence | contentFilterSeverityResult | No |
Information about the content filtering category (hate, sexual, violence, self_harm): whether it has been detected, the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content), and whether it has been filtered. Also includes information about jailbreak content and profanity (whether detected and whether filtered) and about the customer block list (whether it was filtered and its ID).
Name | Type | Description | Required | Default |
---|---|---|---|---|
custom_blocklists | contentFilterDetailedResults | Content filtering results with a detail of content filter ids for the filtered segments. | No | |
hate | contentFilterSeverityResult | No | ||
jailbreak | contentFilterDetectedResult | No | ||
profanity | contentFilterDetectedResult | No | ||
self_harm | contentFilterSeverityResult | No | ||
sexual | contentFilterSeverityResult | No | ||
violence | contentFilterSeverityResult | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | No | 0 |
logit_bias | object | Modify the likelihood of specified tokens appearing in the completion. Accepts a json object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. | No | |
max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is currently only supported in o1 series models; support will be expanded to other models in a future API release. | No |
max_tokens | integer | The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return will be (4096 - prompt tokens). This is not compatible with o1 series models. | No | 4096 |
metadata | object | Developer-defined tags and values used for filtering completions in the stored completions dashboard. | No | |
presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | No | 0 |
stop | string or array | Up to 4 sequences where the API will stop generating further tokens. | No | |
store | boolean | Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. | No | |
stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. | No | False |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
user | string | A unique identifier representing your end-user, which can help Azure OpenAI to monitor and detect abuse. | No |
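The logit_bias description above says the bias is added to the logits before sampling, with -100 acting as a ban. A small sketch of that arithmetic (the token IDs and logit values here are made up for illustration):

```python
# Demonstrate how a logit_bias entry reshapes the sampling distribution:
# bias is added to each token's logit before the softmax.
import math

def softmax(logits):
    m = max(logits.values())  # subtract max for numerical stability
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    total = sum(exps.values())
    return {t: v / total for t, v in exps.items()}

logits = {"50256": 2.0, "1234": 1.5, "9999": 1.0}  # hypothetical model logits
bias = {"50256": -100}  # e.g. suppress the <|endoftext|> token
biased = {t: v + bias.get(t, 0) for t, v in logits.items()}
probs = softmax(biased)
# the banned token's probability collapses to effectively zero
```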
Name | Type | Description | Required | Default |
---|---|---|---|---|
best_of | integer | Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results cannot be streamed. When used with n, best_of controls the number of candidate completions and n specifies how many to return; best_of must be greater than n. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop. | No | 1 |
echo | boolean | Echo back the prompt in addition to the completion. | No | False |
frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | No | 0 |
logit_bias | object | Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated. | No | None |
logprobs | integer | Include the log probabilities on the logprobs most likely output tokens, as well as the chosen tokens. For example, if logprobs is 5, the API will return a list of the 5 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response. The maximum value for logprobs is 5. | No | None |
max_tokens | integer | The maximum number of tokens that can be generated in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length. Example Python code for counting tokens. | No | 16 |
n | integer | How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop. | No | 1 |
presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | No | 0 |
prompt | string or array | The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document. | Yes | |
seed | integer | If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. | No | |
stop | string or array | Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. | No | |
stream | boolean | Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Example Python code. | No | False |
suffix | string | The suffix that comes after a completion of inserted text. This parameter is only supported for gpt-3.5-turbo-instruct. | No | None |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse. | No | |
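Putting a few of the parameters above together, a minimal completions request body might look like the following sketch (the prompt, stop sequence, and limits are illustrative values, not defaults):

```python
# A minimal completions request body built from the documented parameters.
import json

body = {
    "prompt": "Write a haiku about the sea.",
    "max_tokens": 60,        # raise from the default of 16
    "temperature": 0.7,      # alter this OR top_p, not both
    "n": 1,                  # one completion per prompt keeps costs down
    "stop": ["\n\n"],        # stop before a blank line (up to 4 sequences)
    "stream": False,
}
payload = json.dumps(body)   # serialized request body
```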
Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).
Name | Type | Description | Required | Default |
---|---|---|---|---|
choices | array | The list of completion choices the model generated for the input prompt. | Yes | |
created | integer | The Unix timestamp (in seconds) of when the completion was created. | Yes | |
id | string | A unique identifier for the completion. | Yes | |
model | string | The model used for completion. | Yes | |
object | enum | The object type, which is always "text_completion". Possible values: text_completion | Yes | |
prompt_filter_results | promptFilterResults | Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders. | No | |
system_fingerprint | string | This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism. | No | |
usage | completionUsage | Usage statistics for the completion request. | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
audio | object | Parameters for audio output. Required when audio output is requested with modalities: ["audio"]. | No | |
└─ format | enum | Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16. Possible values: wav, mp3, flac, opus, pcm16 | No | |
└─ voice | enum | Specifies the voice type. Supported voices are alloy, echo, fable, onyx, nova, and shimmer. Possible values: alloy, echo, fable, onyx, nova, shimmer | No | |
data_sources | array | The configuration entries for Azure OpenAI chat extensions that use them. This additional specification is only compatible with Azure OpenAI. | No | |
frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | No | 0 |
function_call | string or chatCompletionFunctionCallOption | Deprecated in favor of tool_choice. Controls which (if any) function is called by the model. none means the model will not call a function and instead generates a message. auto means the model can pick between generating a message or calling a function. Specifying a particular function via {"name": "my_function"} forces the model to call that function. none is the default when no functions are present; auto is the default if functions are present. | No | |
functions | array | Deprecated in favor of tools. A list of functions the model may generate JSON inputs for. | No | |
logit_bias | object | Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. | No | None |
logprobs | boolean | Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. | No | False |
max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is currently only supported in o1 series models; support will be expanded to other models in a future API release. | No | |
max_tokens | integer | The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Example Python code for counting tokens. | No | |
messages | array | A list of messages comprising the conversation so far. Example Python code. | Yes | |
metadata | object | Developer-defined tags and values used for filtering completions in the stored completions dashboard. | No | |
modalities | ChatCompletionModalities | Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default: ["text"]. The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use: ["text", "audio"] | No | |
n | integer | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. | No | 1 |
parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
prediction | PredictionContent | Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. | No | |
presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | No | 0 |
reasoning_effort | enum | o1 models only. Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Possible values: low, medium, high | No | |
response_format | ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema | An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which guarantees the model will match your supplied JSON schema. Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
seed | integer | This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend. | No | |
stop | string or array | Up to 4 sequences where the API will stop generating further tokens. | No | |
store | boolean | Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. | No | |
stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Example Python code. | No | False |
stream_options | chatCompletionStreamOptions | Options for streaming response. Only set this when you set stream: true. | No | None |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
tool_choice | chatCompletionToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present. | No | |
tools | array | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported. | No | |
top_logprobs | integer | An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used. | No | |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse. | No | |
user_security_context | userSecurityContext | User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud. | No | |
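The response_format notes above warn that JSON mode also requires instructing the model to emit JSON in the prompt itself. A sketch of a request body that does both (message content is illustrative):

```python
# A chat completions request body that enables JSON mode. The system
# message must itself ask for JSON, per the response_format guidance.
body = {
    "messages": [
        {"role": "system",
         "content": "You are a helpful assistant. Reply only with valid JSON."},
        {"role": "user", "content": "List three primary colors."},
    ],
    "response_format": {"type": "json_object"},  # model output must be valid JSON
    "temperature": 0.2,  # lower temperature for more deterministic structure
}
```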
User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud.
Name | Type | Description | Required | Default |
---|---|---|---|---|
application_name | string | The name of the application. Sensitive personal information should not be included in this field. | No | |
end_user_id | string | This identifier is the Microsoft Entra ID (formerly Azure Active Directory) user object ID used to authenticate end-users within the generative AI application. Sensitive personal information should not be included in this field. | No | |
end_user_tenant_id | string | The Microsoft 365 tenant ID the end user belongs to. It's required when the generative AI application is multitenant. | No |
source_ip | string | Captures the original client's IP address, accepting both IPv4 and IPv6 formats. | No |
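A sketch of attaching the security context fields above to a chat completions request (all values are placeholders; the GUIDs are not real identifiers):

```python
# A request body carrying a user_security_context, built from the
# documented fields: application_name, end_user_id, end_user_tenant_id,
# source_ip. Avoid sensitive personal information in these fields.
request_body = {
    "messages": [{"role": "user", "content": "Hello"}],
    "user_security_context": {
        "application_name": "contoso-support-bot",
        "end_user_id": "00000000-0000-0000-0000-000000000000",      # Entra object ID
        "end_user_tenant_id": "11111111-1111-1111-1111-111111111111",
        "source_ip": "203.0.113.7",  # IPv4 or IPv6 accepted
    },
}
```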
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | A description of what the function does, used by the model to choose when and how to call the function. | No | |
name | string | The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. | Yes | |
parameters | FunctionParameters | The parameters the function accepts, described as a JSON Schema object. See the [guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format. Omitting parameters defines a function with an empty parameter list. | No | |
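Combining the fields above, a function definition might look like the following sketch (the weather function and its schema are hypothetical examples, not part of the API):

```python
# A function definition with a JSON Schema parameters object, as the
# table above describes: name, description, and parameters.
get_weather = {
    "name": "get_current_weather",  # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}
```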
Specifying a particular function via {"name": "my_function"} forces the model to call that function.
Name | Type | Description | Required | Default |
---|---|---|---|---|
name | string | The name of the function to call. | Yes |
The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
No properties defined for this component.
This component can be one of the following:
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
With o1 models and newer, developer messages replace the previous system messages.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string or array | The contents of the developer message. | Yes | |
name | string | An optional name for the participant. Provides the model information to differentiate between participants of the same role. | No | |
role | enum | The role of the messages author, in this case developer. Possible values: developer | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string or array | The contents of the system message. | Yes | |
name | string | An optional name for the participant. Provides the model information to differentiate between participants of the same role. | No | |
role | enum | The role of the messages author, in this case system. Possible values: system | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string or array | The contents of the user message. | Yes | |
name | string | An optional name for the participant. Provides the model information to differentiate between participants of the same role. | No | |
role | enum | The role of the messages author, in this case user. Possible values: user | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string or array | The contents of the assistant message. Required unless tool_calls or function_call is specified. | No | |
function_call | object | Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. | No | |
└─ arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | No | |
└─ name | string | The name of the function to call. | No | |
name | string | An optional name for the participant. Provides the model information to differentiate between participants of the same role. | No | |
refusal | string | The refusal message by the assistant. | No | |
role | enum | The role of the messages author, in this case assistant. Possible values: assistant | Yes | |
tool_calls | chatCompletionMessageToolCalls | The tool calls generated by the model, such as function calls. | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string or array | The contents of the tool message. | Yes | |
role | enum | The role of the messages author, in this case tool. Possible values: tool | Yes | |
tool_call_id | string | Tool call that this message is responding to. | Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string | The contents of the function message. | Yes | |
name | string | The name of the function to call. | Yes | |
role | enum | The role of the messages author, in this case function. Possible values: function | Yes | |
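The message schemas above fit together in a tool-call round trip; a sketch with illustrative values (the tool name, call ID, and payloads are made up):

```python
# A tool-call round trip expressed as the message shapes documented above:
# user asks, assistant emits a tool call, tool message answers that call.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {   # assistant message carrying a tool call instead of text content
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "get_current_weather",
                         "arguments": '{"city": "Paris"}'},
        }],
    },
    {   # tool message: tool_call_id links it to the assistant's call
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": '{"temperature_c": 18, "condition": "cloudy"}',
    },
]
```

The resulting list is then sent back in `messages` so the model can compose its final answer from the tool output.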
This component can be one of the following:
This component can be one of the following:
This component can be one of the following:
This component can be one of the following:
This component can be one of the following:
Name | Type | Description | Required | Default |
---|---|---|---|---|
text | string | The text content. | Yes | |
type | enum | The type of the content part. Possible values: text | Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
input_audio | object | | Yes | |
└─ data | string | Base64 encoded audio data. | No | |
└─ format | enum | The format of the encoded audio data. Currently supports "wav" and "mp3". Possible values: wav, mp3 | No | |
type | enum | The type of the content part. Always input_audio. Possible values: input_audio | Yes |
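Building an input_audio content part from raw bytes can be sketched as follows; the helper name and the placeholder bytes are assumptions for illustration, not part of the API.

```python
import base64

def make_audio_part(wav_bytes: bytes) -> dict:
    """Wrap raw WAV bytes as an input_audio content part (sketch)."""
    return {
        "type": "input_audio",
        "input_audio": {
            # data must be base64 encoded per the table above
            "data": base64.b64encode(wav_bytes).decode("ascii"),
            "format": "wav",  # "wav" or "mp3"
        },
    }

# Placeholder bytes standing in for a real WAV file's contents.
part = make_audio_part(b"RIFF....WAVEfmt ")
```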
Name | Type | Description | Required | Default |
---|---|---|---|---|
image_url | object | | Yes | |
└─ detail | enum | Specifies the detail level of the image. Learn more in the Vision guide. Possible values: auto, low, high | No | |
└─ url | string | Either a URL of the image or the base64 encoded image data. | No | |
type | enum | The type of the content part. Possible values: image_url | Yes |
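A user message can mix text and image_url content parts in one content array. This is a hand-built sketch; the URL is a placeholder (a base64 data URL would also be accepted per the url field above).

```python
# A multimodal user message: one text part plus one image_url part.
user_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is shown in this image?"},
        {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/photo.png",  # placeholder URL
                "detail": "auto",  # auto, low, or high
            },
        },
    ],
}
```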
Name | Type | Description | Required | Default |
---|---|---|---|---|
refusal | string | The refusal message generated by the model. | Yes | |
type | enum | The type of the content part. Possible values: refusal | Yes |
A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.
This component uses the property type to discriminate between different types:
Type Value | Schema
---|---
azure_search | azureSearchChatExtensionConfiguration
azure_cosmos_db | azureCosmosDBChatExtensionConfiguration
elasticsearch | elasticsearchChatExtensionConfiguration
mongo_db | mongoDBChatExtensionConfiguration
pinecone | pineconeChatExtensionConfiguration
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | azureChatExtensionType | A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI. | Yes |
A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.
Property | Value |
---|---|
Description | A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI. |
Type | string |
Values | azure_search, azure_cosmos_db, elasticsearch, mongo_db, pinecone |
A specific representation of configurable options for Azure Search when using it as an Azure OpenAI chat extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
parameters | azureSearchChatExtensionParameters | Parameters for Azure Search when used as an Azure OpenAI chat extension. | No | |
type | azureChatExtensionType | A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI. | Yes |
Parameters for Azure Search when used as an Azure OpenAI chat extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
allow_partial_result | boolean | If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified or set to false, the request fails if any search query fails. | No | False |
authentication | onYourDataApiKeyAuthenticationOptions or onYourDataSystemAssignedManagedIdentityAuthenticationOptions or onYourDataUserAssignedManagedIdentityAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions | | Yes | |
embedding_dependency | onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataIntegratedVectorizationSource | | No | |
endpoint | string | The absolute endpoint path for the Azure Search resource to use. | Yes | |
fields_mapping | azureSearchIndexFieldMappingOptions | Optional settings to control how fields are processed when using a configured Azure Search resource. | No | |
filter | string | Search filter. | No | |
in_scope | boolean | Whether queries should be restricted to use of indexed data. | No | |
include_contexts | array | The included properties of the output context. If not specified, the default value is citations and intent. | No | |
index_name | string | The name of the index to use as available in the referenced Azure Search resource. | Yes | |
max_search_queries | integer | The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. | No | |
query_type | azureSearchQueryType | The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension. | No | |
semantic_configuration | string | The additional semantic configuration for the query. | No | |
strictness | integer | The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. | No | |
top_n_documents | integer | The configured top number of documents to feature for the configured query. | No |
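The parameters above can be assembled into a single azure_search entry for a request's data sources array. This is a minimal sketch under assumptions: the endpoint, index name, and embedding deployment name are placeholders, and managed identity authentication is chosen only for illustration.

```python
# A sketch of one azure_search data source entry, built from the
# parameters table above. All resource names are placeholders.
azure_search_source = {
    "type": "azure_search",
    "parameters": {
        "endpoint": "https://example-search.search.windows.net",  # placeholder
        "index_name": "my-index",                                 # placeholder
        "authentication": {"type": "system_assigned_managed_identity"},
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "text-embedding-3-large",  # placeholder deployment
        },
        "query_type": "vector_semantic_hybrid",
        "in_scope": True,        # restrict answers to indexed data
        "top_n_documents": 5,
        "strictness": 3,
    },
}
```

A chat completions request would carry one or more such entries; the authentication and embedding_dependency objects follow the On Your Data schemas described later in this section.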
Optional settings to control how fields are processed when using a configured Azure Search resource.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_fields | array | The names of index fields that should be treated as content. | No | |
content_fields_separator | string | The separator pattern that content fields should use. | No | |
filepath_field | string | The name of the index field to use as a filepath. | No | |
image_vector_fields | array | The names of fields that represent image vector data. | No | |
title_field | string | The name of the index field to use as a title. | No | |
url_field | string | The name of the index field to use as a URL. | No | |
vector_fields | array | The names of fields that represent vector data. | No |
The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.
Property | Value |
---|---|
Description | The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension. |
Type | string |
Values | simple, semantic, vector, vector_simple_hybrid, vector_semantic_hybrid |
A specific representation of configurable options for Azure Cosmos DB when using it as an Azure OpenAI chat extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
parameters | azureCosmosDBChatExtensionParameters | Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore. | No | |
type | azureChatExtensionType | A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI. | Yes |
Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore.
Name | Type | Description | Required | Default |
---|---|---|---|---|
allow_partial_result | boolean | If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified or set to false, the request fails if any search query fails. | No | False |
authentication | onYourDataConnectionStringAuthenticationOptions | The authentication options for Azure OpenAI On Your Data when using a connection string. | Yes | |
container_name | string | The name of the Azure Cosmos DB resource container. | Yes | |
database_name | string | The MongoDB vCore database name to use with Azure Cosmos DB. | Yes | |
embedding_dependency | onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource | | Yes | |
fields_mapping | azureCosmosDBFieldMappingOptions | Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource. | Yes | |
in_scope | boolean | Whether queries should be restricted to use of indexed data. | No | |
include_contexts | array | The included properties of the output context. If not specified, the default value is citations and intent. | No | |
index_name | string | The MongoDB vCore index name to use with Azure Cosmos DB. | Yes | |
max_search_queries | integer | The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. | No | |
strictness | integer | The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. | No | |
top_n_documents | integer | The configured top number of documents to feature for the configured query. | No |
Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_fields | array | The names of index fields that should be treated as content. | Yes | |
content_fields_separator | string | The separator pattern that content fields should use. | No | |
filepath_field | string | The name of the index field to use as a filepath. | No | |
title_field | string | The name of the index field to use as a title. | No | |
url_field | string | The name of the index field to use as a URL. | No | |
vector_fields | array | The names of fields that represent vector data. | Yes |
A specific representation of configurable options for Elasticsearch when using it as an Azure OpenAI chat extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
parameters | elasticsearchChatExtensionParameters | Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension. | No | |
type | azureChatExtensionType | A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI. | Yes |
Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
allow_partial_result | boolean | If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified or set to false, the request fails if any search query fails. | No | False |
authentication | onYourDataKeyAndKeyIdAuthenticationOptions or onYourDataEncodedApiKeyAuthenticationOptions | | Yes | |
embedding_dependency | onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataModelIdVectorizationSource | | No | |
endpoint | string | The endpoint of Elasticsearch®. | Yes | |
fields_mapping | elasticsearchIndexFieldMappingOptions | Optional settings to control how fields are processed when using a configured Elasticsearch® resource. | No | |
in_scope | boolean | Whether queries should be restricted to use of indexed data. | No | |
include_contexts | array | The included properties of the output context. If not specified, the default value is citations and intent. | No | |
index_name | string | The index name of Elasticsearch®. | Yes | |
max_search_queries | integer | The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. | No | |
query_type | elasticsearchQueryType | The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension. | No | |
strictness | integer | The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. | No | |
top_n_documents | integer | The configured top number of documents to feature for the configured query. | No |
Optional settings to control how fields are processed when using a configured Elasticsearch® resource.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_fields | array | The names of index fields that should be treated as content. | No | |
content_fields_separator | string | The separator pattern that content fields should use. | No | |
filepath_field | string | The name of the index field to use as a filepath. | No | |
title_field | string | The name of the index field to use as a title. | No | |
url_field | string | The name of the index field to use as a URL. | No | |
vector_fields | array | The names of fields that represent vector data. | No |
The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.
Property | Value |
---|---|
Description | The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension. |
Type | string |
Values | simple, vector |
A specific representation of configurable options for Mongo DB when using it as an Azure OpenAI chat extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
parameters | mongoDBChatExtensionParameters | Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB. | No | |
type | azureChatExtensionType | A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI. | Yes |
Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB.
Name | Type | Description | Required | Default |
---|---|---|---|---|
allow_partial_result | boolean | If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified or set to false, the request fails if any search query fails. | No | False |
app_name | string | The name of the Mongo DB Application. | Yes | |
authentication | onYourDataUsernameAndPasswordAuthenticationOptions | The authentication options for Azure OpenAI On Your Data when using a username and a password. | Yes | |
collection_name | string | The name of the Mongo DB Collection. | Yes | |
database_name | string | The name of the Mongo DB database. | Yes | |
embedding_dependency | onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource | | Yes | |
endpoint | string | The name of the Mongo DB cluster endpoint. | Yes | |
fields_mapping | mongoDBFieldMappingOptions | Optional settings to control how fields are processed when using a configured Mongo DB resource. | Yes | |
in_scope | boolean | Whether queries should be restricted to use of indexed data. | No | |
include_contexts | array | The included properties of the output context. If not specified, the default value is citations and intent. | No | |
index_name | string | The name of the Mongo DB index. | Yes | |
max_search_queries | integer | The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. | No | |
strictness | integer | The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. | No | |
top_n_documents | integer | The configured top number of documents to feature for the configured query. | No |
Optional settings to control how fields are processed when using a configured Mongo DB resource.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_fields | array | The names of index fields that should be treated as content. | Yes | |
content_fields_separator | string | The separator pattern that content fields should use. | No | |
filepath_field | string | The name of the index field to use as a filepath. | No | |
title_field | string | The name of the index field to use as a title. | No | |
url_field | string | The name of the index field to use as a URL. | No | |
vector_fields | array | The names of fields that represent vector data. | Yes |
A specific representation of configurable options for Pinecone when using it as an Azure OpenAI chat extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
parameters | pineconeChatExtensionParameters | Parameters for configuring Azure OpenAI Pinecone chat extensions. | No | |
type | azureChatExtensionType | A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI. | Yes |
Parameters for configuring Azure OpenAI Pinecone chat extensions.
Name | Type | Description | Required | Default |
---|---|---|---|---|
allow_partial_result | boolean | If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified or set to false, the request fails if any search query fails. | No | False |
authentication | onYourDataApiKeyAuthenticationOptions | The authentication options for Azure OpenAI On Your Data when using an API key. | Yes | |
embedding_dependency | onYourDataDeploymentNameVectorizationSource | The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource. | Yes | |
environment | string | The environment name of Pinecone. | Yes | |
fields_mapping | pineconeFieldMappingOptions | Optional settings to control how fields are processed when using a configured Pinecone resource. | Yes | |
in_scope | boolean | Whether queries should be restricted to use of indexed data. | No | |
include_contexts | array | The included properties of the output context. If not specified, the default value is citations and intent. | No | |
index_name | string | The name of the Pinecone database index. | Yes | |
max_search_queries | integer | The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. | No | |
strictness | integer | The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer. | No | |
top_n_documents | integer | The configured top number of documents to feature for the configured query. | No |
Optional settings to control how fields are processed when using a configured Pinecone resource.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_fields | array | The names of index fields that should be treated as content. | Yes | |
content_fields_separator | string | The separator pattern that content fields should use. | No | |
filepath_field | string | The name of the index field to use as a filepath. | No | |
title_field | string | The name of the index field to use as a title. | No | |
url_field | string | The name of the index field to use as a URL. | No |
The authentication options for Azure OpenAI On Your Data.
This component uses the property type to discriminate between different types:
Type Value | Schema
---|---
api_key | onYourDataApiKeyAuthenticationOptions
connection_string | onYourDataConnectionStringAuthenticationOptions
key_and_key_id | onYourDataKeyAndKeyIdAuthenticationOptions
encoded_api_key | onYourDataEncodedApiKeyAuthenticationOptions
access_token | onYourDataAccessTokenAuthenticationOptions
system_assigned_managed_identity | onYourDataSystemAssignedManagedIdentityAuthenticationOptions
user_assigned_managed_identity | onYourDataUserAssignedManagedIdentityAuthenticationOptions
username_and_password | onYourDataUsernameAndPasswordAuthenticationOptions
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The context property.
Property | Value |
---|---|
Description | The context property. |
Type | string |
Values | citations, intent, all_retrieved_documents |
The authentication types supported with Azure OpenAI On Your Data.
Property | Value |
---|---|
Description | The authentication types supported with Azure OpenAI On Your Data. |
Type | string |
Values | api_key, connection_string, key_and_key_id, encoded_api_key, access_token, system_assigned_managed_identity, user_assigned_managed_identity, username_and_password |
The authentication options for Azure OpenAI On Your Data when using an API key.
Name | Type | Description | Required | Default |
---|---|---|---|---|
key | string | The API key to use for authentication. | No | |
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The authentication options for Azure OpenAI On Your Data when using a connection string.
Name | Type | Description | Required | Default |
---|---|---|---|---|
connection_string | string | The connection string to use for authentication. | No | |
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The authentication options for Azure OpenAI On Your Data when using an Elasticsearch key and key ID pair.
Name | Type | Description | Required | Default |
---|---|---|---|---|
key | string | The Elasticsearch key to use for authentication. | No | |
key_id | string | The Elasticsearch key ID to use for authentication. | No | |
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The authentication options for Azure OpenAI On Your Data when using an Elasticsearch encoded API key.
Name | Type | Description | Required | Default |
---|---|---|---|---|
encoded_api_key | string | The Elasticsearch encoded API key to use for authentication. | No | |
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The authentication options for Azure OpenAI On Your Data when using an access token.
Name | Type | Description | Required | Default |
---|---|---|---|---|
access_token | string | The access token to use for authentication. | No | |
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The authentication options for Azure OpenAI On Your Data when using a system-assigned managed identity.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The authentication options for Azure OpenAI On Your Data when using a user-assigned managed identity.
Name | Type | Description | Required | Default |
---|---|---|---|---|
managed_identity_resource_id | string | The resource ID of the user-assigned managed identity to use for authentication. | No | |
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes |
The authentication options for Azure OpenAI On Your Data when using a username and a password.
Name | Type | Description | Required | Default |
---|---|---|---|---|
password | string | The password to use for authentication. | No | |
type | onYourDataAuthenticationType | The authentication types supported with Azure OpenAI On Your Data. | Yes | |
username | string | The username to use for authentication. | No |
An abstract representation of a vectorization source for Azure OpenAI On Your Data with vector search.
This component uses the property type to discriminate between different types:
Type Value | Schema
---|---
endpoint | onYourDataEndpointVectorizationSource
deployment_name | onYourDataDeploymentNameVectorizationSource
integrated | onYourDataIntegratedVectorizationSource
model_id | onYourDataModelIdVectorizationSource
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | onYourDataVectorizationSourceType | Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search. | Yes |
Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.
Property | Value |
---|---|
Description | Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search. |
Type | string |
Values | endpoint, deployment_name, integrated, model_id |
The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a public Azure OpenAI endpoint call for embeddings.
Name | Type | Description | Required | Default |
---|---|---|---|---|
authentication | onYourDataApiKeyAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions | | No | |
dimensions | integer | The number of dimensions the embeddings should have. Only supported in text-embedding-3 and later models. | No | |
endpoint | string | Specifies the resource endpoint URL from which embeddings should be retrieved. It should be in the format of https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings. The api-version query parameter is not allowed. | No | |
type | onYourDataVectorizationSourceType | Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search. | Yes |
The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource.
Name | Type | Description | Required | Default |
---|---|---|---|---|
deployment_name | string | Specifies the name of the model deployment to use for vectorization. This model deployment must be in the same Azure OpenAI resource, but On Your Data will use this model deployment via an internal call rather than a public one, which enables vector search even in private networks. | No | |
dimensions | integer | The number of dimensions the embeddings should have. Only supported in text-embedding-3 and later models. | No | |
type | onYourDataVectorizationSourceType | Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search. | Yes |
Represents the integrated vectorizer defined within the search resource.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | onYourDataVectorizationSourceType | Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search. | Yes |
The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a search service model ID. Currently only supported by Elasticsearch®.
Name | Type | Description | Required | Default |
---|---|---|---|---|
model_id | string | Specifies the model ID to use for vectorization. This model ID must be defined in the search service. | No | |
type | onYourDataVectorizationSourceType | Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search. | Yes |
A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension.
Name | Type | Description | Required | Default |
---|---|---|---|---|
all_retrieved_documents | array | All the retrieved documents. | No | |
citations | array | The data source retrieval result, used to generate the assistant message in the response. | No | |
intent | string | The detected intent from the chat history, used to pass to the next turn to carry over the context. | No |
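Reading citations out of the context object above can be sketched as follows. The response message here is a hand-built stand-in, not real service output, and the helper name is an assumption.

```python
def citation_urls(message: dict) -> list:
    """Collect citation URLs from a response message's extensions context (sketch)."""
    context = message.get("context", {})
    return [c.get("url") for c in context.get("citations", []) if c.get("url")]

# Hand-built stand-in for an assistant message augmented by a data source.
message = {
    "role": "assistant",
    "content": "The policy is described in the handbook [doc1].",
    "context": {
        "intent": "handbook policy question",  # placeholder intent
        "citations": [
            {
                "content": "Relevant handbook excerpt...",
                "title": "Employee Handbook",
                "url": "https://example.com/handbook",  # placeholder URL
            }
        ],
    },
}

urls = citation_urls(message)
```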
Citation information for a chat completions response message.
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunk_id | string | The chunk ID of the citation. | No | |
content | string | The content of the citation. | Yes | |
filepath | string | The file path of the citation. | No | |
rerank_score | number | The rerank score of the retrieved document. | No | |
title | string | The title of the citation. | No | |
url | string | The URL of the citation. | No |
The retrieved document.
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunk_id | string | The chunk ID of the citation. | No | |
content | string | The content of the citation. | Yes | |
data_source_index | integer | The index of the data source. | No | |
filepath | string | The file path of the citation. | No | |
filter_reason | filterReason | The filtering reason of the retrieved document. | No | |
original_search_score | number | The original search score of the retrieved document. | No | |
rerank_score | number | The rerank score of the retrieved document. | No | |
search_queries | array | The search queries used to retrieve the document. | No | |
title | string | The title of the citation. | No | |
url | string | The URL of the citation. | No |
The filtering reason of the retrieved document.
Property | Value |
---|---|
Description | The filtering reason of the retrieved document. |
Type | string |
Values | score, rerank |
Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | The function that the model called. | Yes | |
└─ arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | No | |
└─ name | string | The name of the function to call. | No | |
id | string | The ID of the tool call. | Yes | |
type | toolCallType | The type of the tool call, in this case function. | Yes |
The type of the tool call, in this case function.
Property | Value |
---|---|
Description | The type of the tool call, in this case function . |
Type | string |
Values | function |
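As the arguments description above warns, the model does not always generate valid JSON and may invent parameters, so validate before dispatching. This is a sketch under assumptions: the function name, tool call id, and allowed-parameter set are illustrative.

```python
import json

def parse_tool_call(tool_call: dict, allowed_params: set):
    """Validate a function tool call before executing it (sketch)."""
    if tool_call.get("type") != "function":
        raise ValueError("unsupported tool call type")
    try:
        # arguments is a JSON string that may be malformed
        args = json.loads(tool_call["function"]["arguments"])
    except json.JSONDecodeError as exc:
        raise ValueError(f"model produced invalid JSON: {exc}") from exc
    # Reject parameters the model hallucinated outside the schema
    unexpected = set(args) - allowed_params
    if unexpected:
        raise ValueError(f"unexpected parameters: {unexpected}")
    return tool_call["function"]["name"], args

name, args = parse_tool_call(
    {
        "id": "call_1",  # placeholder id
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
    },
    allowed_params={"city", "unit"},
)
```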
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string | The contents of the message. | No | |
tool_call_id | string | Tool call that this message is responding to. | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string | The contents of the message. | No | |
name | string | The name of the function to call. | No | |
role | enum | The role of the message author, in this case function. Possible values: function | No |
Represents a chat completion response returned by the model, based on the provided input.
Name | Type | Description | Required | Default |
---|---|---|---|---|
choices | array | A list of chat completion choices. Can be more than one if n is greater than 1. | Yes | |
created | integer | The Unix timestamp (in seconds) of when the chat completion was created. | Yes | |
id | string | A unique identifier for the chat completion. | Yes | |
model | string | The model used for the chat completion. | Yes | |
object | enum | The object type, which is always chat.completion. Possible values: chat.completion | Yes | |
prompt_filter_results | promptFilterResults | Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders. | No | |
system_fingerprint | string | This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism. | No | |
usage | completionUsage | Usage statistics for the completion request. | No |
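To make the shape above concrete, a response can be unpacked like this. A minimal sketch; the field values are made up for illustration, not output from a live call:

```python
# Illustrative chat.completion payload matching the fields in the table above.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hello!", "refusal": None},
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}

# Pull out the first choice's text, guarding against an empty choices list.
first = response["choices"][0]["message"]["content"] if response["choices"] else None
```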
Represents a streamed chunk of a chat completion response returned by model, based on the provided input.
Name | Type | Description | Required | Default |
---|---|---|---|---|
choices | array | A list of chat completion choices. Can contain more than one element if n is greater than 1. | Yes | |
created | integer | The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp. | Yes | |
id | string | A unique identifier for the chat completion. Each chunk has the same ID. | Yes | |
model | string | The model used to generate the completion. | Yes | |
object | enum | The object type, which is always chat.completion.chunk. Possible values: chat.completion.chunk | Yes | |
system_fingerprint | string | This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism. | No | |
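Streamed chunks carry incremental deltas, so a client reassembles the message by concatenating them. A minimal sketch; the chunks below are illustrative, and a real stream arrives incrementally over the wire:

```python
# Illustrative chat.completion.chunk payloads; only the delta content differs
# between chunks. A real stream would also include id, created, model, etc.
chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hel"}}]},
    {"choices": [{"index": 0, "delta": {"content": "lo"}}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
]

# Concatenate the content fragments; the final chunk's delta may be empty.
text = ""
for chunk in chunks:
    for choice in chunk["choices"]:
        text += choice["delta"].get("content") or ""
```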
A chat completion delta generated by streamed model responses.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string | The contents of the chunk message. | No | |
function_call | object | Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. | No | |
└─ arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | No | |
└─ name | string | The name of the function to call. | No | |
refusal | string | The refusal message generated by the model. | No | |
role | enum | The role of the author of this message. Possible values: system, user, assistant, tool | No | |
tool_calls | array | The tool calls generated by the model, such as function calls. | No | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | The function that the model called. | No | |
└─ arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | No | |
└─ name | string | The name of the function to call. | No | |
id | string | The ID of the tool call. | No | |
index | integer | Yes | ||
type | enum | The type of the tool. Currently, only function is supported. Possible values: function | No | |
Options for streaming response. Only set this when you set stream: true.
Name | Type | Description | Required | Default |
---|---|---|---|---|
include_usage | boolean | If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value. | No | |
Log probability information for the choice.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | array | A list of message content tokens with log probability information. | Yes | |
refusal | array | A list of message refusal tokens with log probability information. | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
bytes | array | A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token. | Yes | |
logprob | number | The log probability of this token. | Yes | |
token | string | The token. | Yes | |
top_logprobs | array | List of the most likely tokens and their log probability, at this token position. In rare cases, there may be fewer than the number of requested top_logprobs returned. | Yes | |
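The logprob values are natural logarithms, and the bytes array recovers the exact token text even for multi-byte characters. A minimal sketch with illustrative values:

```python
import math

# A token entry shaped like the table above; the values are illustrative.
token_info = {"token": "Hello", "logprob": -0.105, "bytes": [72, 101, 108, 108, 111]}

# Log probabilities are natural logs; exponentiate to recover a probability.
probability = math.exp(token_info["logprob"])

# Joining the UTF-8 bytes reproduces the exact text of the token.
recovered = bytes(token_info["bytes"]).decode("utf-8")
```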
A chat completion message generated by the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
audio | object | If the audio output modality is requested, this object contains data about the audio response from the model. | No | |
└─ data | string | Base64 encoded audio bytes generated by the model, in the format specified in the request. | No | |
└─ expires_at | integer | The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations. | No | |
└─ id | string | Unique identifier for this audio response. | No | |
└─ transcript | string | Transcript of the audio generated by the model. | No | |
content | string | The contents of the message. | Yes | |
context | azureChatExtensionsMessageContext | A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension. | No | |
function_call | chatCompletionFunctionCall | Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. | No | |
refusal | string | The refusal message generated by the model. | Yes | |
role | chatCompletionResponseMessageRole | The role of the author of the response message. | Yes | |
tool_calls | array | The tool calls generated by the model, such as function calls. | No |
The role of the author of the response message.
Property | Value |
---|---|
Description | The role of the author of the response message. |
Type | string |
Values | assistant |
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present.
This component can be one of the following:
Specifies a tool the model should use. Use to force the model to call a specific function.
Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | Yes | ||
└─ name | string | The name of the function to call. | No | |
type | enum | The type of the tool. Currently, only function is supported. Possible values: function | Yes | |
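The selection modes described above map to these request-body values. A minimal sketch; my_function is a hypothetical function name:

```python
# The three tool_choice forms as they would appear in a request body.
tool_choice_none = "none"    # never call a tool; generate a message instead
tool_choice_auto = "auto"    # let the model decide whether to call a tool
tool_choice_forced = {       # force the model to call one named function
    "type": "function",
    "function": {"name": "my_function"},
}
```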
Whether to enable parallel function calling during tool use.
No properties defined for this component.
Static predicted output content, such as the content of a text file that is being regenerated.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string or array | The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. | Yes | |
type | enum | The type of the predicted content you want to provide. This type is currently always content .Possible values: content |
Yes |
The tool calls generated by the model, such as function calls.
No properties defined for this component.
Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default: ["text"]. The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use: ["text", "audio"].
No properties defined for this component.
Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. | Yes | |
name | string | The name of the function to call. | Yes |
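Because the arguments string is model-generated JSON that may be malformed, a caller can parse it defensively before invoking anything. A minimal sketch; the get_weather call is a hypothetical example:

```python
import json

# Illustrative function_call payload; arguments is a JSON-encoded string.
function_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}

# Parse defensively: the model does not always emit valid JSON, so reject
# the call rather than run with bad input.
try:
    args = json.loads(function_call["arguments"])
except json.JSONDecodeError:
    args = None
```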
Usage statistics for the completion request.
Name | Type | Description | Required | Default |
---|---|---|---|---|
completion_tokens | integer | Number of tokens in the generated completion. | Yes | |
completion_tokens_details | object | Breakdown of tokens used in a completion. | No | |
└─ accepted_prediction_tokens | integer | When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion. | No | |
└─ audio_tokens | integer | Audio tokens generated by the model. | No | |
└─ reasoning_tokens | integer | Tokens generated by the model for reasoning. | No | |
└─ rejected_prediction_tokens | integer | When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits. | No | |
prompt_tokens | integer | Number of tokens in the prompt. | Yes | |
prompt_tokens_details | object | Details of the prompt tokens. | No | |
└─ audio_tokens | integer | Audio input tokens present in the prompt. | No | |
└─ cached_tokens | integer | The number of cached prompt tokens. | No | |
total_tokens | integer | Total number of tokens used in the request (prompt + completion). | Yes |
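The last row states the invariant total_tokens = prompt_tokens + completion_tokens, which a client can check directly. The values here are illustrative:

```python
# Illustrative usage block matching the completionUsage fields above.
usage = {"prompt_tokens": 120, "completion_tokens": 35, "total_tokens": 155}

# Verify the documented invariant: total = prompt + completion.
consistent = usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```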
Name | Type | Description | Required | Default |
---|---|---|---|---|
function | FunctionObject | Yes | ||
type | enum | The type of the tool. Currently, only function is supported. Possible values: function | Yes | |
The parameters the function accepts, described as a JSON Schema object. See the [function calling guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format. Omitting parameters defines a function with an empty parameter list.
No properties defined for this component.
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | A description of what the function does, used by the model to choose when and how to call the function. | No | |
name | string | The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. | Yes | |
parameters | FunctionParameters | The parameters the function accepts, described as a JSON Schema object. See the [function calling guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format. Omitting parameters defines a function with an empty parameter list. | No | |
strict | boolean | Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. | No | False |
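Putting the FunctionObject fields together, a complete tool definition might look like the following. The get_weather function and its schema are hypothetical examples, not part of the API itself:

```python
# A tool definition combining the FunctionObject fields described above.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "strict": True,  # enforce exact schema adherence (subset of JSON Schema)
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
            "additionalProperties": False,
        },
    },
}
```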
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | The type of response format being defined: text. Possible values: text | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | The type of response format being defined: json_object. Possible values: json_object | Yes | |
The schema for the response format, described as a JSON Schema object.
No properties defined for this component.
Name | Type | Description | Required | Default |
---|---|---|---|---|
json_schema | object | Yes | ||
└─ description | string | A description of what the response format is for, used by the model to determine how to respond in the format. | No | |
└─ name | string | The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. | No | |
└─ schema | ResponseFormatJsonSchemaSchema | The schema for the response format, described as a JSON Schema object. | No | |
└─ strict | boolean | Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. | No | False |
type | enum | The type of response format being defined: json_schema. Possible values: json_schema | Yes | |
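A request-body fragment using this json_schema response format might look like the following. The calendar_event name and schema are hypothetical examples:

```python
# A response_format body using the json_schema type described above.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "calendar_event",
        "strict": True,  # model output must match the schema exactly
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "date": {"type": "string"},
            },
            "required": ["title", "date"],
            "additionalProperties": False,
        },
    },
}
```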
Name | Type | Description | Required | Default |
---|---|---|---|---|
finish_reason | string | No | ||
index | integer | No |
Translation request.
Name | Type | Description | Required | Default |
---|---|---|---|---|
file | string | The audio file to translate. | Yes | |
prompt | string | An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English. | No | |
response_format | audioResponseFormat | Defines the format of the output. | No | |
temperature | number | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. | No | 0 |
Translation or transcription response when response_format was json
Name | Type | Description | Required | Default |
---|---|---|---|---|
text | string | Translated or transcribed text. | Yes |
Translation or transcription response when response_format was verbose_json
Name | Type | Description | Required | Default |
---|---|---|---|---|
duration | number | Duration. | No | |
language | string | Language. | No | |
segments | array | No | ||
task | string | Type of audio task. | No | |
text | string | Translated or transcribed text. | Yes | |
words | array | No |
Defines the format of the output.
Property | Value |
---|---|
Description | Defines the format of the output. |
Type | string |
Values | json text srt verbose_json vtt |
Transcription request.
Name | Type | Description | Required | Default |
---|---|---|---|---|
file | string | The audio file object to transcribe. | Yes | |
language | string | The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. | No | |
prompt | string | An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. | No | |
response_format | audioResponseFormat | Defines the format of the output. | No | |
temperature | number | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. | No | 0 |
timestamp_granularities[] | array | The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word, or segment. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency. | No | ['segment'] |
Transcription or translation segment.
Name | Type | Description | Required | Default |
---|---|---|---|---|
avg_logprob | number | Average log probability. | No | |
compression_ratio | number | Compression ratio. | No | |
end | number | Segment end offset. | No | |
id | integer | Segment identifier. | No | |
no_speech_prob | number | Probability of 'no speech'. | No | |
seek | number | Offset of the segment. | No | |
start | number | Segment start offset. | No | |
temperature | number | Temperature. | No | |
text | string | Segment text. | No | |
tokens | array | Tokens of the text. | No |
Transcription or translation word.
Name | Type | Description | Required | Default |
---|---|---|---|---|
end | number | Word end offset. | No | |
start | number | Word start offset. | No | |
word | string | Word | No |
Speech request.
Name | Type | Description | Required | Default |
---|---|---|---|---|
input | string | The text to synthesize audio for. The maximum length is 4096 characters. | Yes | |
response_format | enum | The format to synthesize the audio in. Possible values: mp3, opus, aac, flac, wav, pcm | No | |
speed | number | The speed of the synthesized audio. Select a value from 0.25 to 4.0. 1.0 is the default. | No | 1.0 |
voice | enum | The voice to use for speech synthesis. Possible values: alloy, echo, fable, onyx, nova, shimmer | Yes | |
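A speech request body combining these fields might look like this sketch. The enum values are drawn from the table above; the input text is illustrative:

```python
# A speech-synthesis request body using the fields described above.
speech_request = {
    "input": "Hello, world.",   # text to synthesize; max 4096 characters
    "voice": "alloy",           # one of the documented voice values
    "response_format": "mp3",   # one of the documented audio formats
    "speed": 1.0,               # 0.25 to 4.0; 1.0 is the default
}
```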
The quality of the image that will be generated.
Property | Value |
---|---|
Description | The quality of the image that will be generated. |
Type | string |
Default | standard |
Values | standard hd |
The format in which the generated images are returned.
Property | Value |
---|---|
Description | The format in which the generated images are returned. |
Type | string |
Default | url |
Values | url b64_json |
The size of the generated images.
Property | Value |
---|---|
Description | The size of the generated images. |
Type | string |
Default | 1024x1024 |
Values | 256x256 512x512 1792x1024 1024x1792 1024x1024 |
The style of the generated images.
Property | Value |
---|---|
Description | The style of the generated images. |
Type | string |
Default | vivid |
Values | vivid natural |
Name | Type | Description | Required | Default |
---|---|---|---|---|
n | integer | The number of images to generate. | No | 1 |
prompt | string | A text description of the desired image(s). The maximum length is 4000 characters. | Yes | |
quality | imageQuality | The quality of the image that will be generated. | No | standard |
response_format | imagesResponseFormat | The format in which the generated images are returned. | No | url |
size | imageSize | The size of the generated images. | No | 1024x1024 |
style | imageStyle | The style of the generated images. | No | vivid |
user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse. | No |
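Combining the fields and defaults listed above, an image-generation request body might look like this. The prompt is illustrative:

```python
# An image-generation request body; quality, size, style, and
# response_format use the documented default values.
image_request = {
    "prompt": "A watercolor painting of a lighthouse at dawn",
    "n": 1,
    "quality": "standard",
    "size": "1024x1024",
    "style": "vivid",
    "response_format": "url",
}
```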
Name | Type | Description | Required | Default |
---|---|---|---|---|
created | integer | The Unix timestamp when the operation was created. | Yes | |
data | array | The result data of the operation, if successful. | Yes | |
The image url or encoded image if successful, and an error otherwise.
Name | Type | Description | Required | Default |
---|---|---|---|---|
b64_json | string | The base64 encoded image | No | |
content_filter_results | dalleContentFilterResults | Information about the content filtering results. | No | |
prompt_filter_results | dalleFilterResults | Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high; a scale that determines the intensity and risk level of harmful content) and whether it has been filtered or not. Also includes information about jailbreak content and profanity, if detected and whether it was filtered, and information about the customer block list, whether it was filtered and its ID. | No | |
revised_prompt | string | The prompt that was used to generate the image, if there was any revision to the prompt. | No | |
url | string | The image url. | No |
A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.
Name | Type | Description | Required | Default |
---|---|---|---|---|
spans | array | An array of spans that represent detected objects and their bounding box information. | Yes | |
text | string | Yes |
A span object that represents a detected object and its bounding box information.
Name | Type | Description | Required | Default |
---|---|---|---|---|
length | integer | The length of the span in characters, measured in Unicode codepoints. | Yes | |
offset | integer | The character offset within the text where the span begins. This offset is defined as the position of the first character of the span, counting from the start of the text as Unicode codepoints. | Yes | |
polygon | array | An array of objects representing points in the polygon that encloses the detected object. | Yes | |
text | string | The text content of the span that represents the detected object. | Yes |
Usage statistics related to the run. This value will be null if the run is not in a terminal state (i.e. in_progress, queued, etc.).
Name | Type | Description | Required | Default |
---|---|---|---|---|
completion_tokens | integer | Number of completion tokens used over the course of the run. | Yes | |
prompt_tokens | integer | Number of prompt tokens used over the course of the run. | Yes | |
total_tokens | integer | Total number of tokens used (prompt + completion). | Yes |
Usage statistics related to the run step. This value will be null while the run step's status is in_progress.
Name | Type | Description | Required | Default |
---|---|---|---|---|
completion_tokens | integer | Number of completion tokens used over the course of the run step. | Yes | |
prompt_tokens | integer | Number of prompt tokens used over the course of the run step. | Yes | |
total_tokens | integer | Total number of tokens used (prompt + completion). | Yes |
Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.
Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
This component can be one of the following:
An object describing the expected output of the model. If json_object, only function type tools are allowed to be passed to the Run. If text, the model can return text or any value needed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | string | Must be one of text or json_object . |
No | text |
type Enum: AssistantsApiResponseFormat
Value | Description |
---|---|
text | |
json_object |
Represents an assistant that can call the model and use tools.
Name | Type | Description | Required | Default |
---|---|---|---|---|
created_at | integer | The Unix timestamp (in seconds) for when the assistant was created. | Yes | |
description | string | The description of the assistant. The maximum length is 512 characters. | Yes | |
id | string | The identifier, which can be referenced in API endpoints. | Yes | |
instructions | string | The system instructions that the assistant uses. The maximum length is 256,000 characters. | Yes | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | Yes | |
model | string | ID of the model to use. You can use the list models API to see all of your available models. | Yes | |
name | string | The name of the assistant. The maximum length is 256 characters. | Yes | |
object | string | The object type, which is always assistant. | Yes | |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
tools | array | A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function. | Yes | [] |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
object Enum: AssistantObjectType
Value | Description |
---|---|
assistant | The object type, which is always assistant |
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | The description of the assistant. The maximum length is 512 characters. | No | |
instructions | string | The system instructions that the assistant uses. The maximum length is 256,000 characters. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | | Yes | |
name | string | The name of the assistant. The maximum length is 256 characters. | No | |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
└─ vector_stores | array | A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
tools | array | A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function. | No | [] |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | The description of the assistant. The maximum length is 512 characters. | No | |
instructions | string | The system instructions that the assistant uses. The maximum length is 32768 characters. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | | No | |
name | string | The name of the assistant. The maximum length is 256 characters. | No | |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | Overrides the list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | Overrides the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
tools | array | A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function. | No | [] |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
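As a sketch of how the fields above combine, here is a hypothetical assistant request body built in Python. The model name, instructions, and metadata are illustrative placeholders, not values required by the API:

```python
import json

# Illustrative payload using the fields documented above; the model
# deployment name and all string values are assumptions for this example.
assistant_request = {
    "model": "gpt-4o",  # placeholder deployment/model name
    "name": "Data Analyst",
    "description": "Analyzes CSV files with code interpreter.",
    "instructions": "You are a helpful data analyst.",
    "tools": [{"type": "code_interpreter"}],
    "tool_resources": {"code_interpreter": {"file_ids": []}},
    "temperature": 1,
    "top_p": 1,
    "metadata": {"team": "analytics"},  # up to 16 key-value pairs
}

# Basic client-side validation mirroring the documented limits.
assert len(assistant_request["name"]) <= 256
assert len(assistant_request["description"]) <= 512
assert len(assistant_request["instructions"]) <= 32768
assert len(assistant_request["metadata"]) <= 16
assert len(assistant_request["tools"]) <= 128
body = json.dumps(assistant_request)
```

The assertions only restate the per-field limits from the table; the service enforces them on its side as well.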
Name | Type | Description | Required | Default |
---|---|---|---|---|
deleted | boolean | | Yes | |
id | string | | Yes | |
object | string | | Yes | |

object Enum: DeleteAssistantResponseState

Value | Description |
---|---|
assistant.deleted | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | | Yes | |
first_id | string | | Yes | |
has_more | boolean | | Yes | |
last_id | string | | Yes | |
object | string | | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | string | The type of tool being defined: code_interpreter | Yes | |

type Enum: assistantToolsCodeType

Value | Description |
---|---|
code_interpreter | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_search | object | Overrides for the file search tool. | No | |
└─ max_num_results | integer | The maximum number of results the file search tool should output. The default is 20 for gpt-4* models and 5 for gpt-3.5-turbo. This number should be between 1 and 50 inclusive. Note that the file search tool may output fewer than max_num_results results. | No | |
type | string | The type of tool being defined: file_search | Yes | |

type Enum: assistantToolsFileSearchType

Value | Description |
---|---|
file_search | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | string | The type of tool being defined: file_search | Yes | |

type Enum: assistantToolsFileSearchType

Value | Description |
---|---|
file_search | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | The function definition. | Yes | |
└─ description | string | A description of what the function does, used by the model to choose when and how to call the function. | No | |
└─ name | string | The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. | No | |
└─ parameters | chatCompletionFunctionParameters | The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format. | No | |
type | string | The type of tool being defined: function | Yes | |

type Enum: assistantToolsFunction

Value | Description |
---|---|
function | |
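A function tool following the shape above might look like this sketch. The function name and JSON Schema parameters are invented for illustration; only the structure comes from this reference:

```python
import re

# Hypothetical function tool definition; get_weather and its schema are
# examples, not part of this reference.
function_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Name constraints from the table above: a-z, A-Z, 0-9, underscores,
# dashes, maximum length 64.
assert re.fullmatch(r"[A-Za-z0-9_-]{1,64}", function_tool["function"]["name"])
```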
Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

Name | Type | Description | Required | Default |
---|---|---|---|---|
last_messages | integer | The number of most recent messages from the thread when constructing the context for the run. | No | |
type | string | The truncation strategy to use for the thread. The default is auto. If set to last_messages, the thread will be truncated to the n most recent messages in the thread. When set to auto, messages in the middle of the thread will be dropped to fit the context length of the model, max_prompt_tokens. | Yes | |

type Enum: TruncationType

Value | Description |
---|---|
auto | |
last_messages | |
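For illustration, the two documented truncation strategies would be expressed like this (the value 10 is an arbitrary example):

```python
# auto: middle messages are dropped to fit the model's context length.
auto_strategy = {"type": "auto"}

# last_messages: keep only the n most recent messages (n = 10 here is arbitrary).
last_messages_strategy = {"type": "last_messages", "last_messages": 10}

assert auto_strategy["type"] == "auto"
assert last_messages_strategy["last_messages"] > 0
```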
Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
This component can be one of the following:

Specifies a tool the model should use. Use to force the model to call a specific tool.

Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | | No | |
└─ name | string | The name of the function to call. | No | |
type | string | The type of the tool. If type is function, the function name must be set. | Yes | |

type Enum: AssistantsNamedToolChoiceType

Value | Description |
---|---|
function | |
code_interpreter | |
file_search | |
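The documented tool_choice forms can be sketched as plain values; my_function is a placeholder name:

```python
# The forms of tool_choice described above:
no_tools = "none"                 # never call a tool; generate a message
auto_choice = "auto"              # default: model decides
forced_file_search = {"type": "file_search"}
forced_function = {"type": "function", "function": {"name": "my_function"}}

# Per the table above: if type is "function", the function name must be set.
assert forced_function["function"]["name"]
```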
Represents an execution run on a thread.

Name | Type | Description | Required | Default |
---|---|---|---|---|
assistant_id | string | The ID of the assistant used for execution of this run. | Yes | |
cancelled_at | integer | The Unix timestamp (in seconds) for when the run was cancelled. | Yes | |
completed_at | integer | The Unix timestamp (in seconds) for when the run was completed. | Yes | |
created_at | integer | The Unix timestamp (in seconds) for when the run was created. | Yes | |
expires_at | integer | The Unix timestamp (in seconds) for when the run will expire. | Yes | |
failed_at | integer | The Unix timestamp (in seconds) for when the run failed. | Yes | |
id | string | The identifier, which can be referenced in API endpoints. | Yes | |
incomplete_details | object | Details on why the run is incomplete. Will be null if the run is not incomplete. | Yes | |
└─ reason | string | The reason why the run is incomplete. This will point to which specific token limit was reached over the course of the run. | No | |
instructions | string | The instructions that the assistant used for this run. | Yes | |
last_error | object | The last error associated with this run. Will be null if there are no errors. | Yes | |
└─ code | string | One of server_error or rate_limit_exceeded. | No | |
└─ message | string | A human-readable description of the error. | No | |
max_completion_tokens | integer | The maximum number of completion tokens specified to have been used over the course of the run. | Yes | |
max_prompt_tokens | integer | The maximum number of prompt tokens specified to have been used over the course of the run. | Yes | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | Yes | |
model | string | The model that the assistant used for this run. | Yes | |
object | string | The object type, which is always thread.run. | Yes | |
parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
required_action | object | Details on the action required to continue the run. Will be null if no action is required. | Yes | |
└─ submit_tool_outputs | object | Details on the tool outputs needed for this run to continue. | No | |
└─ tool_calls | array | A list of the relevant tool calls. | No | |
└─ type | enum | For now, this is always submit_tool_outputs. Possible values: submit_tool_outputs | No | |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | Yes | |
started_at | integer | The Unix timestamp (in seconds) for when the run was started. | Yes | |
status | string | The status of the run, which can be either queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired. | Yes | |
temperature | number | The sampling temperature used for this run. If not set, defaults to 1. | No | |
thread_id | string | The ID of the thread that was executed on as a part of this run. | Yes | |
tool_choice | assistantsApiToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. | Yes | |
tools | array | The list of tools that the assistant used for this run. | Yes | [] |
top_p | number | The nucleus sampling value used for this run. If not set, defaults to 1. | No | |
truncation_strategy | truncationObject | Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. | Yes | |
usage | runCompletionUsage | Usage statistics related to the run. This value will be null if the run is not in a terminal state (i.e. in_progress, queued, etc.). | Yes | |
object Enum: runObjectType

Value | Description |
---|---|
thread.run | The run object type, which is always thread.run |

status Enum: RunObjectStatus

Value | Description |
---|---|
queued | The queued state |
in_progress | The in_progress state |
requires_action | The requires_action state |
cancelling | The cancelling state |
cancelled | The cancelled state |
failed | The failed state |
completed | The completed state |
expired | The expired state |
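The status values above are what a client polls until the run reaches a terminal state or needs tool outputs. A minimal sketch, where get_run is a caller-supplied stand-in for a GET on the run endpoint (not a function defined by this reference):

```python
import time

# Terminal states per the status enum above.
TERMINAL = {"cancelled", "failed", "completed", "expired"}

def wait_for_run(get_run, thread_id, run_id, poll_seconds=1.0):
    """Poll a run until it is terminal or requires tool outputs.

    get_run is a hypothetical fetcher supplied by the caller, e.g. a
    wrapper around GET .../threads/{thread_id}/runs/{run_id}.
    """
    while True:
        run = get_run(thread_id, run_id)
        if run["status"] in TERMINAL or run["status"] == "requires_action":
            return run
        time.sleep(poll_seconds)

# Demo with a fake fetcher that completes on the second poll.
states = iter([{"status": "in_progress"}, {"status": "completed"}])
run = wait_for_run(lambda t, r: next(states), "thread_abc", "run_abc",
                   poll_seconds=0)
```

A real client would also handle requires_action by submitting tool outputs (see the submit tool outputs request below in this reference).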
Name | Type | Description | Required | Default |
---|---|---|---|---|
additional_instructions | string | Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions. | No | |
additional_messages | array | Adds additional messages to the thread before creating the run. | No | |
assistant_id | string | The ID of the assistant to use to execute this run. | Yes | |
instructions | string | Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. | No | |
max_completion_tokens | integer | The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info. | No | |
max_prompt_tokens | integer | The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
stream | boolean | If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
tool_choice | assistantsApiToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. | No | |
tools | array | Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. | No | |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
truncation_strategy | truncationObject | Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. | No | |
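A hypothetical create-run request body using the fields above; the assistant ID and instruction strings are placeholders, and only assistant_id is required:

```python
import json

# Illustrative run-creation payload; all values are examples.
create_run = {
    "assistant_id": "asst_abc123",               # placeholder ID (required)
    "instructions": "Answer in one sentence.",   # overrides the assistant default
    "additional_instructions": "Be concise.",    # appended, does not override
    "max_prompt_tokens": 2000,
    "max_completion_tokens": 500,
    "truncation_strategy": {"type": "auto"},
    "stream": False,
}
body = json.dumps(create_run)
```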
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | | Yes | |
first_id | string | | Yes | |
has_more | boolean | | Yes | |
last_id | string | | Yes | |
object | string | | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
stream | boolean | If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. | No | |
tool_outputs | array | A list of tools for which the outputs are being submitted. | Yes | |
Tool call objects

Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | The function definition. | Yes | |
└─ arguments | string | The arguments that the model expects you to pass to the function. | No | |
└─ name | string | The name of the function. | No | |
id | string | The ID of the tool call. This ID must be referenced when you submit the tool outputs using the submit tool outputs to run endpoint. | Yes | |
type | string | The type of tool call the output is required for. For now, this is always function. | Yes | |

type Enum: RunToolCallObjectType

Value | Description |
---|---|
function | |
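Putting the two shapes together: a run in the requires_action state carries tool call objects like the one above, and the client answers with a submit-tool-outputs body. A sketch with invented IDs, function name, and result:

```python
import json

# Tool call shape per the table above; id, name, and arguments are
# illustrative values, not real ones.
tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}

# Parse the model-supplied arguments, run your own function, then echo
# the tool call ID back with the output.
args = json.loads(tool_call["function"]["arguments"])
result = f'22C in {args["city"]}'  # stand-in for a real function call

submit_body = {
    "tool_outputs": [{"tool_call_id": tool_call["id"], "output": result}],
    "stream": False,
}
```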
Name | Type | Description | Required | Default |
---|---|---|---|---|
assistant_id | string | The ID of the assistant to use to execute this run. | Yes | |
instructions | string | Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. | No | |
max_completion_tokens | integer | The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info. | No | |
max_prompt_tokens | integer | The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
model | string | The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. | No | |
parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
response_format | assistantsApiResponseFormatOption | Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
stream | boolean | If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. | No | |
stream_options | chatCompletionStreamOptions | Options for streaming response. Only set this when you set stream: true. | No | None |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | No | 1 |
thread | createThreadRequest | | No | |
tool_choice | assistantsApiToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. | No | |
tool_resources | object | A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant. | No | |
tools | array | Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. | No | |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
truncation_strategy | truncationObject | Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. | No | |
Represents a thread that contains messages.

Name | Type | Description | Required | Default |
---|---|---|---|---|
created_at | integer | The Unix timestamp (in seconds) for when the thread was created. | Yes | |
id | string | The identifier, which can be referenced in API endpoints. | Yes | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | Yes | |
object | string | The object type, which is always thread. | Yes | |
tool_resources | object | A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | Yes | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread. | No | |

object Enum: ThreadObjectType

Value | Description |
---|---|
thread | The type of thread object, which is always thread |
Name | Type | Description | Required | Default |
---|---|---|---|---|
messages | array | A list of messages to start the thread with. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
tool_resources | object | A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread. | No | |
└─ vector_stores | array | A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of 1 vector store attached to the thread. | No | |
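A hypothetical thread-creation body using the fields above: one starting user message plus a single file_search vector store (all IDs and strings are placeholders):

```python
# Illustrative createThreadRequest payload; values are examples only.
create_thread = {
    "messages": [
        {"role": "user", "content": "Summarize the attached report."}
    ],
    "tool_resources": {
        # At most 1 vector store may be attached to a thread.
        "file_search": {"vector_store_ids": ["vs_abc123"]}
    },
    "metadata": {"source": "docs-example"},
}

assert len(create_thread["tool_resources"]["file_search"]["vector_store_ids"]) <= 1
```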
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
tool_resources | object | A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. | No | |
└─ code_interpreter | object | | No | |
└─ file_ids | array | A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool. | No | [] |
└─ file_search | object | | No | |
└─ vector_store_ids | array | The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread. | No | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
deleted | boolean | | Yes | |
id | string | | Yes | |
object | string | | Yes | |

object Enum: DeleteThreadResponseObjectState

Value | Description |
---|---|
thread.deleted | The delete thread response object state, which is thread.deleted |
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | | Yes | |
first_id | string | | Yes | |
has_more | boolean | | Yes | |
last_id | string | | Yes | |
object | string | | Yes | |
Represents a message within a thread.

Name | Type | Description | Required | Default |
---|---|---|---|---|
assistant_id | string | If applicable, the ID of the assistant that authored this message. | Yes | |
attachments | array | A list of files attached to the message, and the tools they were added to. | Yes | |
completed_at | integer | The Unix timestamp (in seconds) for when the message was completed. | Yes | |
content | array | The content of the message as an array of text and/or images. | Yes | |
created_at | integer | The Unix timestamp (in seconds) for when the message was created. | Yes | |
id | string | The identifier, which can be referenced in API endpoints. | Yes | |
incomplete_at | integer | The Unix timestamp (in seconds) for when the message was marked as incomplete. | Yes | |
incomplete_details | object | On an incomplete message, details about why the message is incomplete. | Yes | |
└─ reason | string | The reason the message is incomplete. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | Yes | |
object | string | The object type, which is always thread.message. | Yes | |
role | string | The entity that produced the message. One of user or assistant. | Yes | |
run_id | string | If applicable, the ID of the run associated with the authoring of this message. | Yes | |
status | string | The status of the message, which can be either in_progress, incomplete, or completed. | Yes | |
thread_id | string | The thread ID that this message belongs to. | Yes | |

object Enum: MessageObjectType

Value | Description |
---|---|
thread.message | The message object type, which is thread.message |

status Enum: MessageObjectStatus

Value | Description |
---|---|
in_progress | |
incomplete | |
completed | |

role Enum: MessageObjectRole

Value | Description |
---|---|
user | |
assistant | |
Represents a message delta, i.e. any changed fields on a message during streaming.

Name | Type | Description | Required | Default |
---|---|---|---|---|
delta | object | The delta containing the fields that have changed on the Message. | Yes | |
└─ content | array | The content of the message as an array of text and/or images. | No | |
└─ role | string | The entity that produced the message. One of user or assistant. | No | |
id | string | The identifier of the message, which can be referenced in API endpoints. | Yes | |
object | string | The object type, which is always thread.message.delta. | Yes | |

object Enum: MessageDeltaObjectType

Value | Description |
---|---|
thread.message.delta | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
attachments | array | A list of files attached to the message, and the tools they should be added to. | No | |
content | string | The content of the message. | Yes | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
role | string | The role of the entity that is creating the message. Allowed values include: user, which indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages; and assistant, which indicates the message is generated by the assistant. Use the assistant value to insert messages from the assistant into the conversation. | Yes | |

role Enum: CreateMessageRequestRole

Value | Description |
---|---|
user | |
assistant | |
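A hypothetical message-creation body following the table above; the file ID and content string are placeholders:

```python
# Illustrative createMessageRequest payload; role must be user or assistant.
create_message = {
    "role": "user",
    "content": "What does the attached file say about Q3 revenue?",
    "attachments": [
        # Attach a file and the tool it should be added to (placeholder ID).
        {"file_id": "assistant-abc123", "tools": [{"type": "file_search"}]}
    ],
    "metadata": {},
}

assert create_message["role"] in ("user", "assistant")
```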
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
deleted | boolean | | Yes | |
id | string | | Yes | |
object | string | | Yes | |

object Enum: DeleteMessageResponseObject

Value | Description |
---|---|
thread.message.deleted | The delete message response object state |
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | | Yes | |
first_id | string | | Yes | |
has_more | boolean | | Yes | |
last_id | string | | Yes | |
object | string | | Yes | |
References an image File in the content of a message.

Name | Type | Description | Required | Default |
---|---|---|---|---|
image_file | object | | Yes | |
└─ file_id | string | The File ID of the image in the message content. | No | |
type | string | Always image_file. | Yes | |

type Enum: MessageContentImageFileObjectType

Value | Description |
---|---|
image_file | The message content image file type |
The text content that is part of a message.

Name | Type | Description | Required | Default |
---|---|---|---|---|
text | object | | Yes | |
└─ annotations | array | | No | |
└─ value | string | The data that makes up the text. | No | |
type | string | Always text. | Yes | |

type Enum: messageContentTextObjectType

Value | Description |
---|---|
text | The message content text object type |
A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "retrieval" tool to search files.
Name | Type | Description | Required | Default |
---|---|---|---|---|
end_index | integer | | Yes | |
file_citation | object | | Yes | |
└─ file_id | string | The ID of the specific File the citation is from. | No | |
start_index | integer | | Yes | |
text | string | The text in the message content that needs to be replaced. | Yes | |
type | string | Always `file_citation`. | Yes | |
type Enum: FileCitationObjectType
Value | Description |
---|---|
file_citation | The file citation object type |
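To make the annotation fields above concrete, here is a hypothetical message text content part carrying one file citation. All IDs, offsets, and text are illustrative placeholders, not real API output:

```python
# Hypothetical message text content part with one file_citation
# annotation, matching the fields in the tables above.
annotated_content = {
    "type": "text",
    "text": {
        "value": "Revenue grew 12% year over year [1].",
        "annotations": [
            {
                "type": "file_citation",
                "text": "[1]",  # the span in `value` that should be replaced
                "start_index": 32,
                "end_index": 35,
                "file_citation": {"file_id": "assistant-abc123"},
            }
        ],
    },
}

# Swap each placeholder span for a human-readable citation marker.
value = annotated_content["text"]["value"]
for ann in annotated_content["text"]["annotations"]:
    value = value.replace(ann["text"], f"(source: {ann['file_citation']['file_id']})")
```

The `start_index`/`end_index` pair locates the same span that the `text` field contains, so clients can replace by offset or by substring.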
A URL for the file that's generated when the assistant used the `code_interpreter` tool to generate a file.
Name | Type | Description | Required | Default |
---|---|---|---|---|
end_index | integer | | Yes | |
file_path | object | | Yes | |
└─ file_id | string | The ID of the file that was generated. | No | |
start_index | integer | | Yes | |
text | string | The text in the message content that needs to be replaced. | Yes | |
type | string | Always `file_path`. | Yes | |
type Enum: FilePathObjectType
Value | Description |
---|---|
file_path | The file path object type |
References an image File in the content of a message.
Name | Type | Description | Required | Default |
---|---|---|---|---|
image_file | object | | No | |
└─ file_id | string | The File ID of the image in the message content. | No | |
index | integer | The index of the content part in the message. | Yes | |
type | string | Always `image_file`. | Yes | |
type Enum: MessageDeltaContentImageFileObjectType
Value | Description |
---|---|
image_file |
The text content that is part of a message.
Name | Type | Description | Required | Default |
---|---|---|---|---|
index | integer | The index of the content part in the message. | Yes | |
text | object | | No | |
└─ annotations | array | | No | |
└─ value | string | The data that makes up the text. | No | |
type | string | Always `text`. | Yes | |
type Enum: MessageDeltaContentTextObjectType
Value | Description |
---|---|
text |
A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.
Name | Type | Description | Required | Default |
---|---|---|---|---|
end_index | integer | | No | |
file_citation | object | | No | |
└─ file_id | string | The ID of the specific File the citation is from. | No | |
└─ quote | string | The specific quote in the file. | No | |
index | integer | The index of the annotation in the text content part. | Yes | |
start_index | integer | | No | |
text | string | The text in the message content that needs to be replaced. | No | |
type | string | Always `file_citation`. | Yes | |
type Enum: MessageDeltaContentTextAnnotationsFileCitationObjectType
Value | Description |
---|---|
file_citation |
A URL for the file that's generated when the assistant used the `code_interpreter` tool to generate a file.
Name | Type | Description | Required | Default |
---|---|---|---|---|
end_index | integer | | No | |
file_path | object | | No | |
└─ file_id | string | The ID of the file that was generated. | No | |
index | integer | The index of the annotation in the text content part. | Yes | |
start_index | integer | | No | |
text | string | The text in the message content that needs to be replaced. | No | |
type | string | Always `file_path`. | Yes | |
type Enum: MessageDeltaContentTextAnnotationsFilePathObjectType
Value | Description |
---|---|
file_path |
Represents a step in execution of a run.
Name | Type | Description | Required | Default |
---|---|---|---|---|
assistant_id | string | The ID of the assistant associated with the run step. | Yes | |
cancelled_at | integer | The Unix timestamp (in seconds) for when the run step was cancelled. | Yes | |
completed_at | integer | The Unix timestamp (in seconds) for when the run step completed. | Yes | |
created_at | integer | The Unix timestamp (in seconds) for when the run step was created. | Yes | |
expired_at | integer | The Unix timestamp (in seconds) for when the run step expired. A step is considered expired if the parent run is expired. | Yes | |
failed_at | integer | The Unix timestamp (in seconds) for when the run step failed. | Yes | |
id | string | The identifier of the run step, which can be referenced in API endpoints. | Yes | |
last_error | object | The last error associated with this run step. Will be `null` if there are no errors. | Yes | |
└─ code | string | One of `server_error` or `rate_limit_exceeded`. | No | |
└─ message | string | A human-readable description of the error. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | Yes | |
object | string | The object type, which is always `assistant.run.step`. | Yes | |
run_id | string | The ID of the run that this run step is a part of. | Yes | |
status | string | The status of the run step, which can be either `in_progress`, `cancelled`, `failed`, `completed`, or `expired`. | Yes | |
step_details | runStepDetailsMessageCreationObject or runStepDetailsToolCallsObject | The details of the run step. | Yes | |
thread_id | string | The ID of the thread that was run. | Yes | |
type | string | The type of run step, which can be either `message_creation` or `tool_calls`. | Yes | |
object Enum: RunStepObjectType
Value | Description |
---|---|
assistant.run.step | The object type, which is always assistant.run.step |
type Enum: RunStepObjectType
Value | Description |
---|---|
message_creation | The message_creation run step |
tool_calls | The tool_calls run step |
status Enum: RunStepObjectStatus
Value | Description |
---|---|
in_progress | The in_progress run status |
cancelled | The cancelled run status |
failed | The failed run status |
completed | The completed run status |
expired | The expired run status |
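Putting the fields and enums above together, a run step object might look like this. The IDs and timestamps below are made-up placeholders for illustration, not real API output:

```python
# Illustrative shape of a run step object, following the field table above.
# All identifier and timestamp values are hypothetical.
run_step = {
    "id": "step_abc123",
    "object": "assistant.run.step",
    "created_at": 1700000000,
    "run_id": "run_abc123",
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "type": "message_creation",
    "status": "completed",
    "cancelled_at": None,
    "completed_at": 1700000010,
    "expired_at": None,
    "failed_at": None,
    "last_error": None,  # null when no error occurred
    "step_details": {
        "type": "message_creation",
        "message_creation": {"message_id": "msg_abc123"},
    },
    "metadata": {},
}

# The status must be one of the five enum values documented above.
assert run_step["status"] in {"in_progress", "cancelled", "failed", "completed", "expired"}
```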
Represents a run step delta i.e. any changed fields on a run step during streaming.
Name | Type | Description | Required | Default |
---|---|---|---|---|
delta | object | The delta containing the fields that have changed on the run step. | Yes | |
└─ step_details | runStepDeltaStepDetailsMessageCreationObject or runStepDeltaStepDetailsToolCallsObject | The details of the run step. | No | |
id | string | The identifier of the run step, which can be referenced in API endpoints. | Yes | |
object | string | The object type, which is always `thread.run.step.delta`. | Yes | |
object Enum: RunStepDeltaObjectType
Value | Description |
---|---|
thread.run.step.delta |
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | | Yes | |
first_id | string | | Yes | |
has_more | boolean | | Yes | |
last_id | string | | Yes | |
object | string | | Yes | |
Details of the message creation by the run step.
Name | Type | Description | Required | Default |
---|---|---|---|---|
message_creation | object | | Yes | |
└─ message_id | string | The ID of the message that was created by this run step. | No | |
type | string | Always `message_creation`. | Yes | |
type Enum: RunStepDetailsMessageCreationObjectType
Value | Description |
---|---|
message_creation |
Details of the message creation by the run step.
Name | Type | Description | Required | Default |
---|---|---|---|---|
message_creation | object | | No | |
└─ message_id | string | The ID of the message that was created by this run step. | No | |
type | string | Always `message_creation`. | Yes | |
type Enum: RunStepDeltaStepDetailsMessageCreationObjectType
Value | Description |
---|---|
message_creation |
Details of the tool call.
Name | Type | Description | Required | Default |
---|---|---|---|---|
tool_calls | array | An array of tool calls the run step was involved in. These can be associated with one of three types of tools: `code_interpreter`, `retrieval`, or `function`. | Yes | |
type | string | Always `tool_calls`. | Yes | |
type Enum: RunStepDetailsToolCallsObjectType
Value | Description |
---|---|
tool_calls |
Details of the tool call.
Name | Type | Description | Required | Default |
---|---|---|---|---|
tool_calls | array | An array of tool calls the run step was involved in. These can be associated with one of three types of tools: `code_interpreter`, `file_search`, or `function`. | No | |
type | string | Always `tool_calls`. | Yes | |
type Enum: RunStepDeltaStepDetailsToolCallsObjectType
Value | Description |
---|---|
tool_calls |
Details of the Code Interpreter tool call the run step was involved in.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code_interpreter | object | The Code Interpreter tool call definition. | Yes | |
└─ input | string | The input to the Code Interpreter tool call. | No | |
└─ outputs | array | The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (`logs`) or images (`image`). Each of these is represented by a different object type. | No | |
id | string | The ID of the tool call. | Yes | |
type | string | The type of tool call. This is always going to be `code_interpreter` for this type of tool call. | Yes | |
type Enum: RunStepDetailsToolCallsCodeObjectType
Value | Description |
---|---|
code_interpreter |
Details of the Code Interpreter tool call the run step was involved in.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code_interpreter | object | The Code Interpreter tool call definition. | No | |
└─ input | string | The input to the Code Interpreter tool call. | No | |
└─ outputs | array | The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (`logs`) or images (`image`). Each of these is represented by a different object type. | No | |
id | string | The ID of the tool call. | No | |
index | integer | The index of the tool call in the tool calls array. | Yes | |
type | string | The type of tool call. This is always going to be `code_interpreter` for this type of tool call. | Yes | |
type Enum: RunStepDeltaStepDetailsToolCallsCodeObjectType
Value | Description |
---|---|
code_interpreter |
Text output from the Code Interpreter tool call as part of a run step.
Name | Type | Description | Required | Default |
---|---|---|---|---|
logs | string | The text output from the Code Interpreter tool call. | Yes | |
type | string | Always `logs`. | Yes | |
type Enum: RunStepDetailsToolCallsCodeOutputLogsObjectType
Value | Description |
---|---|
logs |
Text output from the Code Interpreter tool call as part of a run step.
Name | Type | Description | Required | Default |
---|---|---|---|---|
index | integer | The index of the output in the outputs array. | Yes | |
logs | string | The text output from the Code Interpreter tool call. | No | |
type | string | Always `logs`. | Yes | |
type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputLogsObjectType
Value | Description |
---|---|
logs |
Name | Type | Description | Required | Default |
---|---|---|---|---|
image | object | | Yes | |
└─ file_id | string | The File ID of the image. | No | |
type | string | Always `image`. | Yes | |
type Enum: RunStepDetailsToolCallsCodeOutputImageObjectType
Value | Description |
---|---|
image |
Name | Type | Description | Required | Default |
---|---|---|---|---|
image | object | | No | |
└─ file_id | string | The file ID of the image. | No | |
index | integer | The index of the output in the outputs array. | Yes | |
type | string | Always `image`. | Yes | |
type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputImageObject
Value | Description |
---|---|
image |
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_search | object | For now, this is always going to be an empty object. | Yes | |
└─ results | array | The results of the file search. | No | |
id | string | The ID of the tool call object. | Yes | |
type | string | The type of tool call. This is always going to be `file_search` for this type of tool call. | Yes | |
type Enum: RunStepDetailsToolCallsFileSearchObjectType
Value | Description |
---|---|
file_search |
A result instance of the file search.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | array | The content of the result that was found. The content is only included if requested via the `include` query parameter. | No | |
file_id | string | The ID of the file that the result was found in. | Yes | |
file_name | string | The name of the file that the result was found in. | Yes | |
score | number | The score of the result. All values must be a floating point number between 0 and 1. | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_search | object | For now, this is always going to be an empty object. | Yes | |
id | string | The ID of the tool call object. | No | |
index | integer | The index of the tool call in the tool calls array. | Yes | |
type | string | The type of tool call. This is always going to be `file_search` for this type of tool call. | Yes | |
type Enum: RunStepDeltaStepDetailsToolCallsFileSearchObjectType
Value | Description |
---|---|
file_search |
Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | The definition of the function that was called. | Yes | |
└─ arguments | string | The arguments passed to the function. | No | |
└─ name | string | The name of the function. | No | |
└─ output | string | The output of the function. This will be `null` if the outputs have not been submitted yet. | No | |
id | string | The ID of the tool call object. | Yes | |
type | string | The type of tool call. This is always going to be `function` for this type of tool call. | Yes | |
type Enum: RunStepDetailsToolCallsFunctionObjectType
Value | Description |
---|---|
function |
Name | Type | Description | Required | Default |
---|---|---|---|---|
function | object | The definition of the function that was called. | No | |
└─ arguments | string | The arguments passed to the function. | No | |
└─ name | string | The name of the function. | No | |
└─ output | string | The output of the function. This will be `null` if the outputs have not been submitted yet. | No | |
id | string | The ID of the tool call object. | No | |
index | integer | The index of the tool call in the tool calls array. | Yes | |
type | string | The type of tool call. This is always going to be `function` for this type of tool call. | Yes | |
type Enum: RunStepDeltaStepDetailsToolCallsFunctionObjectType
Value | Description |
---|---|
function |
The expiration policy for a vector store.
Name | Type | Description | Required | Default |
---|---|---|---|---|
anchor | string | Anchor timestamp after which the expiration policy applies. Supported anchors: `last_active_at`. | Yes | |
days | integer | The number of days after the anchor time that the vector store will expire. | Yes | |
anchor Enum: VectorStoreExpirationAfterAnchor
Value | Description |
---|---|
last_active_at | The anchor timestamp after which the expiration policy applies. |
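As a concrete illustration of the two fields above, a request payload could attach an expiration policy that expires the vector store seven days after it was last active. The store name and day count here are made up:

```python
# Hypothetical request body fragment using vectorStoreExpirationAfter:
# the store expires 7 days after the `last_active_at` anchor timestamp.
payload = {
    "name": "product-docs",
    "expires_after": {
        "anchor": "last_active_at",  # the only supported anchor
        "days": 7,
    },
}
```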
A vector store is a collection of processed files that can be used by the `file_search` tool.
Name | Type | Description | Required | Default |
---|---|---|---|---|
created_at | integer | The Unix timestamp (in seconds) for when the vector store was created. | Yes | |
expires_after | vectorStoreExpirationAfter | The expiration policy for a vector store. | No | |
expires_at | integer | The Unix timestamp (in seconds) for when the vector store will expire. | No | |
file_counts | object | | Yes | |
└─ cancelled | integer | The number of files that were cancelled. | No | |
└─ completed | integer | The number of files that have been successfully processed. | No | |
└─ failed | integer | The number of files that have failed to process. | No | |
└─ in_progress | integer | The number of files that are currently being processed. | No | |
└─ total | integer | The total number of files. | No | |
id | string | The identifier, which can be referenced in API endpoints. | Yes | |
last_active_at | integer | The Unix timestamp (in seconds) for when the vector store was last active. | Yes | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | Yes | |
name | string | The name of the vector store. | Yes | |
object | enum | The object type, which is always `vector_store`. Possible values: `vector_store` | Yes | |
status | string | The status of the vector store, which can be either `expired`, `in_progress`, or `completed`. A status of `completed` indicates that the vector store is ready for use. | Yes | |
usage_bytes | integer | The total number of bytes used by the files in the vector store. | Yes | |
status Enum: VectorStoreObjectStatus
Value | Description |
---|---|
expired | |
in_progress | |
completed |
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunking_strategy | autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam | The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is non-empty. | No | |
expires_after | vectorStoreExpirationAfter | The expiration policy for a vector store. | No | |
file_ids | array | A list of file IDs that the vector store should use. Useful for tools like `file_search` that can access files. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
name | string | The name of the vector store. | No | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
expires_after | vectorStoreExpirationAfter | The expiration policy for a vector store. | No | |
metadata | object | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. | No | |
name | string | The name of the vector store. | No | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | | Yes | |
first_id | string | | Yes | |
has_more | boolean | | Yes | |
last_id | string | | Yes | |
object | string | | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
deleted | boolean | | Yes | |
id | string | | Yes | |
object | string | | Yes | |
object Enum: DeleteVectorStoreResponseObject
Value | Description |
---|---|
vector_store.deleted | The delete vector store response object state |
A list of files attached to a vector store.
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunking_strategy | autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam | The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is non-empty. | No | |
created_at | integer | The Unix timestamp (in seconds) for when the vector store file was created. | Yes | |
id | string | The identifier, which can be referenced in API endpoints. | Yes | |
last_error | object | The last error associated with this vector store file. Will be `null` if there are no errors. | Yes | |
└─ code | string | One of `server_error`, `invalid_file`, or `unsupported_file`. | No | |
└─ message | string | A human-readable description of the error. | No | |
object | string | The object type, which is always `vector_store.file`. | Yes | |
status | string | The status of the vector store file, which can be either `in_progress`, `completed`, `cancelled`, or `failed`. The status `completed` indicates that the vector store file is ready for use. | Yes | |
usage_bytes | integer | The total vector store usage in bytes. Note that this may be different from the original file size. | Yes | |
vector_store_id | string | The ID of the vector store that the file is attached to. | Yes | |
object Enum: VectorStoreFileObjectType
Value | Description |
---|---|
vector_store.file |
status Enum: VectorStoreFileObjectStatus
Value | Description |
---|---|
in_progress | |
completed | |
cancelled | |
failed |
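Clients typically poll a vector store file until it leaves `in_progress`. A minimal sketch of that status handling, using the enum above (the function name and return strings are hypothetical, not part of the API):

```python
# Sketch: map a vector store file's status to a next action, per the
# status enum above. `completed` is the only ready state; `cancelled`
# and `failed` are terminal errors, with details in `last_error`.
def next_action(vs_file: dict) -> str:
    status = vs_file["status"]
    if status == "in_progress":
        return "poll-again"
    if status == "completed":
        return "ready"
    # cancelled or failed: surface the error code if one is present
    err = vs_file.get("last_error") or {}
    return f"error:{err.get('code', status)}"
```

A polling loop would call this after each GET of the file object and stop on anything other than `"poll-again"`.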
This is returned when the chunking strategy is unknown. Typically, this is because the file was indexed before the `chunking_strategy` concept was introduced in the API.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | string | Always `other`. | Yes | |
type Enum: OtherChunkingStrategyResponseParamType
Value | Description |
---|---|
other |
Name | Type | Description | Required | Default |
---|---|---|---|---|
static | staticChunkingStrategy | | Yes | |
type | string | Always `static`. | Yes | |
type Enum: StaticChunkingStrategyResponseParamType
Value | Description |
---|---|
static |
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunk_overlap_tokens | integer | The number of tokens that overlap between chunks. The default value is `400`. Note that the overlap must not exceed half of `max_chunk_size_tokens`. | Yes | |
max_chunk_size_tokens | integer | The maximum number of tokens in each chunk. The default value is `800`. The minimum value is `100` and the maximum value is `4096`. | Yes | |
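The two constraints on static chunking can be checked client-side before sending a request. A minimal sketch, with a hypothetical validator name:

```python
# Validate staticChunkingStrategy parameters per the rules above:
# max_chunk_size_tokens must be 100-4096, and chunk_overlap_tokens
# must not exceed half of max_chunk_size_tokens.
def validate_static_chunking(max_chunk_size_tokens: int, chunk_overlap_tokens: int) -> None:
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096")
    if chunk_overlap_tokens > max_chunk_size_tokens // 2:
        raise ValueError("chunk_overlap_tokens must not exceed half of max_chunk_size_tokens")

validate_static_chunking(800, 400)  # the documented defaults satisfy both rules
```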
The default strategy. This strategy currently uses a `max_chunk_size_tokens` of `800` and `chunk_overlap_tokens` of `400`.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | Always `auto`. Possible values: `auto` | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
static | staticChunkingStrategy | | Yes | |
type | enum | Always `static`. Possible values: `static` | Yes | |
The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.
This component can be one of the following:
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunking_strategy | chunkingStrategyRequestParam | The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. | No | |
file_id | string | A File ID that the vector store should use. Useful for tools like `file_search` that can access files. | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | | Yes | |
first_id | string | | Yes | |
has_more | boolean | | Yes | |
last_id | string | | Yes | |
object | string | | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
deleted | boolean | | Yes | |
id | string | | Yes | |
object | string | | Yes | |
object Enum: DeleteVectorStoreFileResponseObject
Value | Description |
---|---|
vector_store.file.deleted |
A batch of files attached to a vector store.
Name | Type | Description | Required | Default |
---|---|---|---|---|
created_at | integer | The Unix timestamp (in seconds) for when the vector store files batch was created. | Yes | |
file_counts | object | | Yes | |
└─ cancelled | integer | The number of files that were cancelled. | No | |
└─ completed | integer | The number of files that have been processed. | No | |
└─ failed | integer | The number of files that have failed to process. | No | |
└─ in_progress | integer | The number of files that are currently being processed. | No | |
└─ total | integer | The total number of files. | No | |
id | string | The identifier, which can be referenced in API endpoints. | Yes | |
object | string | The object type, which is always `vector_store.file_batch`. | Yes | |
status | string | The status of the vector store files batch, which can be either `in_progress`, `completed`, `cancelled`, or `failed`. | Yes | |
vector_store_id | string | The ID of the vector store that the File is attached to. | Yes | |
object Enum: VectorStoreFileBatchObjectType
Value | Description |
---|---|
vector_store.files_batch |
status Enum: VectorStoreFileBatchObjectStatus
Value | Description |
---|---|
in_progress | |
completed | |
cancelled | |
failed |
Name | Type | Description | Required | Default |
---|---|---|---|---|
chunking_strategy | chunkingStrategyRequestParam | The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. | No | |
file_ids | array | A list of File IDs that the vector store should use. Useful for tools like `file_search` that can access files. | Yes | |
Represents an event emitted when streaming a Run.
Each event in a server-sent events stream has an `event` and `data` property:
event: thread.created
data: {"id": "thread_123", "object": "thread", ...}
We emit events whenever a new object is created, transitions to a new state, or is being streamed in parts (deltas). For example, we emit `thread.run.created` when a new run is created, `thread.run.completed` when a run completes, and so on. When an Assistant chooses to create a message during a run, we emit a `thread.message.created` event, a `thread.message.in_progress` event, many `thread.message.delta` events, and finally a `thread.message.completed` event.
We may add additional events over time, so we recommend handling unknown events gracefully in your code.
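That guidance can be sketched as a small SSE parser that decodes `event:`/`data:` pairs and silently skips event types it does not recognize. The event list below is a small illustrative subset, not the full set:

```python
import json

# Small illustrative subset of stream event names; real streams carry many more.
KNOWN_EVENTS = {
    "thread.created",
    "thread.run.created",
    "thread.message.delta",
    "thread.message.completed",
}

def parse_sse(stream_text: str):
    """Yield (event, parsed_data) pairs from raw SSE text.

    Unknown event types are skipped rather than raised, per the
    guidance above to handle new events gracefully.
    """
    event, data = None, None
    for line in stream_text.splitlines():
        if line.startswith("event: "):
            event = line[len("event: "):]
        elif line.startswith("data: "):
            data = line[len("data: "):]
        elif line == "" and event is not None:  # blank line ends one event
            if event in KNOWN_EVENTS and data not in (None, "[DONE]"):
                yield event, json.loads(data)
            event, data = None, None
```

For example, feeding it `event: thread.created\ndata: {"id": "thread_123"}\n\n` yields one `("thread.created", {...})` pair, while an unrecognized event name produces nothing instead of an error.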
This component can be one of the following:
This component can be one of the following:
Occurs when a new thread is created.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | threadObject | Represents a thread that contains messages. | Yes | |
event | string | Yes |
Data: threadObject
Event Enum: ThreadStreamEventEnum
Value | Description |
---|---|
thread.created | The thread created event |
This component can be one of the following:
Occurs when a new run is created.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventCreated
Value | Description |
---|---|
thread.run.created |
Occurs when a run moves to a `queued` status.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventQueued
Value | Description |
---|---|
thread.run.queued |
Occurs when a run moves to an `in_progress` status.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventInProgress
Value | Description |
---|---|
thread.run.in_progress |
Occurs when a run moves to a `requires_action` status.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventRequiresAction
Value | Description |
---|---|
thread.run.requires_action |
Occurs when a run is completed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventCompleted
Value | Description |
---|---|
thread.run.completed |
Occurs when a run fails.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventFailed
Value | Description |
---|---|
thread.run.failed |
Occurs when a run moves to a `cancelling` status.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventCancelling
Value | Description |
---|---|
thread.run.cancelling |
Occurs when a run is cancelled.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventCancelled
Value | Description |
---|---|
thread.run.cancelled |
Occurs when a run expires.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runObject | Represents an execution run on a thread. | Yes | |
event | string | Yes |
Data: runObject
Event Enum: RunStreamEventExpired
Value | Description |
---|---|
thread.run.expired |
This component can be one of the following:
Occurs when a run step is created.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runStepObject | Represents a step in execution of a run. | Yes | |
event | string | | Yes | |
Data: runStepObject
Event Enum: RunStepStreamEventCreated
Value | Description |
---|---|
thread.run.step.created |
Occurs when a run step moves to an `in_progress` state.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runStepObject | Represents a step in execution of a run. | Yes | |
event | string | | Yes | |
Data: runStepObject
Event Enum: RunStepStreamEventInProgress
Value | Description |
---|---|
thread.run.step.in_progress |
Occurs when parts of a run step are being streamed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runStepDeltaObject | Represents a run step delta i.e. any changed fields on a run step during streaming. | Yes | |
event | string | | Yes | |
Data: runStepDeltaObject
Event Enum: RunStepStreamEventDelta
Value | Description |
---|---|
thread.run.step.delta |
Occurs when a run step is completed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runStepObject | Represents a step in execution of a run. | Yes | |
event | string | | Yes | |
Data: runStepObject
Event Enum: RunStepStreamEventCompleted
Value | Description |
---|---|
thread.run.step.completed |
Occurs when a run step fails.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runStepObject | Represents a step in execution of a run. | Yes | |
event | string | | Yes | |
Data: runStepObject
Event Enum: RunStepStreamEventFailed
Value | Description |
---|---|
thread.run.step.failed |
Occurs when a run step is cancelled.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runStepObject | Represents a step in execution of a run. | Yes | |
event | string | | Yes | |
Data: runStepObject
Event Enum: RunStepStreamEventCancelled
Value | Description |
---|---|
thread.run.step.cancelled |
Occurs when a run step expires.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | runStepObject | Represents a step in execution of a run. | Yes | |
event | string | | Yes | |
Data: runStepObject
Event Enum: RunStepStreamEventExpired
Value | Description |
---|---|
thread.run.step.expired |
This component can be one of the following:
Occurs when a message is created.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | messageObject | Represents a message within a thread. | Yes | |
event | string | Yes |
Data: messageObject
Event Enum: MessageStreamEventCreated
Value | Description |
---|---|
thread.message.created |
Occurs when a message moves to an `in_progress` state.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | messageObject | Represents a message within a thread. | Yes | |
event | string | Yes |
Data: messageObject
Event Enum: MessageStreamEventInProgress
Value | Description |
---|---|
thread.message.in_progress |
Occurs when parts of a message are being streamed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | messageDeltaObject | Represents a message delta i.e. any changed fields on a message during streaming. | Yes | |
event | string | | Yes | |
Data: messageDeltaObject
Event Enum: MessageStreamEventDelta
Value | Description |
---|---|
thread.message.delta |
Occurs when a message is completed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | messageObject | Represents a message within a thread. | Yes | |
event | string | Yes |
Data: messageObject
Event Enum: MessageStreamEventCompleted
Value | Description |
---|---|
thread.message.completed |
Occurs when a message ends before it is completed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | messageObject | Represents a message within a thread. | Yes | |
event | string | Yes |
Data: messageObject
Event Enum: MessageStreamEventIncomplete
Value | Description |
---|---|
thread.message.incomplete |
This component can be one of the following:
A click action.
Name | Type | Description | Required | Default |
---|---|---|---|---|
button | enum | Indicates which mouse button was pressed during the click. One of left, right, wheel, back, or forward. Possible values: left, right, wheel, back, forward | Yes | |
type | enum | Specifies the event type. For a click action, this property is always set to click. Possible values: click | Yes | |
x | integer | The x-coordinate where the click occurred. | Yes | |
y | integer | The y-coordinate where the click occurred. | Yes | |
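As an illustration, a click action with the fields from the table above can be represented as a plain Python dict; the coordinate values here are made up:

```python
# A minimal sketch of a "click" computer action payload, using only the
# fields documented above (type, button, x, y). Coordinates are illustrative.
click_action = {
    "type": "click",    # always "click" for this action
    "button": "left",   # one of: left, right, wheel, back, forward
    "x": 150,           # x-coordinate of the click
    "y": 320,           # y-coordinate of the click
}
```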
The output of a code interpreter tool call that is a file.
Name | Type | Description | Required | Default |
---|---|---|---|---|
files | array | | Yes | |
type | enum | The type of the code interpreter file output. Always files. Possible values: files | Yes | |
The output of a code interpreter tool call that is text.
Name | Type | Description | Required | Default |
---|---|---|---|---|
logs | string | The logs of the code interpreter tool call. | Yes | |
type | enum | The type of the code interpreter text output. Always logs. Possible values: logs | Yes | |
A tool that runs code.
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_ids | array | The IDs of the files to run the code on. | Yes | |
type | enum | The type of the code interpreter tool. Always code_interpreter. Possible values: code_interpreter | Yes | |
A tool call to run code.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | string | The code to run. | Yes | |
id | string | The unique ID of the code interpreter tool call. | Yes | |
results | array | The results of the code interpreter tool call. | Yes | |
status | enum | The status of the code interpreter tool call. Possible values: in_progress, interpreting, completed | Yes | |
type | enum | The type of the code interpreter tool call. Always code_interpreter_call. Possible values: code_interpreter_call | Yes | |
This component can be one of the following:
A filter used to compare a specified attribute key to a given value using a defined comparison operation.
Name | Type | Description | Required | Default |
---|---|---|---|---|
key | string | The key to compare against the value. | Yes | |
type | enum | Specifies the comparison operator: eq (equals), ne (not equal), gt (greater than), gte (greater than or equal), lt (less than), lte (less than or equal). Possible values: eq, ne, gt, gte, lt, lte | Yes | |
value | string or number or boolean | The value to compare against the attribute key; supports string, number, or boolean types. | Yes |
Combine multiple filters using and or or.
Name | Type | Description | Required | Default |
---|---|---|---|---|
filters | array | Array of filters to combine. Items can be ComparisonFilter or CompoundFilter. | Yes | |
type | enum | Type of operation: and or or. Possible values: and, or | Yes | |
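A sketch of how these two filter shapes compose, with illustrative attribute keys and values (the `author` and `published_year` attributes are assumptions, not part of the schema):

```python
# Two ComparisonFilters (key, type, value) AND-combined by a CompoundFilter
# (type, filters), using only the fields documented in the tables above.
author_filter = {"type": "eq", "key": "author", "value": "jane"}
year_filter = {"type": "gte", "key": "published_year", "value": 2023}

compound_filter = {
    "type": "and",  # "and" or "or"
    "filters": [author_filter, year_filter],
}
```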
This component can be one of the following:
A computer screenshot image used with the computer use tool.
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_id | string | The identifier of an uploaded file that contains the screenshot. | No | |
image_url | string | The URL of the screenshot image. | No | |
type | enum | Specifies the event type. For a computer screenshot, this property is always set to computer_screenshot. Possible values: computer_screenshot | Yes | |
A tool that controls a virtual computer.
Name | Type | Description | Required | Default |
---|---|---|---|---|
display_height | number | The height of the computer display. | Yes | |
display_width | number | The width of the computer display. | Yes | |
environment | enum | The type of computer environment to control. Possible values: mac, windows, ubuntu, browser | Yes | |
type | enum | The type of the computer use tool. Always computer_use_preview. Possible values: computer_use_preview | Yes | |
A tool call to a computer use tool.
Name | Type | Description | Required | Default |
---|---|---|---|---|
action | ComputerAction | | Yes | |
call_id | string | An identifier used when responding to the tool call with output. | Yes | |
id | string | The unique ID of the computer call. | Yes | |
pending_safety_checks | array | The pending safety checks for the computer call. | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | Yes | |
type | enum | The type of the computer call. Always computer_call. Possible values: computer_call | Yes | |
The output of a computer tool call.
Name | Type | Description | Required | Default |
---|---|---|---|---|
acknowledged_safety_checks | array | The safety checks reported by the API that have been acknowledged by the developer. | No | |
call_id | string | The ID of the computer tool call that produced the output. | Yes | |
id | string | The ID of the computer tool call output. | No | |
output | ComputerScreenshotImage | A computer screenshot image used with the computer use tool. | Yes | |
status | enum | The status of the message input. One of in_progress, completed, or incomplete. Populated when input items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the computer tool call output. Always computer_call_output. Possible values: computer_call_output | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
acknowledged_safety_checks | array | The safety checks reported by the API that have been acknowledged by the developer. | No | |
call_id | string | The ID of the computer tool call that produced the output. | Yes | |
id | string | The unique ID of the computer call tool output. | Yes | |
output | ComputerScreenshotImage | A computer screenshot image used with the computer use tool. | Yes | |
status | enum | The status of the message input. One of in_progress, completed, or incomplete. Populated when input items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the computer tool call output. Always computer_call_output. Possible values: computer_call_output | Yes | |
A pending safety check for the computer call.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | string | The type of the pending safety check. | Yes | |
id | string | The ID of the pending safety check. | Yes | |
message | string | Details about the pending safety check. | Yes |
Multi-modal input and output contents.
This component can be one of the following:
An x/y coordinate pair, e.g. { x: 100, y: 200 }.
Name | Type | Description | Required | Default |
---|---|---|---|---|
x | integer | The x-coordinate. | Yes | |
y | integer | The y-coordinate. | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | Metadata | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters. | No | |
model | string | Model used to generate the responses. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
user | string | A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. | No |
Name | Type | Description | Required | Default |
---|---|---|---|---|
include | array | Specify additional output data to include in the model response. See Includable for the currently supported values. | No | |
input | string or array | Text, image, or file inputs to the model, used to generate a response. Learn more: - Text inputs and outputs - Image inputs - File inputs - Conversation state - Function calling | Yes | |
instructions | string | Inserts a system (or developer) message as the first item in the model's context. When used along with previous_response_id, the instructions from a previous response will not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses. | No | |
max_output_tokens | integer | An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens. | No | |
parallel_tool_calls | boolean | Whether to allow the model to run tool calls in parallel. | No | True |
previous_response_id | string | The unique ID of the previous response to the model. Use this to create multi-turn conversations. Learn more about Conversation state. | No | |
reasoning | Reasoning | o-series models only. Configuration options for reasoning models. | No | |
store | boolean | Whether to store the generated model response for later retrieval via API. | No | True |
stream | boolean | If set to true, the model response data will be streamed to the client as it is generated using server-sent events. | No | False |
text | object | Configuration options for a text response from the model. Can be plain text or structured JSON data. Learn more: - Text inputs and outputs - Structured Outputs | No | |
└─ format | TextResponseFormatConfiguration | An object specifying the format that the model must output. Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is { "type": "text" } with no additional options. Not recommended for gpt-4o and newer models: setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it. | No | |
tool_choice | ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction | How the model should select which tool (or tools) to use when generating a response. See the tools parameter to see how to specify which tools the model can call. | No | |
tools | array | An array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter. The two categories of tools you can provide the model are: - Built-in tools: tools provided by OpenAI that extend the model's capabilities, such as file search. - Function calls (custom tools): functions that are defined by you, enabling the model to call your own code. Learn more about function calling. | No | |
truncation | enum | The truncation strategy to use for the model response. - auto: if the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. - disabled (default): if a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: auto, disabled | No | |
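A minimal sketch of a request body built from the table above; the deployment name, input text, and token limit are illustrative assumptions, not required values:

```python
# Hedged sketch of a response-creation request body. Only "model" and "input"
# are required; the other fields show optional parameters from the table.
create_response_body = {
    "model": "my-gpt-4o-deployment",  # placeholder deployment name
    "input": "Summarize the benefits of unit testing in two sentences.",
    "instructions": "You are a concise technical writer.",
    "max_output_tokens": 200,         # upper bound incl. reasoning tokens
    "temperature": 0.2,               # favor focused, deterministic output
    "stream": False,                  # no server-sent events
    "truncation": "disabled",         # fail with 400 rather than drop context
}
```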
A double click action.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | Specifies the event type. For a double click action, this property is always set to double_click. Possible values: double_click | Yes | |
x | integer | The x-coordinate where the double click occurred. | Yes | |
y | integer | The y-coordinate where the double click occurred. | Yes | |
A drag action.
Name | Type | Description | Required | Default |
---|---|---|---|---|
path | array | An array of coordinates representing the path of the drag action. Coordinates will appear as an array of objects, e.g. [ { x: 100, y: 200 }, { x: 200, y: 300 } ] | Yes | |
type | enum | Specifies the event type. For a drag action, this property is always set to drag. Possible values: drag | Yes | |
A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | string or InputMessageContentList | Text, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses. | Yes | |
role | enum | The role of the message input. One of user, assistant, system, or developer. Possible values: user, assistant, system, developer | Yes | |
type | enum | The type of the message input. Always message. Possible values: message | No | |
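For instance, a user turn with plain string content can be written as follows (the question text is illustrative):

```python
# Hedged sketch of a message input: a user turn with string content,
# using only the fields documented in the table above.
user_message = {
    "type": "message",  # optional; always "message" when present
    "role": "user",     # user, assistant, system, or developer
    "content": "What is the difference between temperature and top_p?",
}
```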
A citation to a file.
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_id | string | The ID of the file. | Yes | |
index | integer | The index of the file in the list of files. | Yes | |
type | enum | The type of the file citation. Always file_citation. Possible values: file_citation | Yes | |
A path to a file.
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_id | string | The ID of the file. | Yes | |
index | integer | The index of the file in the list of files. | Yes | |
type | enum | The type of the file path. Always file_path. Possible values: file_path | Yes | |
The ranker to use for the file search. If not specified, the auto ranker is used.
Property | Value |
---|---|
Description | The ranker to use for the file search. If not specified, the auto ranker is used. |
Type | string |
Values | auto , default_2024_08_21 |
A tool that searches for relevant content from uploaded files.
Name | Type | Description | Required | Default |
---|---|---|---|---|
filters | ComparisonFilter or CompoundFilter | A filter to apply based on file attributes. | No | |
max_num_results | integer | The maximum number of results to return. This number should be between 1 and 50 inclusive. | No | |
ranking_options | object | Ranking options for search. | No | |
└─ ranker | enum | The ranker to use for the file search. Possible values: auto, default-2024-11-15 | No | |
└─ score_threshold | number | The score threshold for the file search, a number between 0 and 1. Numbers closer to 1 will attempt to return only the most relevant results, but may return fewer results. | No | 0 |
type | enum | The type of the file search tool. Always file_search. Possible values: file_search | Yes | |
vector_store_ids | array | The IDs of the vector stores to search. | Yes | |
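A sketch of a file search tool configuration using the fields above; the vector store ID is a placeholder assumption:

```python
# Hedged sketch of a file_search tool definition. "vs_example123" is a
# hypothetical vector store ID.
file_search_tool = {
    "type": "file_search",
    "vector_store_ids": ["vs_example123"],  # placeholder vector store ID
    "max_num_results": 10,                  # must be between 1 and 50 inclusive
    "ranking_options": {
        "ranker": "auto",
        "score_threshold": 0.5,             # 0..1; higher = stricter relevance
    },
}
```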
The results of a file search tool call.
Name | Type | Description | Required | Default |
---|---|---|---|---|
id | string | The unique ID of the file search tool call. | Yes | |
queries | array | The queries used to search for files. | Yes | |
results | array | The results of the file search tool call. | No | |
status | enum | The status of the file search tool call. One of in_progress, searching, completed, incomplete, or failed. Possible values: in_progress, searching, completed, incomplete, failed | Yes | |
type | enum | The type of the file search tool call. Always file_search_call. Possible values: file_search_call | Yes | |
Defines a function in your own code the model can choose to call. Learn more about function calling.
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | A description of the function. Used by the model to determine whether or not to call the function. | No | |
name | string | The name of the function to call. | Yes | |
parameters | object | A JSON schema object describing the parameters of the function. | Yes | |
strict | boolean | Whether to enforce strict parameter validation. Default true. | Yes | |
type | enum | The type of the function tool. Always function. Possible values: function | Yes | |
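A sketch of a custom function tool definition; the function name and its parameter schema are purely illustrative:

```python
# Hedged sketch of a function tool. "get_weather" and its "city" parameter
# are hypothetical; "parameters" is a standard JSON Schema object.
get_weather_tool = {
    "type": "function",
    "name": "get_weather",  # hypothetical function name
    "description": "Get the current weather for a city.",
    "strict": True,         # enforce strict parameter validation
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
        },
        "required": ["city"],
        "additionalProperties": False,
    },
}
```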
A tool call to run a function.
Name | Type | Description | Required | Default |
---|---|---|---|---|
arguments | string | A JSON string of the arguments to pass to the function. | Yes | |
call_id | string | The unique ID of the function tool call generated by the model. | Yes | |
id | string | The unique ID of the function tool call. | Yes | |
name | string | The name of the function to run. | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the function tool call. Always function_call. Possible values: function_call | Yes | |
The output of a function tool call.
Name | Type | Description | Required | Default |
---|---|---|---|---|
call_id | string | The unique ID of the function tool call generated by the model. | Yes | |
id | string | The unique ID of the function tool call output. Populated when this item is returned via API. | No | |
output | string | A JSON string of the output of the function tool call. | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the function tool call output. Always function_call_output. Possible values: function_call_output | Yes | |
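A sketch of returning a function result to the model; the call_id is a hypothetical value that would be echoed from the model's function_call item, and note that output must be a JSON string rather than a raw object:

```python
import json

# Hedged sketch of a function_call_output item. "call_abc123" is a
# hypothetical call ID; the result is serialized to a JSON string.
function_call_output = {
    "type": "function_call_output",
    "call_id": "call_abc123",                       # hypothetical call ID
    "output": json.dumps({"temperature_c": 21.5}),  # must be a JSON string
}
```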
Name | Type | Description | Required | Default |
---|---|---|---|---|
call_id | string | The unique ID of the function tool call generated by the model. | Yes | |
id | string | The unique ID of the function call tool output. | Yes | |
output | string | A JSON string of the output of the function tool call. | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the function tool call output. Always function_call_output. Possible values: function_call_output | Yes | |
Specify additional output data to include in the model response. Currently supported values are:
- file_search_call.results : Include the search results of the file search tool call.
- message.input_image.image_url : Include image URLs from the input message.
- computer_call_output.output.image_url : Include image URLs from the computer call output.
Property | Value |
---|---|
Description | Specify additional output data to include in the model response. Currently supported values are: - file_search_call.results : Include the search results of the file search tool call. - message.input_image.image_url : Include image URLs from the input message. - computer_call_output.output.image_url : Include image URLs from the computer call output. |
Type | string |
Values | file_search_call.results message.input_image.image_url computer_call_output.output.image_url |
An audio input to the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | string | Base64-encoded audio data. | Yes | |
format | enum | The format of the audio data. Currently supported formats are mp3 and wav. Possible values: mp3, wav | Yes | |
type | enum | The type of the input item. Always input_audio. Possible values: input_audio | Yes | |
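A sketch of wrapping raw audio bytes as an input_audio content item; the byte string here is a stand-in, not a playable WAV file:

```python
import base64

# Hedged sketch of an input_audio item. raw_wav_bytes is a stand-in for
# real audio data; real payloads would be actual WAV or MP3 bytes.
raw_wav_bytes = b"RIFF....WAVEfmt "  # placeholder bytes
input_audio = {
    "type": "input_audio",
    "format": "wav",  # "mp3" or "wav"
    "data": base64.b64encode(raw_wav_bytes).decode("ascii"),
}
```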
This component can be one of the following:
A file input to the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
file_data | string | The content of the file to be sent to the model. | No | |
file_id | string | The ID of the file to be sent to the model. | No | |
filename | string | The name of the file to be sent to the model. | No | |
type | enum | The type of the input item. Always input_file. Possible values: input_file | Yes | |
An image input to the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
detail | enum | The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto. Possible values: high, low, auto | Yes | |
file_id | string | The ID of the file to be sent to the model. | No | |
image_url | string | The URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL. | No | |
type | enum | The type of the input item. Always input_image. Possible values: input_image | Yes | |
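A sketch of an input_image item using the base64 data-URL form of image_url; the byte string is a stand-in for real image data:

```python
import base64

# Hedged sketch of an input_image item. png_bytes is a placeholder for real
# image data; the image_url field carries a base64 data URL.
png_bytes = b"\x89PNG\r\n\x1a\n"  # stand-in for real PNG bytes
input_image = {
    "type": "input_image",
    "detail": "auto",  # high, low, or auto (default auto)
    "image_url": "data:image/png;base64,"
    + base64.b64encode(png_bytes).decode("ascii"),
}
```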
This component uses the property type to discriminate between different types:
This component can be one of the following:
A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | InputMessageContentList | A list of one or many input items to the model, containing different content types. | Yes | |
role | enum | The role of the message input. One of user, system, or developer. Possible values: user, system, developer | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the message input. Always set to message. Possible values: message | No | |
A list of one or many input items to the model, containing different content types.
No properties defined for this component.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | InputMessageContentList | A list of one or many input items to the model, containing different content types. | Yes | |
id | string | The unique ID of the message input. | Yes | |
role | enum | The role of the message input. One of user, system, or developer. Possible values: user, system, developer | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the message input. Always set to message. Possible values: message | No | |
A text input to the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
text | string | The text input to the model. | Yes | |
type | enum | The type of the input item. Always input_text. Possible values: input_text | Yes | |
Content item used to generate a response.
This component can be one of the following:
An internal identifier for an item to reference.
Name | Type | Description | Required | Default |
---|---|---|---|---|
id | string | The ID of the item to reference. | Yes | |
type | enum | The type of item to reference. Always item_reference. Possible values: item_reference | Yes | |
Content item used to generate a response.
This component can be one of the following:
A collection of keypresses the model would like to perform.
Name | Type | Description | Required | Default |
---|---|---|---|---|
keys | array | The combination of keys the model is requesting to be pressed. This is an array of strings, each representing a key. | Yes | |
type | enum | Specifies the event type. For a keypress action, this property is always set to keypress. Possible values: keypress | Yes | |
Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
No properties defined for this component.
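The documented limits (at most 16 pairs, 64-character keys, 512-character string values) can be checked client-side before sending a request; this helper is an illustrative sketch, not part of any SDK:

```python
def validate_metadata(metadata: dict) -> None:
    """Check the documented Metadata limits: at most 16 key-value pairs,
    string keys <= 64 chars, string values <= 512 chars."""
    if len(metadata) > 16:
        raise ValueError("metadata supports at most 16 key-value pairs")
    for key, value in metadata.items():
        if not isinstance(key, str) or len(key) > 64:
            raise ValueError(f"invalid metadata key: {key!r}")
        if not isinstance(value, str) or len(value) > 512:
            raise ValueError(f"invalid metadata value for key {key!r}")

# A valid metadata object passes silently.
validate_metadata({"ticket_id": "T-1024", "source": "docs-example"})
```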
Name | Type | Description | Required | Default |
---|---|---|---|---|
metadata | Metadata | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters. | No | |
model | string | Model used to generate the responses. | No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
user | string | A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. | No |
A mouse move action.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | Specifies the event type. For a move action, this property is always set to move. Possible values: move | Yes | |
x | integer | The x-coordinate to move to. | Yes | |
y | integer | The y-coordinate to move to. | Yes | |
An audio output from the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | string | Base64-encoded audio data from the model. | Yes | |
transcript | string | The transcript of the audio data from the model. | Yes | |
type | enum | The type of the output audio. Always output_audio. Possible values: output_audio | Yes | |
This component can be one of the following:
Name | Type | Description | Required | Default |
---|---|---|---|---|
action | ComputerAction | | Yes | |
arguments | string | A JSON string of the arguments to pass to the function. | Yes | |
call_id | string | An identifier used when responding to the tool call with output. | Yes | |
content | array | Reasoning text contents. | Yes | |
id | string | The unique identifier of the reasoning content. | Yes | |
name | string | The name of the function to run. | Yes | |
pending_safety_checks | array | The pending safety checks for the computer call. | Yes | |
queries | array | The queries used to search for files. | Yes | |
results | array | The results of the file search tool call. | No | |
role | enum | The role of the output message. Always assistant. Possible values: assistant | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | Yes | |
type | enum | The type of the object. Always reasoning. Possible values: reasoning | Yes | |
An output message from the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | array | The content of the output message. | Yes | |
id | string | The unique ID of the output message. | Yes | |
role | enum | The role of the output message. Always assistant. Possible values: assistant | Yes | |
status | enum | The status of the message input. One of in_progress, completed, or incomplete. Populated when input items are returned via API. Possible values: in_progress, completed, incomplete | Yes | |
type | enum | The type of the output message. Always message. Possible values: message | Yes | |
A text output from the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
annotations | array | The annotations of the text output. | Yes | |
text | string | The text output from the model. | Yes | |
type | enum | The type of the output text. Always output_text. Possible values: output_text | Yes | |
o-series models only
Configuration options for reasoning models.
Name | Type | Description | Required | Default |
---|---|---|---|---|
effort | ReasoningEffort | o-series models only. Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. | Yes | medium |
generate_summary | enum | o-series models only. A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of concise or detailed. Possible values: concise, detailed | No | |
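A sketch of a reasoning configuration object using the two fields above; both values are illustrative choices:

```python
# Hedged sketch of a Reasoning configuration for o-series models,
# using only the fields documented above.
reasoning_config = {
    "effort": "high",               # low, medium, or high (default medium)
    "generate_summary": "concise",  # optional: concise or detailed
}
```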
A description of the chain of thought used by a reasoning model while generating a response.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content | array | Reasoning text contents. | Yes | |
id | string | The unique identifier of the reasoning content. | Yes | |
status | enum | The status of the item. One of in_progress, completed, or incomplete. Populated when items are returned via API. Possible values: in_progress, completed, incomplete | No | |
type | enum | The type of the object. Always reasoning. Possible values: reasoning | Yes | |
A refusal from the model.
Name | Type | Description | Required | Default |
---|---|---|---|---|
refusal | string | The refusal explanation from the model. | Yes | |
type | enum | The type of the refusal. Always refusal. Possible values: refusal | Yes | |
Name | Type | Description | Required | Default |
---|---|---|---|---|
created_at | number | Unix timestamp (in seconds) of when this Response was created. |
Yes | |
error | ResponseError | An error object returned when the model fails to generate a Response. |
Yes | |
id | string | Unique identifier for this Response. |
Yes | |
incomplete_details | object | Details about why the response is incomplete. |
Yes | |
└─ reason | enum | The reason why the response is incomplete. Possible values: max_output_tokens , content_filter |
No | |
instructions | string | Inserts a system (or developer) message as the first item in the model's context. When using along with previous_response_id , the instructions from a previousresponse will be not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses. |
Yes | |
max_output_tokens | integer | An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens. |
No | |
metadata | Metadata | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters. |
Yes | |
model | string | Model used to generate the responses. | Yes | |
object | enum | The object type of this resource - always set to response .Possible values: response |
Yes | |
output | array | An array of content items generated by the model. - The length and order of items in the output array is dependenton the model's response. - Rather than accessing the first item in the output array and assuming it's an assistant message with the content generated bythe model, you might consider using the output_text property wheresupported in SDKs. |
Yes | |
output_text | string | SDK-only convenience property that contains the aggregated text output from all output_text items in the output array, if any are present. Supported in the Python and JavaScript SDKs. |
No | |
parallel_tool_calls | boolean | Whether to allow the model to run tool calls in parallel. |
Yes | True |
previous_response_id | string | The unique ID of the previous response to the model. Use this to create multi-turn conversations. Learn more about Conversation state. |
No | |
reasoning | Reasoning | o-series models only Configuration options for reasoning models. |
No | |
status | enum | The status of the response generation. One of completed , failed , in_progress , or incomplete .Possible values: completed , failed , in_progress , incomplete |
No | |
temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. |
Yes | 1 |
text | object | Configuration options for a text response from the model. Can be plain text or structured JSON data. Learn more: - Text inputs and outputs - Structured Outputs |
No | |
└─ format | TextResponseFormatConfiguration | An object specifying the format that the model must output. Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is { "type": "text" } with no additional options. Not recommended for gpt-4o and newer models: Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it. |
No | |
tool_choice | ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction | How the model should select which tool (or tools) to use when generating a response. See the tools parameter to see how to specify which tools the model can call. |
Yes | |
tools | array | An array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter. The two categories of tools you can provide the model are: - Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities, such as file search. - Function calls (custom tools): Functions that are defined by you, enabling the model to call your own code. Learn more about function calling. |
Yes | |
top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. |
Yes | 1 |
truncation | enum | The truncation strategy to use for the model response. - auto : If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. - disabled (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: auto , disabled |
No | |
usage | ResponseUsage | Represents token usage details including input tokens, output tokens, a breakdown of output tokens, and the total tokens used. |
No | |
user | string | A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. | No |
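Putting several of the request-level fields in this table together, the following is a minimal sketch (not a definitive request) of a Responses API request body. The model name, instructions, and metadata values are hypothetical examples; only parameters documented above are used.

```python
import json

# A sketch of a Responses API request body using parameters from the
# table above. Model name, instructions, and metadata are made up.
request_body = {
    "model": "gpt-4o",                             # model used to generate the response
    "instructions": "You are a terse assistant.",  # inserted as the first context item
    "max_output_tokens": 512,                      # upper bound, incl. reasoning tokens
    "temperature": 0.2,                            # 0-2; lower = more deterministic
    "parallel_tool_calls": True,                   # default is True
    "truncation": "disabled",                      # "auto" or "disabled" (default)
    "metadata": {"ticket": "12345"},               # up to 16 string key-value pairs
}

# Basic client-side sanity checks mirroring the documented constraints.
assert 0 <= request_body["temperature"] <= 2
assert len(request_body["metadata"]) <= 16
assert all(len(k) <= 64 and len(v) <= 512 for k, v in request_body["metadata"].items())

print(json.dumps(request_body, indent=2))
```

Note that temperature and top_p are alternatives: the table recommends altering one or the other, not both.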
Emitted when there is a partial audio response.
Name | Type | Description | Required | Default |
---|---|---|---|---|
delta | string | A chunk of Base64 encoded response audio bytes. |
Yes | |
type | enum | The type of the event. Always response.audio.delta .Possible values: response.audio.delta |
Yes |
Emitted when the audio response is complete.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | The type of the event. Always response.audio.done .Possible values: response.audio.done |
Yes |
Emitted when there is a partial transcript of audio.
Name | Type | Description | Required | Default |
---|---|---|---|---|
delta | string | The partial transcript of the audio response. |
Yes | |
type | enum | The type of the event. Always response.audio.transcript.delta .Possible values: response.audio.transcript.delta |
Yes |
Emitted when the full audio transcript is completed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | The type of the event. Always response.audio.transcript.done .Possible values: response.audio.transcript.done |
Yes |
Emitted when a partial code snippet is added by the code interpreter.
Name | Type | Description | Required | Default |
---|---|---|---|---|
delta | string | The partial code snippet added by the code interpreter. |
Yes | |
output_index | integer | The index of the output item for which the code interpreter call is in progress. |
Yes | |
type | enum | The type of the event. Always response.code_interpreter_call.code.delta .Possible values: response.code_interpreter_call.code.delta |
Yes |
Emitted when code snippet output is finalized by the code interpreter.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | string | The final code snippet output by the code interpreter. |
Yes | |
output_index | integer | The index of the output item for which the code interpreter call is in progress. |
Yes | |
type | enum | The type of the event. Always response.code_interpreter_call.code.done .Possible values: response.code_interpreter_call.code.done |
Yes |
Emitted when the code interpreter call is completed.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code_interpreter_call | CodeInterpreterToolCall | A tool call to run code. |
Yes | |
output_index | integer | The index of the output item for which the code interpreter call is in progress. |
Yes | |
type | enum | The type of the event. Always response.code_interpreter_call.completed .Possible values: response.code_interpreter_call.completed |
Yes |
Emitted when a code interpreter call is in progress.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code_interpreter_call | CodeInterpreterToolCall | A tool call to run code. |
Yes | |
output_index | integer | The index of the output item for which the code interpreter call is in progress. |
Yes | |
type | enum | The type of the event. Always response.code_interpreter_call.in_progress .Possible values: response.code_interpreter_call.in_progress |
Yes |
Emitted when the code interpreter is actively interpreting the code snippet.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code_interpreter_call | CodeInterpreterToolCall | A tool call to run code. |
Yes | |
output_index | integer | The index of the output item for which the code interpreter call is in progress. |
Yes | |
type | enum | The type of the event. Always response.code_interpreter_call.interpreting .Possible values: response.code_interpreter_call.interpreting |
Yes |
Emitted when the model response is complete.
Name | Type | Description | Required | Default |
---|---|---|---|---|
response | response | Yes | ||
type | enum | The type of the event. Always response.completed .Possible values: response.completed |
Yes |
Emitted when a new content part is added.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_index | integer | The index of the content part that was added. |
Yes | |
item_id | string | The ID of the output item that the content part was added to. |
Yes | |
output_index | integer | The index of the output item that the content part was added to. |
Yes | |
part | OutputContent | Yes | ||
type | enum | The type of the event. Always response.content_part.added .Possible values: response.content_part.added |
Yes |
Emitted when a content part is done.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_index | integer | The index of the content part that is done. |
Yes | |
item_id | string | The ID of the output item that the content part was added to. |
Yes | |
output_index | integer | The index of the output item that the content part was added to. |
Yes | |
part | OutputContent | Yes | ||
type | enum | The type of the event. Always response.content_part.done .Possible values: response.content_part.done |
Yes |
An event that is emitted when a response is created.
Name | Type | Description | Required | Default |
---|---|---|---|---|
response | response | Yes | ||
type | enum | The type of the event. Always response.created .Possible values: response.created |
Yes |
An error object returned when the model fails to generate a Response.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | ResponseErrorCode | The error code for the response. |
Yes | |
message | string | A human-readable description of the error. |
Yes |
The error code for the response.
Property | Value |
---|---|
Description | The error code for the response. |
Type | string |
Values | server_error rate_limit_exceeded invalid_prompt vector_store_timeout invalid_image invalid_image_format invalid_base64_image invalid_image_url image_too_large image_too_small image_parse_error image_content_policy_violation invalid_image_mode image_file_too_large unsupported_image_media_type empty_image_file failed_to_download_image image_file_not_found |
Emitted when an error occurs.
Name | Type | Description | Required | Default |
---|---|---|---|---|
code | string | The error code. |
Yes | |
message | string | The error message. |
Yes | |
param | string | The error parameter. |
Yes | |
type | enum | The type of the event. Always error .Possible values: error |
Yes |
An event that is emitted when a response fails.
Name | Type | Description | Required | Default |
---|---|---|---|---|
response | response | Yes | ||
type | enum | The type of the event. Always response.failed .Possible values: response.failed |
Yes |
Emitted when a file search call is completed (results found).
Name | Type | Description | Required | Default |
---|---|---|---|---|
item_id | string | The ID of the output item for which the file search call was initiated. |
Yes | |
output_index | integer | The index of the output item for which the file search call was initiated. |
Yes | |
type | enum | The type of the event. Always response.file_search_call.completed .Possible values: response.file_search_call.completed |
Yes |
Emitted when a file search call is initiated.
Name | Type | Description | Required | Default |
---|---|---|---|---|
item_id | string | The ID of the output item for which the file search call was initiated. |
Yes | |
output_index | integer | The index of the output item for which the file search call was initiated. |
Yes | |
type | enum | The type of the event. Always response.file_search_call.in_progress .Possible values: response.file_search_call.in_progress |
Yes |
Emitted when a file search is currently searching.
Name | Type | Description | Required | Default |
---|---|---|---|---|
item_id | string | The ID of the output item for which the file search call was initiated. |
Yes | |
output_index | integer | The index of the output item for which the file search call is searching. |
Yes | |
type | enum | The type of the event. Always response.file_search_call.searching .Possible values: response.file_search_call.searching |
Yes |
Emitted when there is a partial function-call arguments delta.
Name | Type | Description | Required | Default |
---|---|---|---|---|
delta | string | The function-call arguments delta that is added. |
Yes | |
item_id | string | The ID of the output item that the function-call arguments delta is added to. |
Yes | |
output_index | integer | The index of the output item that the function-call arguments delta is added to. |
Yes | |
type | enum | The type of the event. Always response.function_call_arguments.delta .Possible values: response.function_call_arguments.delta |
Yes |
Emitted when function-call arguments are finalized.
Name | Type | Description | Required | Default |
---|---|---|---|---|
arguments | string | The function-call arguments. | Yes | |
item_id | string | The ID of the item. | Yes | |
output_index | integer | The index of the output item. | Yes | |
type | enum | The type of the event. Always response.function_call_arguments.done .Possible values: response.function_call_arguments.done |
Yes |
Emitted when the response is in progress.
Name | Type | Description | Required | Default |
---|---|---|---|---|
response | response | Yes | ||
type | enum | The type of the event. Always response.in_progress .Possible values: response.in_progress |
Yes |
An event that is emitted when a response finishes as incomplete.
Name | Type | Description | Required | Default |
---|---|---|---|---|
response | response | Yes | ||
type | enum | The type of the event. Always response.incomplete .Possible values: response.incomplete |
Yes |
A list of Response items.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | A list of items used to generate this response. | Yes | |
first_id | string | The ID of the first item in the list. | Yes | |
has_more | boolean | Whether there are more items available. | Yes | |
last_id | string | The ID of the last item in the list. | Yes | |
object | enum | The type of object returned, must be list .Possible values: list |
Yes |
Output types that you would like the model to generate. Most models are capable of generating text, which is the default: ["text"]. The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use: ["text", "audio"].
No properties defined for this component.
Output types that you would like the model to generate. Most models are capable of generating text, which is the default: ["text"].
This API will soon support other output modalities, including audio and images.
No properties defined for this component.
Emitted when a new output item is added.
Name | Type | Description | Required | Default |
---|---|---|---|---|
item | OutputItem | Yes | ||
output_index | integer | The index of the output item that was added. |
Yes | |
type | enum | The type of the event. Always response.output_item.added .Possible values: response.output_item.added |
Yes |
Emitted when an output item is marked done.
Name | Type | Description | Required | Default |
---|---|---|---|---|
item | OutputItem | Yes | ||
output_index | integer | The index of the output item that was marked done. |
Yes | |
type | enum | The type of the event. Always response.output_item.done .Possible values: response.output_item.done |
Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
instructions | string | Inserts a system (or developer) message as the first item in the model's context. When used along with previous_response_id , the instructions from a previous response will not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses. |
No | |
max_output_tokens | integer | An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens. |
No | |
previous_response_id | string | The unique ID of the previous response to the model. Use this to create multi-turn conversations. Learn more about Conversation state. |
No | |
reasoning | Reasoning | o-series models only Configuration options for reasoning models. |
No | |
text | object | Configuration options for a text response from the model. Can be plain text or structured JSON data. Learn more: - Text inputs and outputs - Structured Outputs |
No | |
└─ format | TextResponseFormatConfiguration | An object specifying the format that the model must output. Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is { "type": "text" } with no additional options. Not recommended for gpt-4o and newer models: Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it. |
No | |
tool_choice | ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction | How the model should select which tool (or tools) to use when generating a response. See the tools parameter to see how to specify which tools the model can call. |
No | |
tools | array | An array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter. The two categories of tools you can provide the model are: - Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities, such as file search. - Function calls (custom tools): Functions that are defined by you, enabling the model to call your own code. Learn more about function calling. |
No | |
truncation | enum | The truncation strategy to use for the model response. - auto : If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. - disabled (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: auto , disabled |
No |
Emitted when there is a partial refusal text.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_index | integer | The index of the content part that the refusal text is added to. |
Yes | |
delta | string | The refusal text that is added. |
Yes | |
item_id | string | The ID of the output item that the refusal text is added to. |
Yes | |
output_index | integer | The index of the output item that the refusal text is added to. |
Yes | |
type | enum | The type of the event. Always response.refusal.delta .Possible values: response.refusal.delta |
Yes |
Emitted when refusal text is finalized.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_index | integer | The index of the content part in which the refusal text is finalized. |
Yes | |
item_id | string | The ID of the output item in which the refusal text is finalized. |
Yes | |
output_index | integer | The index of the output item in which the refusal text is finalized. |
Yes | |
refusal | string | The refusal text that is finalized. |
Yes | |
type | enum | The type of the event. Always response.refusal.done .Possible values: response.refusal.done |
Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
annotation | Annotation | Yes | ||
annotation_index | integer | The index of the annotation that was added. |
Yes | |
arguments | string | The function-call arguments. | Yes | |
code | string | The error code. |
Yes | |
code_interpreter_call | CodeInterpreterToolCall | A tool call to run code. |
Yes | |
content_index | integer | The index of the content part that the text content is finalized. |
Yes | |
delta | string | The text delta that was added. |
Yes | |
item | OutputItem | The output item that was marked done. |
Yes | |
item_id | string | The ID of the output item that the text content is finalized. |
Yes | |
message | string | The error message. |
Yes | |
output_index | integer | The index of the output item that the text content is finalized. |
Yes | |
param | string | The error parameter. |
Yes | |
part | OutputContent | The content part that is done. |
Yes | |
refusal | string | The refusal text that is finalized. |
Yes | |
response | response | The response that was incomplete. |
Yes | |
text | string | The text content that is finalized. |
Yes | |
type | enum | The type of the event. Always response.output_text.done .Possible values: response.output_text.done |
Yes |
Emitted when a text annotation is added.
Name | Type | Description | Required | Default |
---|---|---|---|---|
annotation | Annotation | Yes | ||
annotation_index | integer | The index of the annotation that was added. |
Yes | |
content_index | integer | The index of the content part that the text annotation was added to. |
Yes | |
item_id | string | The ID of the output item that the text annotation was added to. |
Yes | |
output_index | integer | The index of the output item that the text annotation was added to. |
Yes | |
type | enum | The type of the event. Always response.output_text.annotation.added .Possible values: response.output_text.annotation.added |
Yes |
Emitted when there is an additional text delta.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_index | integer | The index of the content part that the text delta was added to. |
Yes | |
delta | string | The text delta that was added. |
Yes | |
item_id | string | The ID of the output item that the text delta was added to. |
Yes | |
output_index | integer | The index of the output item that the text delta was added to. |
Yes | |
type | enum | The type of the event. Always response.output_text.delta .Possible values: response.output_text.delta |
Yes |
Emitted when text content is finalized.
Name | Type | Description | Required | Default |
---|---|---|---|---|
content_index | integer | The index of the content part in which the text content is finalized. |
Yes | |
item_id | string | The ID of the output item in which the text content is finalized. |
Yes | |
output_index | integer | The index of the output item in which the text content is finalized. |
Yes | |
text | string | The text content that is finalized. |
Yes | |
type | enum | The type of the event. Always response.output_text.done .Possible values: response.output_text.done |
Yes |
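A consumer of these events typically concatenates response.output_text.delta payloads until the matching response.output_text.done event arrives. A minimal sketch over already-parsed event dicts (the sample events below are fabricated for illustration):

```python
# Accumulate streamed text deltas per (item_id, content_index), then
# compare against the finalized text carried by the done event.
events = [
    {"type": "response.output_text.delta", "item_id": "msg_1",
     "output_index": 0, "content_index": 0, "delta": "Hello, "},
    {"type": "response.output_text.delta", "item_id": "msg_1",
     "output_index": 0, "content_index": 0, "delta": "world!"},
    {"type": "response.output_text.done", "item_id": "msg_1",
     "output_index": 0, "content_index": 0, "text": "Hello, world!"},
]

buffers: dict[tuple[str, int], str] = {}
final_texts: dict[tuple[str, int], str] = {}

for event in events:
    key = (event["item_id"], event["content_index"])
    if event["type"] == "response.output_text.delta":
        buffers[key] = buffers.get(key, "") + event["delta"]
    elif event["type"] == "response.output_text.done":
        final_texts[key] = event["text"]
        # The accumulated deltas should match the finalized text.
        assert buffers[key] == event["text"]

print(final_texts[("msg_1", 0)])  # → Hello, world!
```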
Represents token usage details including input tokens, output tokens, a breakdown of output tokens, and the total tokens used.
Name | Type | Description | Required | Default |
---|---|---|---|---|
input_tokens | integer | The number of input tokens. | Yes | |
output_tokens | integer | The number of output tokens. | Yes | |
output_tokens_details | object | A detailed breakdown of the output tokens. | Yes | |
└─ reasoning_tokens | integer | The number of reasoning tokens. | No | |
total_tokens | integer | The total number of tokens used. | Yes |
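As a sanity check on a ResponseUsage payload, the totals can be cross-verified client-side. The token counts below are made up; the check that reasoning tokens fit within output tokens follows from max_output_tokens being documented as including both visible output and reasoning tokens.

```python
# A fabricated ResponseUsage payload matching the table above.
usage = {
    "input_tokens": 120,
    "output_tokens": 48,
    "output_tokens_details": {"reasoning_tokens": 16},
    "total_tokens": 168,
}

# total_tokens should equal input plus output tokens, and reasoning
# tokens count toward (are not additional to) output_tokens.
assert usage["total_tokens"] == usage["input_tokens"] + usage["output_tokens"]
assert usage["output_tokens_details"]["reasoning_tokens"] <= usage["output_tokens"]

billable = usage["total_tokens"]
print(billable)  # → 168
```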
A screenshot action.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | Specifies the event type. For a screenshot action, this property is always set to screenshot .Possible values: screenshot |
Yes |
A scroll action.
Name | Type | Description | Required | Default |
---|---|---|---|---|
scroll_x | integer | The horizontal scroll distance. |
Yes | |
scroll_y | integer | The vertical scroll distance. |
Yes | |
type | enum | Specifies the event type. For a scroll action, this property is always set to scroll .Possible values: scroll |
Yes | |
x | integer | The x-coordinate where the scroll occurred. |
Yes | |
y | integer | The y-coordinate where the scroll occurred. |
Yes |
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
This component can be one of the following:
An object specifying the format that the model must output. Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is { "type": "text" } with no additional options. Not recommended for gpt-4o and newer models: setting { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
This component can be one of the following:
JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
Name | Type | Description | Required | Default |
---|---|---|---|---|
description | string | A description of what the response format is for, used by the model to determine how to respond in the format. |
No | |
name | string | The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. |
No | |
schema | ResponseFormatJsonSchemaSchema | The schema for the response format, described as a JSON Schema object. | Yes | |
strict | boolean | Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported whenstrict is true . |
No | False |
type | enum | The type of response format being defined. Always json_schema .Possible values: json_schema |
Yes |
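To illustrate the json_schema variant described above, here is a sketch of a format configuration using only the fields in this table. The schema itself (a weather report) is a hypothetical example:

```python
import re

# A sketch of a json_schema response-format object per the table above.
# The "weather_report" schema is made up for illustration.
text_format = {
    "type": "json_schema",               # enables Structured Outputs
    "name": "weather_report",            # a-z, A-Z, 0-9, underscores/dashes, <= 64 chars
    "description": "A structured weather summary.",
    "strict": True,                      # enforce exact schema adherence
    "schema": {                          # described as a JSON Schema object
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "temp_c": {"type": "number"},
        },
        "required": ["city", "temp_c"],
        "additionalProperties": False,
    },
}

# Mirror the documented naming constraint client-side.
assert re.fullmatch(r"[A-Za-z0-9_-]{1,64}", text_format["name"])
assert text_format["type"] == "json_schema"
```

Recall that when strict is true, only a subset of JSON Schema is supported.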
This component can be one of the following:
Use this option to force the model to call a specific function.
Name | Type | Description | Required | Default |
---|---|---|---|---|
name | string | The name of the function to call. | Yes | |
type | enum | For function calling, the type is always function .Possible values: function |
Yes |
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.
Property | Value |
---|---|
Description | Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. |
Type | string |
Values | none auto required |
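The three tool_choice shapes described above (a mode string, a hosted-tool type, or a forced function) can be sketched as follows. The function name is a hypothetical example:

```python
# Mode string (ToolChoiceOptions): let the model decide, forbid tools,
# or require that at least one tool be called.
tool_choice_auto = "auto"        # or "none" / "required"

# Hosted tool type (ToolChoiceTypes): force a built-in tool.
tool_choice_builtin = {"type": "file_search"}

# Forced function (ToolChoiceFunction): force one specific custom
# function; "get_weather" is a made-up function name.
tool_choice_function = {"type": "function", "name": "get_weather"}

assert tool_choice_auto in ("none", "auto", "required")
assert tool_choice_function["type"] == "function"
```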
Indicates that the model should use a built-in tool to generate a response.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | The type of hosted tool the model should use. Allowed values are: - file_search - computer_use_preview Possible values: file_search , computer_use_preview |
Yes |
An action to type in text.
Name | Type | Description | Required | Default |
---|---|---|---|---|
text | string | The text to type. |
Yes | |
type | enum | Specifies the event type. For a type action, this property is always set to type .Possible values: type |
Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
attributes | VectorStoreFileAttributes | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers. |
Yes |
A citation for a web resource used to generate a model response.
Name | Type | Description | Required | Default |
---|---|---|---|---|
end_index | integer | The index of the last character of the URL citation in the message. |
Yes | |
start_index | integer | The index of the first character of the URL citation in the message. |
Yes | |
title | string | The title of the web resource. |
Yes | |
type | enum | The type of the URL citation. Always url_citation .Possible values: url_citation |
Yes | |
url | string | The URL of the web resource. |
Yes |
Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.
No properties defined for this component.
Represents the parsed content of a vector store file.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | Parsed content of the file. | Yes | |
has_more | boolean | Indicates if there are more content pages to fetch. | Yes | |
next_page | string | The token for the next page, if any. | Yes | |
object | enum | The object type, which is always vector_store.file_content.page Possible values: vector_store.file_content.page |
Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
filters | ComparisonFilter or CompoundFilter | A filter to apply based on file attributes. | No | |
max_num_results | integer | The maximum number of results to return. This number should be between 1 and 50 inclusive. | No | 10 |
query | string or array | A query string for a search. | Yes | |
ranking_options | object | Ranking options for search. | No | |
└─ ranker | enum | Possible values: auto , default-2024-11-15 |
No | |
└─ score_threshold | number | No | 0 | |
rewrite_query | boolean | Whether to rewrite the natural language query for vector search. | No | False |
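The search parameters above can be combined into a request body like the following sketch. The query text, the filter's attribute key/value, and the score threshold are hypothetical; the filter uses a simple equality ComparisonFilter shape as an assumed example:

```python
# A sketch of a vector store search request per the table above.
# Query, filter key/value, and threshold are made-up examples.
search_request = {
    "query": "quarterly revenue highlights",   # string or array of strings
    "max_num_results": 5,                      # must be between 1 and 50 inclusive
    "rewrite_query": False,                    # default
    "ranking_options": {
        "ranker": "auto",                      # or "default-2024-11-15"
        "score_threshold": 0.3,                # drop low-similarity results
    },
    "filters": {                               # assumed ComparisonFilter shape
        "type": "eq",
        "key": "department",
        "value": "finance",
    },
}

assert 1 <= search_request["max_num_results"] <= 50
```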
Name | Type | Description | Required | Default |
---|---|---|---|---|
text | string | The text content returned from search. | Yes | |
type | enum | The type of content. Possible values: text |
Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
attributes | VectorStoreFileAttributes | Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers. |
Yes | |
content | array | Content chunks from the file. | Yes | |
file_id | string | The ID of the vector store file. | Yes | |
filename | string | The name of the vector store file. | Yes | |
score | number | The similarity score for the result. | Yes |
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | array | The list of search result items. | Yes | |
has_more | boolean | Indicates if there are more results to fetch. | Yes | |
next_page | string | The token for the next page, if any. | Yes | |
object | enum | The object type, which is always vector_store.search_results.page Possible values: vector_store.search_results.page |
Yes | |
search_query | array | Yes |
A wait action.
Name | Type | Description | Required | Default |
---|---|---|---|---|
type | enum | Specifies the event type. For a wait action, this property is always set to wait .Possible values: wait |
Yes |
o-series models only. Constrains effort on reasoning for reasoning models. Currently supported values are low , medium , and high . Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Property | Value |
---|---|
Description | o-series models only Constrains effort on reasoning for reasoning models. Currently supported values are low , medium , and high . Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. |
Type | string |
Default | medium |
Values | low medium high |
Occurs when an error is encountered. This can happen due to an internal server error or a timeout.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | error | Yes | ||
event | string | Yes |
event Enum: ErrorEventEnum
Value | Description |
---|---|
error |
Occurs when a stream ends.
Name | Type | Description | Required | Default |
---|---|---|---|---|
data | string | Yes | ||
event | string | Yes |
event Enum: DoneEventEnum
Value | Description |
---|---|
done |
data Enum: DoneEventDataEnum
Value | Description |
---|---|
[DONE] |
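The done event's [DONE] sentinel is how a server-sent events consumer knows the stream is over: it must be treated as a terminator, not parsed as JSON. A minimal sketch of processing raw SSE lines (the sample stream below is fabricated):

```python
import json

# A fabricated server-sent events stream: event lines, data lines,
# and blank-line separators, ending with the [DONE] sentinel.
raw_stream = [
    'event: response.output_text.delta',
    'data: {"type": "response.output_text.delta", "delta": "Hi"}',
    '',
    'event: done',
    'data: [DONE]',
    '',
]

parsed = []
for line in raw_stream:
    if not line.startswith("data: "):
        continue
    payload = line[len("data: "):]
    if payload == "[DONE]":        # stream-end sentinel, not JSON
        break
    parsed.append(json.loads(payload))

print(len(parsed))  # → 1
```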
Learn about models and fine-tuning with the REST API, and learn more about the underlying models that power Azure OpenAI.