Azure OpenAI Service REST API preview reference

This article provides details on the inference REST API endpoints for Azure OpenAI.

API specs

Managing and interacting with Azure OpenAI models and resources is divided across three primary API surfaces:

  • Control plane
  • Data plane - authoring
  • Data plane - inference

Each API surface/specification encapsulates a different set of Azure OpenAI capabilities. Each API has its own unique set of preview and stable/generally available (GA) API releases. Preview releases currently tend to follow a monthly cadence.

| API | Latest preview release | Latest GA release | Specifications | Description |
| --- | --- | --- | --- | --- |
| Control plane | 2024-06-01-preview | 2024-10-01 | Spec files | Azure OpenAI shares a common control plane with all other Azure AI Services. The control plane API is used for things like creating Azure OpenAI resources, model deployment, and other higher-level resource management tasks. The control plane also governs what is possible to do with capabilities like Azure Resource Manager, Bicep, Terraform, and the Azure CLI. |
| Data plane - authoring | 2025-03-01-preview | 2024-10-21 | Spec files | The data plane authoring API controls fine-tuning, file upload, ingestion jobs, batch, and certain model-level queries. |
| Data plane - inference | 2025-03-01-preview | 2024-10-21 | Spec files | The data plane inference API provides the inference capabilities/endpoints for features like completions, chat completions, embeddings, speech/Whisper, on your data, DALL-E, assistants, and so on. |

Authentication

Azure OpenAI provides two methods for authentication. You can use either API Keys or Microsoft Entra ID.

  • API Key authentication: For this type of authentication, all API requests must include the API Key in the api-key HTTP header. The Quickstart provides guidance for how to make calls with this type of authentication.

  • Microsoft Entra ID authentication: You can authenticate an API call using a Microsoft Entra token. Authentication tokens are included in a request as the Authorization header. The token provided must be preceded by Bearer, for example Bearer YOUR_AUTH_TOKEN. You can read our how-to guide on authenticating with Microsoft Entra ID.
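As a minimal sketch, the two header shapes can be built like this (the helper names are illustrative, and the key/token values are placeholders, not real credentials):

```python
def api_key_headers(api_key: str) -> dict:
    # API key authentication: the key goes in the api-key HTTP header.
    return {"api-key": api_key, "Content-Type": "application/json"}

def entra_id_headers(auth_token: str) -> dict:
    # Microsoft Entra ID authentication: the token goes in the Authorization
    # header and must be preceded by "Bearer".
    return {"Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json"}
```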

REST API versioning

The service APIs are versioned using the api-version query parameter. All versions follow the YYYY-MM-DD date structure. For example:

POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01
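For illustration, a hypothetical helper that composes such a URL from the resource name, deployment name, and API version (all three values are placeholders):

```python
def chat_completions_url(resource: str, deployment: str, api_version: str) -> str:
    # Every data plane call carries the api-version query parameter,
    # which follows the YYYY-MM-DD date structure.
    return (f"https://{resource}.openai.azure.com/openai/deployments/"
            f"{deployment}/chat/completions?api-version={api_version}")
```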

Data plane inference

The rest of the article covers the latest preview release of the Azure OpenAI data plane inference specification, 2025-03-01-preview. This article includes documentation for the latest preview capabilities like assistants, threads, and vector stores.

If you're looking for documentation on the latest GA API release, refer to the latest GA data plane inference API.

Completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-03-01-preview

Creates a completion for the provided prompt, parameters, and chosen model.

URI Parameters

| Name | In | Required | Type | Description |
| --- | --- | --- | --- | --- |
| endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
| deployment-id | path | Yes | string | The deployment ID of the model that was deployed. |
| api-version | query | Yes | string | API version. |

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

| Name | Required | Type | Description |
| --- | --- | --- | --- |
| Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}. To generate an auth token using the Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com. Type: oauth2. Authorization URL: https://login.microsoftonline.com/common/oauth2/v2.0/authorize. Scope: https://cognitiveservices.azure.com/.default |
| api-key | True | string | Provide the Azure OpenAI API key here. |

Request Body

Content-Type: application/json

| Name | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| best_of | integer | Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results cannot be streamed. When used with n, best_of controls the number of candidate completions and n specifies how many to return; best_of must be greater than n. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop. | No | 1 |
| echo | boolean | Echo back the prompt in addition to the completion. | No | False |
| frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | No | 0 |
| logit_bias | object | Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass {"50256": -100} to prevent the <\|endoftext\|> token from being generated. | No | None |
| logprobs | integer | Include the log probabilities on the logprobs most likely output tokens, as well as the chosen tokens. For example, if logprobs is 5, the API will return a list of the 5 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response. The maximum value for logprobs is 5. | No | None |
| max_tokens | integer | The maximum number of tokens that can be generated in the completion. The token count of your prompt plus max_tokens cannot exceed the model's context length. | No | 16 |
| n | integer | How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop. | No | 1 |
| presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | No | 0 |
| prompt | string or array | The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <\|endoftext\|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document. | Yes | |
| seed | integer | If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed; refer to the system_fingerprint response parameter to monitor changes in the backend. | No | |
| stop | string or array | Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. | No | |
| stream | boolean | Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. | No | False |
| suffix | string | The suffix that comes after a completion of inserted text. This parameter is only supported for gpt-3.5-turbo-instruct. | No | None |
| temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
| top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
| user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse. | No | |

Responses

Status Code: 200

Description: OK

| Content-Type | Type | Description |
| --- | --- | --- |
| application/json | createCompletionResponse | |

Status Code: default

Description: Service unavailable

| Content-Type | Type | Description |
| --- | --- | --- |
| application/json | errorResponse | |

Examples

Example

Creates a completion for the provided prompt, parameters, and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-03-01-preview

{
 "prompt": [
  "tell me a joke about mango"
 ],
 "max_tokens": 32,
 "temperature": 1.0,
 "n": 1
}

Responses: Status Code: 200

{
  "body": {
    "id": "cmpl-7QmVI15qgYVllxK0FtxVGG6ywfzaq",
    "created": 1686617332,
    "choices": [
      {
        "text": "es\n\nWhat do you call a mango who's in charge?\n\nThe head mango.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": null
      }
    ],
    "usage": {
      "completion_tokens": 20,
      "prompt_tokens": 6,
      "total_tokens": 26
    }
  }
}
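As an illustrative sketch (not an official client), the request above can be assembled like this. The helper name and placeholder values are hypothetical; sending the result is an ordinary HTTPS POST with any HTTP library:

```python
import json

def build_completions_request(endpoint, deployment, api_key, prompt,
                              max_tokens=16, temperature=1.0, n=1):
    # Compose the URL, headers, and JSON body for Completions - Create.
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/completions?api-version=2025-03-01-preview")
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({"prompt": [prompt], "max_tokens": max_tokens,
                       "temperature": temperature, "n": n})
    return url, headers, body

# POST the body to the URL with these headers (for example via urllib.request
# or the requests package); a 200 response carries a createCompletionResponse
# object like the example above.
```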

Embeddings - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-03-01-preview

Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.

URI Parameters

| Name | In | Required | Type | Description |
| --- | --- | --- | --- | --- |
| endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
| deployment-id | path | Yes | string | The deployment ID of the model that was deployed. |
| api-version | query | Yes | string | API version. |

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

| Name | Required | Type | Description |
| --- | --- | --- | --- |
| Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}. To generate an auth token using the Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com. Type: oauth2. Authorization URL: https://login.microsoftonline.com/common/oauth2/v2.0/authorize. Scope: https://cognitiveservices.azure.com/.default |
| api-key | True | string | Provide the Azure OpenAI API key here. |

Request Body

Content-Type: application/json

| Name | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| dimensions | integer | The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. | No | |
| encoding_format | string | The format to return the embeddings in. Can be either float or base64. Defaults to float. | No | |
| input | string or array | Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (8192 tokens for text-embedding-ada-002), cannot be an empty string, and any array must be 2048 dimensions or less. | Yes | |
| input_type | string | Input type of embedding search to use. | No | |
| user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse. | No | |

Responses

Status Code: 200

Description: OK

| Content-Type | Type | Description |
| --- | --- | --- |
| application/json | object | |

Examples

Example

Return the embeddings for a given prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-03-01-preview

{
 "input": [
  "this is a test"
 ]
}

Responses: Status Code: 200

{
  "body": {
    "data": [
      {
        "index": 0,
        "embedding": [
          -0.012838088,
          -0.007421397,
          -0.017617522,
          -0.028278312,
          -0.018666342,
          0.01737855,
          -0.01821495,
          -0.006950092,
          -0.009937238,
          -0.038580645,
          0.010674067,
          0.02412286,
          -0.013647936,
          0.013189907,
          0.0021125758,
          0.012406612,
          0.020790534,
          0.00074595667,
          0.008397198,
          -0.00535031,
          0.008968075,
          0.014351576,
          -0.014086051,
          0.015055214,
          -0.022211088,
          -0.025198232,
          0.0065186154,
          -0.036350243,
          0.009180495,
          -0.009698266,
          0.009446018,
          -0.008463579,
          -0.0040426035,
          -0.03443847,
          -0.00091273896,
          -0.0019217303,
          0.002349888,
          -0.021560553,
          0.016515596,
          -0.015572986,
          0.0038666942,
          -8.432463e-05,
          0.0032178196,
          -0.020365695
        ]
      }
    ],
    "usage": {
      "prompt_tokens": 4,
      "total_tokens": 4
    }
  }
}
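The returned vectors are plain arrays of floats under body.data[i].embedding, so downstream similarity math needs no special tooling. A small illustrative sketch comparing two embeddings with cosine similarity (the helper and the toy vectors are assumptions, not real model output):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# In practice, a and b would each be a response's data[0]["embedding"] list.
```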

Chat completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

Creates a completion for the chat message.

URI Parameters

| Name | In | Required | Type | Description |
| --- | --- | --- | --- | --- |
| endpoint | path | Yes | string (url) | Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com |
| deployment-id | path | Yes | string | The deployment ID of the model that was deployed. |
| api-version | query | Yes | string | API version. |

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

| Name | Required | Type | Description |
| --- | --- | --- | --- |
| Authorization | True | string | Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}. To generate an auth token using the Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com. Type: oauth2. Authorization URL: https://login.microsoftonline.com/common/oauth2/v2.0/authorize. Scope: https://cognitiveservices.azure.com/.default |
| api-key | True | string | Provide the Azure OpenAI API key here. |

Request Body

Content-Type: application/json

| Name | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| audio | object | Parameters for audio output. Required when audio output is requested with modalities: ["audio"]. | No | |
| └─ format | enum | Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16. | No | |
| └─ voice | enum | Specifies the voice type. Supported voices are alloy, echo, fable, onyx, nova, and shimmer. | No | |
| data_sources | array | The configuration entries for Azure OpenAI chat extensions that use them. This additional specification is only compatible with Azure OpenAI. | No | |
| frequency_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | No | 0 |
| function_call | string or chatCompletionFunctionCallOption | Deprecated in favor of tool_choice. Controls which (if any) function is called by the model. none means the model will not call a function and instead generates a message. auto means the model can pick between generating a message or calling a function. Specifying a particular function via {"name": "my_function"} forces the model to call that function. none is the default when no functions are present; auto is the default if functions are present. | No | |
| functions | array | Deprecated in favor of tools. A list of functions the model may generate JSON inputs for. | No | |
| logit_bias | object | Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. | No | None |
| logprobs | boolean | Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. | No | False |
| max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models; support will expand to other models in a future API release. | No | |
| max_tokens | integer | The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. | No | |
| messages | array | A list of messages comprising the conversation so far. | Yes | |
| metadata | object | Developer-defined tags and values used for filtering completions in the stored completions dashboard. | No | |
| modalities | ChatCompletionModalities | Output types that you would like the model to generate for this request. Most models can generate text, which is the default: ["text"]. The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, use: ["text", "audio"]. | No | |
| n | integer | How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. | No | 1 |
| parallel_tool_calls | ParallelToolCalls | Whether to enable parallel function calling during tool use. | No | True |
| prediction | PredictionContent | Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. | No | |
| presence_penalty | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | No | 0 |
| reasoning_effort | enum | o1 models only. Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. | No | |
| response_format | ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema | An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo, and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which guarantees the model will match your supplied JSON schema. Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. | No | |
| seed | integer | This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed; refer to the system_fingerprint response parameter to monitor changes in the backend. | No | |
| stop | string or array | Up to 4 sequences where the API will stop generating further tokens. | No | |
| store | boolean | Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. | No | |
| stream | boolean | If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. | No | False |
| stream_options | chatCompletionStreamOptions | Options for streaming response. Only set this when you set stream: true. | No | None |
| temperature | number | What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both. | No | 1 |
| tool_choice | chatCompletionToolChoiceOption | Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present; auto is the default if tools are present. | No | |
| tools | array | A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A maximum of 128 functions is supported. | No | |
| top_logprobs | integer | An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used. | No | |
| top_p | number | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. | No | 1 |
| user | string | A unique identifier representing your end-user, which can help to monitor and detect abuse. | No | |
| user_security_context | userSecurityContext | User security context contains several parameters that describe the AI application itself and the end user that interacts with it. These fields help your security operations teams investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud. | No | |
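When stream is set, the service emits data-only server-sent events terminated by a data: [DONE] message. As a hedged sketch (the helper is hypothetical, and the chunk shape assumed here follows the standard choices[].delta.content format of chat completion chunks), the streamed deltas can be accumulated like this:

```python
import json

def collect_stream(lines):
    """Accumulate delta content from raw SSE lines into the full message."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # the stream is terminated by a data: [DONE] message
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content") is not None:
                parts.append(delta["content"])
    return "".join(parts)
```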

Responses

Status Code: 200

Description: OK

| Content-Type | Type | Description |
| --- | --- | --- |
| application/json | object | |

Status Code: default

Description: Service unavailable

| Content-Type | Type | Description |
| --- | --- | --- |
| application/json | errorResponse | |

Examples

Example

Creates a completion for the provided prompt, parameters, and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "system",
   "content": "you are a helpful assistant that talks like a pirate"
  },
  {
   "role": "user",
   "content": "can you tell me how to care for a parrot?"
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Ahoy matey! So ye be wantin' to care for a fine squawkin' parrot, eh? Well, shiver me timbers, let ol' Cap'n Assistant share some wisdom with ye! Here be the steps to keepin' yer parrot happy 'n healthy:\n\n1. Secure a sturdy cage: Yer parrot be needin' a comfortable place to lay anchor! Be sure ye get a sturdy cage, at least double the size of the bird's wingspan, with enough space to spread their wings, yarrrr!\n\n2. Perches 'n toys: Aye, parrots need perches of different sizes, shapes, 'n textures to keep their feet healthy. Also, a few toys be helpin' to keep them entertained 'n their minds stimulated, arrrh!\n\n3. Proper grub: Feed yer feathered friend a balanced diet of high-quality pellets, fruits, 'n veggies to keep 'em strong 'n healthy. Give 'em fresh water every day, or ye\u00e2\u20ac\u2122ll have a scurvy bird on yer hands!\n\n4. Cleanliness: Swab their cage deck! Clean their cage on a regular basis: fresh water 'n food daily, the floor every couple of days, 'n a thorough scrubbing ev'ry few weeks, so the bird be livin' in a tidy haven, arrhh!\n\n5. Socialize 'n train: Parrots be a sociable lot, arrr! Exercise 'n interact with 'em daily to create a bond 'n maintain their mental 'n physical health. Train 'em with positive reinforcement, treat 'em kindly, yarrr!\n\n6. Proper rest: Yer parrot be needin' \u00e2\u20ac\u2122bout 10-12 hours o' sleep each night. Cover their cage 'n let them slumber in a dim, quiet quarter for a proper night's rest, ye scallywag!\n\n7. Keep a weather eye open for illness: Birds be hidin' their ailments, arrr! Be watchful for signs of sickness, such as lethargy, loss of appetite, puffin' up, or change in droppings, and make haste to a vet if need be.\n\n8. Provide fresh air 'n avoid toxins: Parrots be sensitive to draft and pollutants. Keep yer quarters well ventilated, but no drafts, arrr! Be mindful of toxins like Teflon fumes, candles, or air fresheners.\n\nSo there ye have it, me hearty! 
With proper care 'n commitment, yer parrot will be squawkin' \"Yo-ho-ho\" for many years to come! Good luck, sailor, and may the wind be at yer back!"
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
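Because the messages array carries the whole conversation so far, a multi-turn chat means re-sending the prior turns with each call. A minimal sketch of building the payload above (the helper name is illustrative):

```python
def make_payload(history, user_message):
    # The request body is just the running history plus the new user turn.
    return {"messages": history + [{"role": "user", "content": user_message}]}

history = [{"role": "system",
            "content": "you are a helpful assistant that talks like a pirate"}]
payload = make_payload(history, "can you tell me how to care for a parrot?")
# After a response arrives, append response["choices"][0]["message"] to
# history before building the payload for the next user turn.
```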

Example

Creates a completion based on Azure Search data and system-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "authentication": {
     "type": "system_assigned_managed_identity"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
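The inline [doc1]-style markers in the completion content refer to entries in message.context.citations (1-based). A small illustrative helper, assuming only that marker numbering indexes into the citations array, to map markers back to citation titles:

```python
import re

def resolve_citations(message):
    # Collect [docN] markers from the content and look up the Nth citation.
    citations = message.get("context", {}).get("citations", [])
    refs = {}
    for marker in re.findall(r"\[doc(\d+)\]", message.get("content", "")):
        idx = int(marker) - 1  # [doc1] refers to citations[0]
        if 0 <= idx < len(citations):
            refs[f"doc{marker}"] = citations[idx].get("title")
    return refs
```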

Example

Creates a completion based on Azure Search image vector data.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "query_type": "vector",
    "fields_mapping": {
     "image_vector_fields": [
      "image_vector"
     ]
    },
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion."
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on Azure Search vector data, previous assistant message and user-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a cat?"
  },
  {
   "role": "assistant",
   "content": "Content of the completion [doc1].",
   "context": {
    "intent": "cat care"
   }
  },
  {
   "role": "user",
   "content": "how about dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "authentication": {
     "type": "user_assigned_managed_identity",
     "managed_identity_resource_id": "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{resource-name}"
    },
    "index_name": "{index name}",
    "query_type": "vector",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "in_scope": true,
    "top_n_documents": 5,
    "strictness": 3,
    "role_information": "You are an AI assistant that helps people find information.",
    "fields_mapping": {
     "content_fields_separator": "\\n",
     "content_fields": [
      "content"
     ],
     "filepath_field": "filepath",
     "title_field": "title",
     "url_field": "url",
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content 2.",
                "title": "Citation Title 2",
                "filepath": "contoso2.txt",
                "url": "https://contoso.blob.windows.net/container/contoso2.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion using the provided Azure Cosmos DB data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_cosmos_db",
   "parameters": {
    "authentication": {
     "type": "connection_string",
     "connection_string": "mongodb+srv://rawantest:{password}$@{cluster-name}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
    },
    "database_name": "vectordb",
    "container_name": "azuredocs",
    "index_name": "azuredocindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a chat completion using the provided MongoDB data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "mongo_db",
   "parameters": {
    "authentication": {
     "type": "username_and_password",
     "username": "<username>",
     "password": "<password>"
    },
    "endpoint": "<endpoint_name>",
    "app_name": "<application name>",
    "database_name": "sampledb",
    "collection_name": "samplecollection",
    "index_name": "sampleindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a chat completion using the provided Elasticsearch data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "elasticsearch",
   "parameters": {
    "endpoint": "https://your-elasticsearch-endpoint.eastus.azurecontainer.io",
    "index_name": "{index name}",
    "authentication": {
     "type": "key_and_key_id",
     "key": "{key}",
     "key_id": "{key id}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a chat completion using the provided Pinecone data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-03-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "pinecone",
   "parameters": {
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    },
    "environment": "{environment name}",
    "index_name": "{index name}",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "title_field": "title",
     "url_field": "url",
     "filepath_field": "filepath",
     "content_fields": [
      "content"
     ],
     "content_fields_separator": "\n"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
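All of the data-source examples above share the same request shape: a chat `messages` array plus a `data_sources` list whose entries vary only in `type` and `parameters`. A minimal Python sketch of assembling such a body (the Pinecone placeholders mirror the example above; nothing is sent over the network here):

```python
import json

def build_on_your_data_request(user_message, data_source):
    """Assemble the chat-completions body used by the On Your Data examples.

    `data_source` is one dict of the shapes shown above
    (azure_cosmos_db, mongo_db, elasticsearch, pinecone, ...).
    """
    return {
        "messages": [{"role": "user", "content": user_message}],
        "data_sources": [data_source],
    }

# Illustrative Pinecone data source, mirroring the example above.
pinecone_source = {
    "type": "pinecone",
    "parameters": {
        "authentication": {"type": "api_key", "key": "{api key}"},
        "environment": "{environment name}",
        "index_name": "{index name}",
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "{embedding deployment name}",
        },
    },
}

body = build_on_your_data_request(
    "can you tell me how to care for a dog?", pinecone_source)
print(json.dumps(body, indent=1))
```

The same builder works for any of the data-source types in this section; only the `data_source` dict changes.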

Transcriptions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-03-01-preview

Transcribes audio into the input language.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here
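The same two header options apply to every endpoint in this reference. A small sketch of building the header dict for either mode (the function name is illustrative):

```python
def auth_headers(api_key=None, bearer_token=None):
    """Headers for one of the two auth modes described above.

    Prefer the Microsoft Entra bearer token when both are supplied,
    since token-based authentication is the recommended mode.
    """
    if bearer_token:
        return {"Authorization": f"Bearer {bearer_token}"}
    if api_key:
        return {"api-key": api_key}
    raise ValueError("provide an API key or a Microsoft Entra token")

headers = auth_headers(api_key="{your-api-key}")
```

Pass the resulting dict as the HTTP headers of any request shown in this article.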

Request Body

Content-Type: multipart/form-data

Name Type Description Required Default
file string The audio file object to transcribe. Yes
language string The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. No
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0
timestamp_granularities[] array The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word and segment. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency. No ['segment']
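Because the body is multipart/form-data, the non-file parameters above are sent as individual form fields, with `timestamp_granularities[]` repeated once per granularity. A sketch of building the URL and form fields (the deployment name is a placeholder, and no request is actually sent):

```python
from urllib.parse import urlencode

def transcription_url(endpoint, deployment_id, api_version="2025-03-01-preview"):
    """Build the request URL shown in the POST line above."""
    return (f"{endpoint}/openai/deployments/{deployment_id}"
            f"/audio/transcriptions?{urlencode({'api-version': api_version})}")

def transcription_fields(language=None, prompt=None,
                         response_format="verbose_json",
                         granularities=("word", "segment")):
    """Non-file form fields for the multipart/form-data body.

    Word-level timestamps are only honored when response_format
    is verbose_json.
    """
    fields = [("response_format", response_format)]
    if language:
        fields.append(("language", language))  # ISO-639-1 code, e.g. "en"
    if prompt:
        fields.append(("prompt", prompt))
    # Repeated field, one entry per requested granularity.
    fields += [("timestamp_granularities[]", g) for g in granularities]
    return fields

url = transcription_url("https://{your-resource-name}.openai.azure.com",
                        "{deployment-id}")
fields = transcription_fields(language="en")
```

The `file` part itself is attached separately as the binary audio content, as in the multipart example later in this section.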

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json object
text/plain string Transcribed text in the output format (when response_format was one of text, vtt or srt).

Examples

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-03-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}

Translations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-03-01-preview

Transcribes and translates input audio into English text.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name Type Description Required Default
file string The audio file to translate. Yes
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json object
text/plain string Transcribed text in the output format (when response_format was one of text, vtt or srt).

Examples

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-03-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-03-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}

Speech - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-03-01-preview

Generates audio from the input text.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
input string The text to synthesize audio for. The maximum length is 4096 characters. Yes
response_format enum The format to synthesize the audio in.
Possible values: mp3, opus, aac, flac, wav, pcm
No
speed number The speed of the synthesized audio. Select a value from 0.25 to 4.0. 1.0 is the default. No 1.0
voice enum The voice to use for speech synthesis.
Possible values: alloy, echo, fable, onyx, nova, shimmer
Yes
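The constraints in the table above (allowed voices and formats, the 0.25–4.0 speed range, the 4096-character input limit) can be checked client-side before the request is sent. A minimal validation sketch (the helper is illustrative, not part of any SDK):

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}

def speech_body(text, voice, response_format="mp3", speed=1.0):
    """Validate and assemble the JSON body for /audio/speech."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown format: {response_format}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    if len(text) > 4096:
        raise ValueError("input is limited to 4096 characters")
    return {"input": text, "voice": voice,
            "response_format": response_format, "speed": speed}

body = speech_body("Hi! What are you going to make?", "fable")
```

The successful response is raw audio bytes (application/octet-stream), so write the response body directly to a file with the matching extension.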

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/octet-stream string

Examples

Example

Synthesizes audio from the provided text.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-03-01-preview

{
 "input": "Hi! What are you going to make?",
 "voice": "fable",
 "response_format": "mp3"
}

Responses: Status Code: 200

{
  "body": "101010101"
}

Image generations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-03-01-preview

Generates a batch of images from a text caption on a given DALL-E model deployment.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id path Yes string
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
n integer The number of images to generate. No 1
prompt string A text description of the desired image(s). The maximum length is 4000 characters. Yes
quality imageQuality The quality of the image that will be generated. No standard
response_format imagesResponseFormat The format in which the generated images are returned. No url
size imageSize The size of the generated images. No 1024x1024
style imageStyle The style of the generated images. No vivid
user string A unique identifier representing your end-user, which can help to monitor and detect abuse. No
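Only `prompt` is required; the other fields have the defaults listed above. A sketch of building the JSON body with those defaults made explicit (the function is illustrative):

```python
def image_generation_body(prompt, n=1, size="1024x1024",
                          quality="standard", style="vivid",
                          response_format="url", user=None):
    """JSON body for /images/generations with the documented defaults."""
    if len(prompt) > 4000:
        raise ValueError("prompt is limited to 4000 characters")
    body = {"prompt": prompt, "n": n, "size": size,
            "quality": quality, "style": style,
            "response_format": response_format}
    if user is not None:
        body["user"] = user  # optional end-user identifier for abuse monitoring
    return body

body = image_generation_body(
    "In the style of WordArt, Microsoft Clippy wearing a cowboy hat.",
    style="natural")
```

With `response_format` left at `url`, each generated image in the response carries a time-limited download URL, as in the example below.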

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json generateImagesResponse

Status Code: default

Description: An error occurred.

Content-Type Type Description
application/json dalleErrorResponse

Examples

Example

Creates images given a prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-03-01-preview

{
 "prompt": "In the style of WordArt, Microsoft Clippy wearing a cowboy hat.",
 "n": 1,
 "style": "natural",
 "quality": "standard"
}

Responses: Status Code: 200

{
  "body": {
    "created": 1698342300,
    "data": [
      {
        "revised_prompt": "A vivid, natural representation of Microsoft Clippy wearing a cowboy hat.",
        "prompt_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          },
          "profanity": {
            "detected": false,
            "filtered": false
          },
          "custom_blocklists": {
            "filtered": false,
            "details": []
          }
        },
        "url": "https://dalletipusw2.blob.core.windows.net/private/images/e5451cc6-b1ad-4747-bd46-b89a3a3b8bc3/generated_00.png?se=2023-10-27T17%3A45%3A09Z&...",
        "content_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          }
        }
      }
    ]
  }
}

List - Assistants

GET https://{endpoint}/openai/assistants?api-version=2025-03-01-preview

Returns a list of assistants.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
limit query No integer A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
after query No string A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before query No string A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version query Yes string
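The `after` cursor works the same way on every list endpoint in this reference: pass the `last_id` of one page as `after` on the next request until `has_more` is false. A minimal sketch of that loop, with a stub fetch function standing in for the real GET request (the stub and its data are illustrative):

```python
def list_all(fetch_page, limit=20):
    """Walk a cursor-paginated list endpoint until has_more is false.

    fetch_page(limit, after) stands in for the HTTP GET and must return
    a dict shaped like listAssistantsResponse.
    """
    items, after = [], None
    while True:
        page = fetch_page(limit=limit, after=after)
        items.extend(page["data"])
        if not page["has_more"]:
            return items
        after = page["last_id"]  # next request starts after this object

# Stub standing in for two pages of results.
_pages = [
    {"data": [{"id": "asst_a"}, {"id": "asst_b"}],
     "last_id": "asst_b", "has_more": True},
    {"data": [{"id": "asst_c"}],
     "last_id": "asst_c", "has_more": False},
]

def fake_fetch(limit, after):
    return _pages[0] if after is None else _pages[1]

assistants = list_all(fake_fetch)
```

To page backwards instead, use `before` with the `first_id` of the current page.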

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listAssistantsResponse

Examples

Example

Returns a list of assistants.

GET https://{endpoint}/openai/assistants?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "asst_abc123",
        "object": "assistant",
        "created_at": 1707257477,
        "name": "Stock Analyst",
        "description": null,
        "model": "gpt-4-1106-preview",
        "instructions": "You are a financial analyst that analyzes stock market prices and other financial data present on user uploaded files or by calling external APIs.",
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc456",
        "object": "assistant",
        "created_at": 1698982718,
        "name": "My Assistant",
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": "You are a helpful assistant designed to make me better at coding!",
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc789",
        "object": "assistant",
        "created_at": 1698982643,
        "name": null,
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      }
    ],
    "first_id": "asst_abc123",
    "last_id": "asst_abc789",
    "has_more": false
  }
}

Create - Assistant

POST https://{endpoint}/openai/assistants?api-version=2025-03-01-preview

Create an assistant with a model and instructions.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 256,000 characters.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
model string Yes
name string The name of the assistant. The maximum length is 256 characters.
No
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
└─ vector_stores array A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function.
No []
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json assistantObject

Examples

Example

Create an assistant with a model and instructions.

POST https://{endpoint}/openai/assistants?api-version=2025-03-01-preview

{
 "name": "Math Tutor",
 "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
 "tools": [
  {
   "type": "code_interpreter"
  }
 ],
 "model": "gpt-4-1106-preview"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant",
    "created_at": 1707295707,
    "name": "Math Tutor",
    "description": null,
    "model": "gpt-4-1106-preview",
    "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}

Get - Assistant

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview

Retrieves an assistant.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id path Yes string The ID of the assistant to retrieve.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json assistantObject

Examples

Example

Retrieves an assistant.

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_abc123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}

Modify - Assistant

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview

Modifies an assistant.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id path Yes string The ID of the assistant to modify.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 32768 characters.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
model string No
name string The name of the assistant. The maximum length is 256 characters.
No
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array Overrides the list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array Overrides the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function.
No []
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json assistantObject

Examples

Example

Modifies an assistant.

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview

{
 "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.",
 "tools": [
  {
   "type": "file_search"
  }
 ],
 "model": "gpt-4-turbo"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": []
      }
    },
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}
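As a sketch, the Modify Assistant request above can be assembled with only the Python standard library; the endpoint, assistant ID, and API key below are placeholders for your own values, and actually sending the request requires a live Azure OpenAI resource.

```python
# Build the "Modify - Assistant" request shown above (placeholder values).
import json
from urllib.parse import urlencode

ENDPOINT = "https://aoairesource.openai.azure.com"   # your resource endpoint
ASSISTANT_ID = "asst_abc123"                         # assistant to modify
API_VERSION = "2025-03-01-preview"

# POST {endpoint}/openai/assistants/{assistant_id}?api-version=...
url = (f"{ENDPOINT}/openai/assistants/{ASSISTANT_ID}?"
       + urlencode({"api-version": API_VERSION}))

body = json.dumps({
    "instructions": "You are an HR bot, and you have access to files to "
                    "answer employee questions about company policies.",
    "tools": [{"type": "file_search"}],
    "model": "gpt-4-turbo",
})

# To send, pass `url` and `body` to any HTTP client with either the
# `api-key` header or an `Authorization: Bearer ...` header, e.g.:
# urllib.request.Request(url, data=body.encode(), method="POST",
#     headers={"api-key": "AZURE_OPENAI_KEY",
#              "Content-Type": "application/json"})
```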

Delete - Assistant

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview

Delete an assistant.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id path Yes string The ID of the assistant to delete.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here
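The two authentication styles in the header table map to two different HTTP headers. A minimal sketch (the helper name and credential values are illustrative, not part of the API):

```python
# Build request headers for either authentication method.
def build_headers(api_key=None, bearer_token=None):
    """Return headers for an Azure OpenAI data-plane call.

    Provide exactly one of api_key (API-key auth) or bearer_token
    (Microsoft Entra ID auth).
    """
    if (api_key is None) == (bearer_token is None):
        raise ValueError("Provide exactly one of api_key or bearer_token")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["api-key"] = api_key                        # API-key auth
    else:
        headers["Authorization"] = f"Bearer {bearer_token}"  # Entra ID auth
    return headers

# Example with a placeholder key; a real token would come from
# `az account get-access-token --resource https://cognitiveservices.azure.com`.
headers = build_headers(api_key="AZURE_OPENAI_KEY")
```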

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteAssistantResponse

Examples

Example

Deletes an assistant.

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant.deleted",
    "deleted": true
  }
}

Create - Thread

POST https://{endpoint}/openai/threads?api-version=2025-03-01-preview

Create a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
messages array A list of messages to start the thread with. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.
No
└─ vector_stores array A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of 1 vector store attached to the thread.
No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json threadObject

Examples

Example

Creates a thread.

POST https://{endpoint}/openai/threads?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {}
  }
}
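The example above creates an empty thread, but the request body table also allows seeding the thread with messages and tool resources. A hypothetical body exercising those optional fields (all file and vector store IDs are placeholders), within the documented limits of 20 code_interpreter files and 1 vector store:

```python
# Hypothetical createThreadRequest body using the optional fields.
import json

create_thread_body = {
    "messages": [
        {"role": "user", "content": "Summarize the attached policy."}
    ],
    "tool_resources": {
        "code_interpreter": {"file_ids": ["file-abc123"]},   # up to 20 files
        "file_search": {"vector_store_ids": ["vs_abc123"]},  # at most 1 store
    },
    "metadata": {"project": "hr-demo"},
}

# POST this as JSON to /openai/threads?api-version=2025-03-01-preview
body_json = json.dumps(create_thread_body)
```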

Get - Thread

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview

Retrieves a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to retrieve.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json threadObject

Examples

Example

Retrieves a thread.

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {},
    "tool_resources": {
      "code_interpreter": {
        "file_ids": []
      }
    }
  }
}

Modify - Thread

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview

Modifies a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to modify. Only the metadata can be modified.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of File IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.
No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json threadObject

Examples

Example

Modifies a thread.

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {
      "modified": "true",
      "user": "abc123"
    },
    "tool_resources": {}
  }
}
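The metadata limits stated in the request body table (at most 16 key-value pairs, keys up to 64 characters, values up to 512 characters) can be checked client-side before sending. A sketch of such a validator (the function name is illustrative):

```python
# Client-side validation of the documented metadata limits.
def validate_metadata(metadata):
    """Raise ValueError if metadata exceeds the documented limits:
    at most 16 pairs, keys <= 64 chars, values <= 512 chars."""
    if len(metadata) > 16:
        raise ValueError("metadata may contain at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long: {key!r}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value too long for key {key!r}")

validate_metadata({"modified": "true", "user": "abc123"})  # passes silently
```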

Delete - Thread

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview

Delete a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to delete.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteThreadResponse

Examples

Example

Deletes a thread.

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread.deleted",
    "deleted": true
  }
}

List - Messages

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview

Returns a list of messages for a given thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread the messages belong to.
limit query No integer A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
after query No string A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before query No string A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
run_id query No string Filter messages by the run ID that generated them.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listMessagesResponse

Examples

Example

List Messages

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "msg_abc123",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "How does AI work? Explain it in simple terms.",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      },
      {
        "id": "msg_abc456",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "Hello, what is AI?",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      }
    ],
    "first_id": "msg_abc123",
    "last_id": "msg_abc456",
    "has_more": false
  }
}
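The limit, order, after, and before query parameters combine into cursor pagination: each page's last_id becomes the next request's after cursor until has_more is false. A sketch of the cursor bookkeeping (the helper name is illustrative; sending the request itself is omitted):

```python
# Derive the query parameters for the next page of a list response.
def next_page_params(page, limit=20):
    """Given a list response body (e.g. listMessagesResponse), return the
    query params for the next page, or None when there are no more results."""
    if not page.get("has_more"):
        return None
    return {"limit": limit, "order": "asc", "after": page["last_id"]}

# Using the example response above:
page = {"object": "list", "first_id": "msg_abc123",
        "last_id": "msg_abc456", "has_more": True}
params = next_page_params(page)  # -> after=msg_abc456 for the next GET
```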

Create - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview

Create a message.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to create a message for.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
attachments array A list of files attached to the message, and the tools they should be added to. No
content string The content of the message. Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
role string The role of the entity that is creating the message. Allowed values include:
- user: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages.
- assistant: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.
Yes

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json messageObject

Examples

Example

Create a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-03-01-preview

{
 "role": "user",
 "content": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces."
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "assistant_id": null,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "attachments": [],
    "metadata": {}
  }
}

Get - Message

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview

Retrieve a message.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to which this message belongs.
message_id path Yes string The ID of the message to retrieve.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json messageObject

Examples

Example

Retrieve a message.

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "assistant_id": null,
    "run_id": null,
    "metadata": {}
  }
}

Modify - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview

Modifies a message.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to which this message belongs.
message_id path Yes string The ID of the message to modify.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json messageObject

Examples

Example

Modify a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-03-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_abc123",
    "object": "thread.message",
    "created_at": 1699017614,
    "assistant_id": null,
    "thread_id": "thread_abc123",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "How does AI work? Explain it in simple terms.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "metadata": {
      "modified": "true",
      "user": "abc123"
    }
  }
}

Create - Thread And Run

POST https://{endpoint}/openai/threads/runs?api-version=2025-03-01-preview

Create a thread and run it in one request.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
assistant_id string The ID of the assistant to use to execute this run. Yes
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
model string The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.
No
stream_options chatCompletionStreamOptions Options for streaming response. Only set this when you set stream: true.
No None
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
thread createThreadRequest No
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model will not call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
No
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject

Examples

Example

Create a thread and run it in one request.

POST https://{endpoint}/openai/threads/runs?api-version=2025-03-01-preview

{
 "assistant_id": "asst_abc123",
 "thread": {
  "messages": [
   {
    "role": "user",
    "content": "Explain deep learning to a 5 year old."
   }
  ]
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076792,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": null,
    "expires_at": 1699077392,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "required_action": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You are a helpful assistant.",
    "tools": [],
    "tool_resources": {},
    "metadata": {},
    "temperature": 1.0,
    "top_p": 1.0,
    "max_completion_tokens": null,
    "max_prompt_tokens": null,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "incomplete_details": null,
    "usage": null,
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
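A run created this way starts in the queued status shown above, so clients typically poll it until it reaches a terminal state (completed, failed, cancelled, expired, or incomplete, per the status fields in these examples). A polling sketch, where fetch_run stands in for a GET on the run (the retrieve-run endpoint is documented elsewhere in this reference; the function names here are illustrative):

```python
# Poll a run until it reaches a terminal status.
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired", "incomplete"}

def is_terminal(status):
    return status in TERMINAL_STATUSES

def poll_run(fetch_run, interval_s=1.0, max_polls=60):
    """Call fetch_run() (a placeholder for GET .../threads/{thread_id}/
    runs/{run_id}) until the run's status is terminal, then return it."""
    for _ in range(max_polls):
        run = fetch_run()
        if is_terminal(run["status"]):
            return run
        time.sleep(interval_s)
    raise TimeoutError("run did not reach a terminal state in time")
```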

List - Runs

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview

Returns a list of runs belonging to a thread.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread the run belongs to.
limit query No integer A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
after query No string A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before query No string A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listRunsResponse

Examples

Example

Returns a list of runs belonging to a thread.

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "run_abc123",
        "object": "thread.run",
        "created_at": 1699075072,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699075072,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699075073,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      },
      {
        "id": "run_abc456",
        "object": "thread.run",
        "created_at": 1699063290,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699063290,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699063291,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      }
    ],
    "first_id": "run_abc123",
    "last_id": "run_abc456",
    "has_more": false
  }
}
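The `limit`, `order`, `after`, and `before` parameters drive cursor pagination: request a page, then pass that page's `last_id` as `after` to fetch the next page while `has_more` is true. A sketch of the URL construction using only the Python standard library (the helper name is illustrative, not part of any SDK):

```python
from urllib.parse import urlencode

API_VERSION = "2025-03-01-preview"

def list_runs_url(endpoint, thread_id, limit=20, order="desc", after=None):
    """Build the List - Runs URL; pass the previous page's last_id as `after`."""
    params = {"api-version": API_VERSION, "limit": limit, "order": order}
    if after is not None:
        params["after"] = after
    return f"{endpoint}/openai/threads/{thread_id}/runs?{urlencode(params)}"

# First page, then follow the cursor while the response reports has_more: true.
first_page = list_runs_url("https://aoairesource.openai.azure.com", "thread_abc123")
next_page = list_runs_url("https://aoairesource.openai.azure.com", "thread_abc123",
                          after="run_abc456")
```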

Create - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview

Create a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to run.
include[] query No array A list of additional fields to include in the response. Currently the only supported value is step_details.tool_calls[*].file_search.results[*].content to fetch the file search result content.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
additional_instructions string Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions. No
additional_messages array Adds additional messages to the thread before creating the run. No
assistant_id string The ID of the assistant to use to execute this run. Yes
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info. No
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No
model string The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length. No
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. No 1
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model will not call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both. No 1
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject

Examples

Example

Create a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-03-01-preview

{
 "assistant_id": "asst_abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699063290,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": 1699063290,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699063291,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
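A common pattern is to create a run (which starts out queued, as in the example above) and then poll Get - Run until it reaches a terminal status. The sketch below builds the Create - Run request with the standard library and polls through a caller-supplied fetch function; the helper names and the exact set of terminal statuses are our assumptions, inferred from the status values shown in this reference:

```python
import json
import time
import urllib.request

API_VERSION = "2025-03-01-preview"
# Statuses after which a run stops changing (assumed from this reference).
TERMINAL_STATUSES = {"completed", "cancelled", "expired", "failed", "incomplete"}

def create_run_request(endpoint, thread_id, assistant_id, api_key, **overrides):
    """Build the Create - Run POST; only assistant_id is required in the body."""
    body = {"assistant_id": assistant_id, **overrides}
    url = f"{endpoint}/openai/threads/{thread_id}/runs?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

def wait_for_run(fetch_run, poll_seconds=1.0):
    """Call fetch_run() (e.g. a Get - Run call) until a terminal status appears."""
    while True:
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        time.sleep(poll_seconds)
```

Per-run overrides such as `instructions` or `model` can be passed as keyword arguments and are merged into the request body.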

Get - Run

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview

Retrieves a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread that was run.
run_id path Yes string The ID of the run to retrieve.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject

Examples

Example

Gets a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_HsO8tYM4K5AAMAHgK0J3om8Q",
    "object": "thread.run",
    "created_at": 1707303196,
    "assistant_id": "asst_JtTwHk28cIocgFXZPCBxhOzl",
    "thread_id": "thread_eRNwflE3ncDYak1np6MdMHJh",
    "status": "completed",
    "started_at": 1707303197,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1707303201,
    "last_error": null,
    "model": "gpt-4-1106-preview",
    "instructions": "You are an AI model that empowers every person and every organization on the planet to achieve more.",
    "tools": [],
    "file_ids": [],
    "metadata": {}
  }
}

Modify - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview

Modifies a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread that was run.
run_id path Yes string The ID of the run to modify.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject

Examples

Example

Modifies a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-03-01-preview

{
 "metadata": {
  "user_id": "user_abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699075072,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "completed",
    "started_at": 1699075072,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699075073,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "tool_resources": {
      "code_interpreter": {
        "file_ids": [
          "file-abc123",
          "file-abc456"
        ]
      }
    },
    "metadata": {
      "user_id": "user_abc123"
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    },
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}

Submit - Tool Outputs To Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-03-01-preview

When a run has the status requires_action and required_action.type is submit_tool_outputs, use this endpoint to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to which this run belongs.
run_id path Yes string The ID of the run that requires the tool output submission.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message. No
tool_outputs array A list of tools for which the outputs are being submitted. Yes

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject

Examples

Example

When a run has the status requires_action and required_action.type is submit_tool_outputs, use this endpoint to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-03-01-preview

{
 "tool_outputs": [
  {
   "tool_call_id": "call_001",
   "output": "70 degrees and sunny."
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_123",
    "object": "thread.run",
    "created_at": 1699075592,
    "assistant_id": "asst_123",
    "thread_id": "thread_123",
    "status": "queued",
    "started_at": 1699075592,
    "expires_at": 1699076192,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": [
                  "celsius",
                  "fahrenheit"
                ]
              }
            },
            "required": [
              "location"
            ]
          }
        }
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
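Assembling the tool_outputs array typically means reading the run's required_action.submit_tool_outputs.tool_calls list, executing each requested function call, and pairing each result with its tool_call_id. A sketch under those assumptions (the handler-registry shape is our convention, not part of the API):

```python
import json

def build_tool_outputs(run, handlers):
    """Execute each requested function call and pair results with tool_call_ids."""
    calls = run["required_action"]["submit_tool_outputs"]["tool_calls"]
    outputs = []
    for call in calls:
        fn = call["function"]
        # handlers maps a function name to a Python callable (our convention);
        # the model supplies the arguments as a JSON string.
        result = handlers[fn["name"]](**json.loads(fn["arguments"]))
        outputs.append({"tool_call_id": call["id"], "output": str(result)})
    return {"tool_outputs": outputs}
```

The returned dict is the request body to POST to the submit_tool_outputs endpoint shown above; remember that all outputs must go in a single request.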

Cancel - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-03-01-preview

Cancels a run that is in_progress.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to which this run belongs.
run_id path Yes string The ID of the run to cancel.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runObject

Examples

Example

Cancels a run that is in_progress.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076126,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "cancelling",
    "started_at": 1699076126,
    "expires_at": 1699076726,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You summarize books.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": [
          "vs_123"
        ]
      }
    },
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "response_format": "auto"
  }
}

List - Run Steps

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-03-01-preview

Returns a list of run steps belonging to a run.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread the run and run steps belong to.
run_id path Yes string The ID of the run the run steps belong to.
limit query No integer A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
after query No string A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before query No string A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version query Yes string
include[] query No array A list of additional fields to include in the response. Currently the only supported value is step_details.tool_calls[*].file_search.results[*].content to fetch the file search result content.

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listRunStepsResponse

Examples

Example

Returns a list of run steps belonging to a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "step_abc123",
        "object": "thread.run.step",
        "created_at": 1699063291,
        "run_id": "run_abc123",
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "type": "message_creation",
        "status": "completed",
        "cancelled_at": null,
        "completed_at": 1699063291,
        "expired_at": null,
        "failed_at": null,
        "last_error": null,
        "step_details": {
          "type": "message_creation",
          "message_creation": {
            "message_id": "msg_abc123"
          }
        },
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        }
      }
    ],
    "first_id": "step_abc123",
    "last_id": "step_abc456",
    "has_more": false
  }
}
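Steps of type message_creation carry the ID of the message they produced in step_details.message_creation.message_id. Collecting those IDs from a listRunStepsResponse page might look like the following (the helper name is illustrative):

```python
def message_ids_from_steps(steps_page):
    """Pull message IDs out of message_creation steps in a list response body."""
    return [
        step["step_details"]["message_creation"]["message_id"]
        for step in steps_page["data"]
        if step["type"] == "message_creation"
    ]
```

With the example response above this yields ["msg_abc123"], which can then be fetched via the messages endpoints.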

Get - Run Step

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-03-01-preview

Retrieves a run step.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id path Yes string The ID of the thread to which the run and run step belongs.
run_id path Yes string The ID of the run to which the run step belongs.
step_id path Yes string The ID of the run step to retrieve.
include[] query No array A list of additional fields to include in the response. Currently the only supported value is step_details.tool_calls[*].file_search.results[*].content to fetch the file search result content.
api-version query Yes string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json runStepObject

Examples

Example

Retrieves a run step.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "step_abc123",
    "object": "thread.run.step",
    "created_at": 1699063291,
    "run_id": "run_abc123",
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "type": "message_creation",
    "status": "completed",
    "cancelled_at": null,
    "completed_at": 1699063291,
    "expired_at": null,
    "failed_at": null,
    "last_error": null,
    "step_details": {
      "type": "message_creation",
      "message_creation": {
        "message_id": "msg_abc123"
      }
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    }
  }
}

List - Vector Stores

GET https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview

Returns a list of vector stores.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
limit query No integer A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
after query No string A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before query No string A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version query Yes string api version

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listVectorStoresResponse

Examples

Example

Returns a list of vector stores.

GET https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "vs_abc123",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      },
      {
        "id": "vs_abc456",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ v2",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      }
    ],
    "first_id": "vs_abc123",
    "last_id": "vs_abc456",
    "has_more": false
  }
}

Create - Vector Store

POST https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview

Create a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version query Yes string api version

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
chunking_strategy autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. Only applicable if file_ids is non-empty. No
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
file_ids array A list of file IDs that the vector store should use. Useful for tools like file_search that can access files. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long. No
name string The name of the vector store. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreObject

Examples

Example

Creates a vector store.

POST https://{endpoint}/openai/vector_stores?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}
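Every field of the Create - Vector Store request body is optional, so a request builder only needs to include what you set. A sketch (the expires_after shape with an anchor of last_active_at is our reading of the vectorStoreExpirationAfter schema; treat it as an assumption):

```python
def create_vector_store_body(name=None, file_ids=None, expires_after_days=None):
    """Build a Create - Vector Store request body; every field is optional."""
    body = {}
    if name is not None:
        body["name"] = name
    if file_ids:
        # Files the vector store should index, e.g. for the file_search tool.
        body["file_ids"] = file_ids
    if expires_after_days is not None:
        # Assumed vectorStoreExpirationAfter shape: an anchor timestamp plus days.
        body["expires_after"] = {"anchor": "last_active_at", "days": expires_after_days}
    return body

body = create_vector_store_body(name="Support FAQ", file_ids=["file-abc123"])
```

POSTing an empty body `{}` is also valid and creates an unnamed, empty vector store. When file_ids are supplied, poll the store's file_counts until in_progress reaches 0 before relying on search results.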

Get - Vector Store

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview

Retrieves a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store to retrieve.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreObject

Examples

Example

Retrieves a vector store.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776
  }
}

Modify - Vector Store

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview

Modifies a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store to modify.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
name string The name of the vector store. No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreObject

Examples

Example

Modifies a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview

{
 "name": "Support FAQ"
}

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}

Delete - Vector Store

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview

Delete a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store to delete.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteVectorStoreResponse

Examples

Example

Deletes a vector store.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store.deleted",
    "deleted": true
  }
}

List - Vector Store Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview

Returns a list of vector store files.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store that the files belong to.
limit query No integer A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
after query No string A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before query No string A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
filter query No string
Possible values: in_progress, completed, failed, cancelled
Filter by file status. One of in_progress, completed, failed, cancelled.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}
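The `after` cursor described in the table above drives forward pagination. A minimal sketch of the loop, where `fetch_page` stands in for the real GET call (not shown) and must return the `listVectorStoreFilesResponse` shape from the example:

```python
def list_all_files(fetch_page, limit=20):
    """Collect every file by following the `after` cursor until has_more is false.

    fetch_page(after, limit) stands in for a GET to
    /vector_stores/{vector_store_id}/files and returns one page of results.
    """
    files, after = [], None
    while True:
        page = fetch_page(after=after, limit=limit)
        files.extend(page["data"])
        if not page["has_more"]:
            return files
        after = page["last_id"]  # the next request resumes after the last object

# Simulated two-page listing to exercise the loop.
pages = {
    None: {"data": [{"id": "file-1"}, {"id": "file-2"}],
           "last_id": "file-2", "has_more": True},
    "file-2": {"data": [{"id": "file-3"}],
               "last_id": "file-3", "has_more": False},
}
all_files = list_all_files(lambda after, limit: pages[after])
```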

Create - Vector Store File

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview

Create a vector store file by attaching a file to a vector store.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store for which to create a File.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No
file_id string A File ID that the vector store should use. Useful for tools like file_search that can access files. Yes

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileObject

Examples

Example

Create a vector store file by attaching a File to a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-03-01-preview

{
 "file_id": "file-abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "usage_bytes": 1234,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Get - Vector Store File

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview

Retrieves a vector store file.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store that the file belongs to.
file_id path Yes string The ID of the file being retrieved.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileObject

Examples

Example

Retrieves a vector store file.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Delete - Vector Store File

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview

Delete a vector store file. This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store that the file belongs to.
file_id path Yes string The ID of the file to delete.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json deleteVectorStoreFileResponse

Examples

Example

Delete a vector store file. This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file_abc123",
    "object": "vector_store.file.deleted",
    "deleted": true
  }
}

Update - Vector Store File Attributes

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-03-01-preview

Update attributes on a vector store file.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store that the file belongs to.
file_id path Yes string The ID of the file whose attributes are being updated.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
attributes VectorStoreFileAttributes Set of 16 key-value pairs that can be attached to an object. This can be
useful for storing additional information about the object in a structured
format, and querying for objects via API or the dashboard. Keys are strings
with a maximum length of 64 characters. Values are strings with a maximum
length of 512 characters, booleans, or numbers.
Yes

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileObject
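The attributes constraints above (at most 16 key-value pairs, keys up to 64 characters, values that are strings up to 512 characters, booleans, or numbers) can be checked client-side before sending. A sketch, with hypothetical attribute names:

```python
def validate_attributes(attributes: dict) -> dict:
    """Validate a VectorStoreFileAttributes payload against the documented limits."""
    if len(attributes) > 16:
        raise ValueError("at most 16 key-value pairs are allowed")
    for key, value in attributes.items():
        if not isinstance(key, str) or len(key) > 64:
            raise ValueError(f"key {key!r} must be a string of <= 64 characters")
        if isinstance(value, str):
            if len(value) > 512:
                raise ValueError(f"value for {key!r} exceeds 512 characters")
        elif not isinstance(value, (bool, int, float)):
            raise ValueError(f"value for {key!r} must be a string, boolean, or number")
    return {"attributes": attributes}

# Hypothetical attributes; the keys are illustrative, not defined by the API.
payload = validate_attributes({"team": "support", "reviewed": True, "priority": 2})
```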

Retrieve vector store file content

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}/content?api-version=2025-03-01-preview

Retrieve the parsed contents of a vector store file.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store.
file_id path Yes string The ID of the file within the vector store.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json VectorStoreFileContentResponse

Search vector store

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/search?api-version=2025-03-01-preview

Search a vector store for relevant chunks based on a query and file attributes filter.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store to search.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
filters ComparisonFilter or CompoundFilter A filter to apply based on file attributes. No
max_num_results integer The maximum number of results to return. This number should be between 1 and 50 inclusive. No 10
query string or array A query string for a search. Yes
ranking_options object Ranking options for search. No
└─ ranker enum
Possible values: auto, default-2024-11-15
No
└─ score_threshold number No 0
rewrite_query boolean Whether to rewrite the natural language query for vector search. No False

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json VectorStoreSearchResultsPage
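A sketch of assembling the search request body from the parameters above. The filter shape and attribute names here are hypothetical illustrations of a `ComparisonFilter`, not values taken from this article:

```python
def build_search_body(query, max_num_results=10, filters=None, rewrite_query=False):
    """Build the JSON body for the vector store search endpoint.

    max_num_results must be between 1 and 50 inclusive per the table above.
    """
    if not 1 <= max_num_results <= 50:
        raise ValueError("max_num_results must be between 1 and 50")
    body = {"query": query, "max_num_results": max_num_results,
            "rewrite_query": rewrite_query}
    if filters is not None:  # filters is optional
        body["filters"] = filters
    return body

# Hypothetical attribute filter: only search chunks from files tagged team=support.
body = build_search_body(
    "How do I reset my password?",
    max_num_results=5,
    filters={"type": "eq", "key": "team", "value": "support"})
```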

Create - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-03-01-preview

Create a vector store file batch.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store for which to create a File Batch.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name Type Description Required Default
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No
file_ids array A list of File IDs that the vector store should use. Useful for tools like file_search that can access files. Yes

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileBatchObject

Examples

Example

Create a vector store file batch.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-03-01-preview

{
 "file_ids": [
  "file-abc123",
  "file-abc456"
 ]
}

Responses: Status Code: 200

{
  "id": "vsfb_abc123",
  "object": "vector_store.file_batch",
  "created_at": 1699061776,
  "vector_store_id": "vs_abc123",
  "status": "in_progress",
  "file_counts": {
    "in_progress": 1,
    "completed": 1,
    "failed": 0,
    "cancelled": 0,
    "total": 2
  }
}

Get - Vector Store File Batch

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-03-01-preview

Retrieves a vector store file batch.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store that the file batch belongs to.
batch_id path Yes string The ID of the file batch being retrieved.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileBatchObject

Examples

Example

Retrieves a vector store file batch.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "in_progress",
    "file_counts": {
      "in_progress": 1,
      "completed": 1,
      "failed": 0,
      "cancelled": 0,
      "total": 2
    }
  }
}
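File batches are processed asynchronously, so callers typically poll Get - Vector Store File Batch until the status leaves in_progress. A sketch, where `get_batch` stands in for the real GET call:

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

def wait_for_batch(get_batch, poll_interval=0.0, max_polls=60):
    """Poll until the batch reaches a terminal status; return the final object.

    get_batch() stands in for GET
    /vector_stores/{vector_store_id}/file_batches/{batch_id}.
    """
    for _ in range(max_polls):
        batch = get_batch()
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        time.sleep(poll_interval)
    raise TimeoutError("file batch did not finish within the polling budget")

# Simulated status sequence to exercise the loop.
statuses = iter(["in_progress", "in_progress", "completed"])
final = wait_for_batch(lambda: {"id": "vsfb_abc123", "status": next(statuses)})
```

In production, use a nonzero `poll_interval` (for example a few seconds) to avoid hammering the endpoint.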

Cancel - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-03-01-preview

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store that the file batch belongs to.
batch_id path Yes string The ID of the file batch to cancel.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json vectorStoreFileBatchObject

Examples

Example

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "cancelling",
    "file_counts": {
      "in_progress": 12,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 15
    }
  }
}

List - Vector Store File Batch Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-03-01-preview

Returns a list of vector store files in a batch.

URI Parameters

Name In Required Type Description
endpoint path Yes string
url
Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id path Yes string The ID of the vector store that the files belong to.
batch_id path Yes string The ID of the file batch that the files belong to.
limit query No integer A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
Sort order by the created_at timestamp of the objects. asc for ascending order and desc for descending order.
after query No string A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before query No string A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
filter query No string
Possible values: in_progress, completed, failed, cancelled
Filter by file status. One of in_progress, completed, failed, cancelled.
api-version query Yes string api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-03-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}

Responses API - Create

POST https://{endpoint}/openai/responses?api-version=2025-03-01-preview

Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code or use built-in tools like file search to use your own data as input for the model's response.

Request Body

Content-Type: application/json

Name Type Description Required Default
include array No
input string or array Text, image, or file inputs to the model, used to generate a response.

Learn more:
- Text inputs and outputs
- Image inputs
- File inputs
- Conversation state
- function calling
Yes
instructions string Inserts a system (or developer) message as the first item in the model's context.

When used along with previous_response_id, the instructions from a previous
response will not be carried over to the next response. This makes it simple
to swap out system (or developer) messages in new responses.
No
max_output_tokens integer An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.
No
parallel_tool_calls boolean Whether to allow the model to run tool calls in parallel.
No True
previous_response_id string The unique ID of the previous response to the model. Use this to
create multi-turn conversations. Learn more about
Conversation state.
No
reasoning Reasoning o-series models only

Configuration options for
reasoning models.
No
store boolean Whether to store the generated model response for later retrieval via
API.
No True
stream boolean If set to true, the model response data will be streamed to the client
as it is generated using server-sent events.
No False
text object Configuration options for a text response from the model. Can be plain
text or structured JSON data. Learn more:
- Text inputs and outputs
- Structured Outputs
No
└─ format TextResponseFormatConfiguration An object specifying the format that the model must output.

Configuring { "type": "json_schema" } enables Structured Outputs,
which ensures the model will match your supplied JSON schema.

The default format is { "type": "text" } with no additional options.

Not recommended for gpt-4o and newer models:

Setting to { "type": "json_object" } enables the older JSON mode, which
ensures the message the model generates is valid JSON. Using json_schema
is preferred for models that support it.
No
tool_choice ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction How the model should select which tool (or tools) to use when generating
a response. See the tools parameter to see how to specify which tools
the model can call.
No
tools array An array of tools the model may call while generating a response. You
can specify which tool to use by setting the tool_choice parameter.

The two categories of tools you can provide the model are:

- Built-in tools: Tools that are provided by OpenAI that extend the
model's capabilities, such as file search.
- Function calls (custom tools): Functions that are defined by you,
enabling the model to call your own code. Learn more about
function calling.
No
truncation enum The truncation strategy to use for the model response.
- auto: If the context of this response and previous ones exceeds
the model's context window size, the model will truncate the
response to fit the context window by dropping input items in the
middle of the conversation.
- disabled (default): If a model response will exceed the context window
size for a model, the request will fail with a 400 error.

Possible values: auto, disabled
No

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json response
text/event-stream responseStreamEvent

Status Code: default

Description: Service unavailable

Content-Type Type Description
application/json errorResponse
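The previous_response_id parameter described above chains turns into a multi-turn conversation. A sketch of building the two request bodies for a first turn and a follow-up (the response ID value is illustrative):

```python
def build_response_body(input_text, previous_response_id=None, instructions=None):
    """Build a Responses API request body; omitted optional fields keep their defaults."""
    body = {"input": input_text}
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    if instructions is not None:
        body["instructions"] = instructions
    return body

first = build_response_body("Define a vector store.",
                            instructions="Answer in one sentence.")
# The follow-up turn references the ID returned by the first call
# (the value "resp_abc123" is illustrative).
follow_up = build_response_body("Now give an example.",
                                previous_response_id="resp_abc123")
```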

Responses API - Get

GET https://{endpoint}/openai/responses/{response_id}?api-version=2025-03-01-preview

Retrieves a model response with the given ID.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
response_id path Yes string The ID of the response to retrieve.
include query No array Additional fields to include in the response. See the include
parameter for Response creation above for more information.

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json response

Status Code: default

Description: Service unavailable

Content-Type Type Description
application/json errorResponse
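
Putting the URI parameters and headers together, a retrieval call can be sketched as follows. The resource name, response ID, and token are placeholders; send the resulting request with any HTTP client.

```python
from urllib.parse import quote

# Placeholder values - substitute your own resource name and response ID.
endpoint = "https://your-resource-name.openai.azure.com"
response_id = "resp_abc123"
api_version = "2025-03-01-preview"

# GET https://{endpoint}/openai/responses/{response_id}?api-version=...
url = f"{endpoint}/openai/responses/{quote(response_id)}?api-version={api_version}"

# Token-based authentication (recommended):
headers = {"Authorization": "Bearer YOUR_AUTH_TOKEN"}
# ...or API key authentication:
# headers = {"api-key": "YOUR_API_KEY"}
```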

Responses API - Delete

DELETE https://{endpoint}/openai/responses/{response_id}?api-version=2025-03-01-preview

Deletes a model response with the given ID.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
response_id path Yes string The ID of the response to delete.

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Status Code: 404

Description: Not Found

Content-Type Type Description
application/json error

Status Code: default

Description: Service unavailable

Content-Type Type Description
application/json errorResponse

Responses API - input items

GET https://{endpoint}/openai/responses/{response_id}/input_items?api-version=2025-03-01-preview

Returns a list of input items for a given response.

URI Parameters

Name In Required Type Description
endpoint path Yes string (url) Supported Azure OpenAI endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
response_id path Yes string The ID of the response to retrieve input items for.
limit query No integer A limit on the number of objects to be returned. Limit can range between
1 and 100, and the default is 20.
order query No string
Possible values: asc, desc
The order to return the input items in. Default is asc.
- asc: Return the input items in ascending order.
- desc: Return the input items in descending order.
after query No string An item ID to list items after, used in pagination.
before query No string An item ID to list items before, used in pagination.

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name Required Type Description
Authorization True string Example: Authorization: Bearer {Azure_OpenAI_Auth_Token}

To generate an auth token using Azure CLI: az account get-access-token --resource https://cognitiveservices.azure.com

Type: oauth2
Authorization Url: https://login.microsoftonline.com/common/oauth2/v2.0/authorize
scope: https://cognitiveservices.azure.com/.default
api-key True string Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type Type Description
application/json responseItemList
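
The query parameters above support cursor-style pagination. A minimal sketch of building the list URL (resource name and IDs are placeholder values):

```python
from urllib.parse import urlencode

# Placeholder values for illustration only.
endpoint = "https://your-resource-name.openai.azure.com"
response_id = "resp_abc123"

params = {
    "api-version": "2025-03-01-preview",
    "limit": 50,          # 1-100; the default is 20
    "order": "desc",      # return newest input items first
    "after": "item_xyz",  # resume pagination after this item ID
}
url = f"{endpoint}/openai/responses/{response_id}/input_items?{urlencode(params)}"
```

To walk the full list, repeat the call with `after` set to the last item ID of each page until no further items are returned.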

Components

errorResponse

Name Type Description Required Default
error error No

errorBase

Name Type Description Required Default
code string No
message string No

error

Name Type Description Required Default
inner_error innerError Inner error with additional details. No
param string No
type string No

innerError

Inner error with additional details.

Name Type Description Required Default
code innerErrorCode Error codes for the inner error object. No
content_filter_results contentFilterPromptResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high-scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about jailbreak content and profanity, if it has been detected, and if it has been filtered or not. And information about customer block list, if it has been filtered and its id. No

innerErrorCode

Error codes for the inner error object.

Property Value
Description Error codes for the inner error object.
Type string
Values ResponsibleAIPolicyViolation

dalleErrorResponse

Name Type Description Required Default
error dalleError No

dalleError

Name Type Description Required Default
inner_error dalleInnerError Inner error with additional details. No
param string No
type string No

dalleInnerError

Inner error with additional details.

Name Type Description Required Default
code innerErrorCode Error codes for the inner error object. No
content_filter_results dalleFilterResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high-scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about jailbreak content and profanity, if it has been detected, and if it has been filtered or not. And information about customer block list, if it has been filtered and its id. No
revised_prompt string The prompt that was used to generate the image, if there was any revision to the prompt. No

contentFilterCompletionTextSpan

Describes a span within generated completion text. Offset 0 is the first UTF32 code point of the completion text.

Name Type Description Required Default
completion_end_offset integer Offset of the first UTF32 code point which is excluded from the span. This field is always equal to completion_start_offset for empty spans. This field is always larger than completion_start_offset for non-empty spans. Yes
completion_start_offset integer Offset of the UTF32 code point which begins the span. Yes
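
Because Python strings are indexed by Unicode code point, these UTF32 offsets map directly onto string slicing. A small illustration of the half-open span convention (the end offset is excluded; an empty span has equal offsets):

```python
# The globe emoji is a single UTF32 code point, so it occupies one index.
completion_text = "Hello 🌍 world"

# A span covering just the emoji: start offset 6, end offset 7 (excluded).
span = {"completion_start_offset": 6, "completion_end_offset": 7}
flagged = completion_text[span["completion_start_offset"]:span["completion_end_offset"]]
# flagged == "🌍"; a span with start == end selects the empty string.
```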

contentFilterResultBase

Name Type Description Required Default
filtered boolean Yes

contentFilterSeverityResult

Name Type Description Required Default
filtered boolean Yes
severity string No

contentFilterDetectedResult

Name Type Description Required Default
detected boolean No
filtered boolean Yes

contentFilterDetectedWithCitationResult

Name Type Description Required Default
citation object No
└─ URL string No
└─ license string No

contentFilterDetectedWithCompletionTextSpansResult

Name Type Description Required Default
details array No

contentFilterIdResult

Name Type Description Required Default
filtered boolean Yes
id string No

contentFilterResultsBase

Information about the content filtering results.

Name Type Description Required Default
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No
error errorBase No
hate contentFilterSeverityResult No
profanity contentFilterDetectedResult No
self_harm contentFilterSeverityResult No
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No

contentFilterPromptResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high-scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about jailbreak content and profanity, if it has been detected, and if it has been filtered or not. And information about customer block list, if it has been filtered and its id.

Name Type Description Required Default
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No
error errorBase No
hate contentFilterSeverityResult No
indirect_attack contentFilterDetectedResult No
jailbreak contentFilterDetectedResult No
profanity contentFilterDetectedResult No
self_harm contentFilterSeverityResult No
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No

contentFilterChoiceResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high-scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about third party text and profanity, if it has been detected, and if it has been filtered or not. And information about customer block list, if it has been filtered and its id.

Name Type Description Required Default
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No
error errorBase No
hate contentFilterSeverityResult No
profanity contentFilterDetectedResult No
protected_material_code contentFilterDetectedWithCitationResult No
protected_material_text contentFilterDetectedResult No
self_harm contentFilterSeverityResult No
sexual contentFilterSeverityResult No
ungrounded_material contentFilterDetectedWithCompletionTextSpansResult No
violence contentFilterSeverityResult No

contentFilterDetailedResults

Content filtering results with a detail of content filter ids for the filtered segments.

Name Type Description Required Default
details array No
filtered boolean Yes

promptFilterResult

Content filtering results for a single prompt in the request.

Name Type Description Required Default
content_filter_results contentFilterPromptResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high-scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about jailbreak content and profanity, if it has been detected, and if it has been filtered or not. And information about customer block list, if it has been filtered and its id. No
prompt_index integer No

promptFilterResults

Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.

No properties defined for this component.

dalleContentFilterResults

Information about the content filtering results.

Name Type Description Required Default
hate contentFilterSeverityResult No
self_harm contentFilterSeverityResult No
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No

dalleFilterResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high-scale that determines the intensity and risk level of harmful content) and if it has been filtered or not. Information about jailbreak content and profanity, if it has been detected, and if it has been filtered or not. And information about customer block list, if it has been filtered and its id.

Name Type Description Required Default
custom_blocklists contentFilterDetailedResults Content filtering results with a detail of content filter ids for the filtered segments. No
hate contentFilterSeverityResult No
jailbreak contentFilterDetectedResult No
profanity contentFilterDetectedResult No
self_harm contentFilterSeverityResult No
sexual contentFilterSeverityResult No
violence contentFilterSeverityResult No

chatCompletionsRequestCommon

Name Type Description Required Default
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. No 0
logit_bias object Modify the likelihood of specified tokens appearing in the completion. Accepts a json object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. No
max_completion_tokens integer An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is currently only supported in o1 series models; support will be expanded to other models in a future API release. No
max_tokens integer The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return will be (4096 - prompt tokens). This is not compatible with o1 series models. No 4096
metadata object Developer-defined tags and values used for filtering completions in the stored completions dashboard. No
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. No 0
stop string or array Up to 4 sequences where the API will stop generating further tokens. No
store boolean Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. No
stream boolean If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. No False
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
No 1
user string A unique identifier representing your end-user, which can help Azure OpenAI to monitor and detect abuse. No

createCompletionRequest

Name Type Description Required Default
best_of integer Generates best_of completions server-side and returns the "best" (the one with the highest log probability per token). Results cannot be streamed.

When used with n, best_of controls the number of candidate completions and n specifies how many to return. best_of must be greater than n.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
No 1
echo boolean Echo back the prompt in addition to the completion
No False
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
No 0
logit_bias object Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this tokenizer tool to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

As an example, you can pass {"50256": -100} to prevent the <|endoftext|> token from being generated.
No None
logprobs integer Include the log probabilities on the logprobs most likely output tokens, as well the chosen tokens. For example, if logprobs is 5, the API will return a list of the 5 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response.

The maximum value for logprobs is 5.
No None
max_tokens integer The maximum number of tokens that can be generated in the completion.

The token count of your prompt plus max_tokens cannot exceed the model's context length. Example Python code for counting tokens.
No 16
n integer How many completions to generate for each prompt.

Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
No 1
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
No 0
prompt string or array The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays.

Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document.
Yes
seed integer If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.

Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
No
stop string or array Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
No
stream boolean Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Example Python code.
No False
suffix string The suffix that comes after a completion of inserted text.

This parameter is only supported for gpt-3.5-turbo-instruct.
No None
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
user string A unique identifier representing your end-user, which can help to monitor and detect abuse.
No
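
An illustrative createCompletionRequest body combining several of the parameters above, including the `{"50256": -100}` logit_bias example. All values are placeholders chosen to demonstrate the constraints described in the table.

```python
import json

# Hypothetical completion request body; values are placeholders.
body = {
    "prompt": "Say this is a test",
    "max_tokens": 16,      # the default
    "temperature": 1,      # alter this or top_p, not both
    "n": 1,
    "best_of": 2,          # must be greater than n when both are set
    "logprobs": 5,         # the maximum allowed value
    "stop": ["\n"],        # up to 4 stop sequences
    "logit_bias": {"50256": -100},  # prevent <|endoftext|> from being generated
    "user": "user-1234",
}
payload = json.dumps(body)
```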

createCompletionResponse

Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).

Name Type Description Required Default
choices array The list of completion choices the model generated for the input prompt. Yes
created integer The Unix timestamp (in seconds) of when the completion was created. Yes
id string A unique identifier for the completion. Yes
model string The model used for completion. Yes
object enum The object type, which is always "text_completion"
Possible values: text_completion
Yes
prompt_filter_results promptFilterResults Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders. No
system_fingerprint string This fingerprint represents the backend configuration that the model runs with.

Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
No
usage completionUsage Usage statistics for the completion request. No

createChatCompletionRequest

Name Type Description Required Default
audio object Parameters for audio output. Required when audio output is requested with
modalities: ["audio"].
No
└─ format enum Specifies the output audio format. Must be one of wav, mp3, flac,
opus, or pcm16.

Possible values: wav, mp3, flac, opus, pcm16
No
└─ voice enum Specifies the voice type. Supported voices are alloy, echo,
fable, onyx, nova, and shimmer.

Possible values: alloy, echo, fable, onyx, nova, shimmer
No
data_sources array The configuration entries for Azure OpenAI chat extensions that use them.
This additional specification is only compatible with Azure OpenAI.
No
frequency_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
No 0
function_call string or chatCompletionFunctionCallOption Deprecated in favor of tool_choice.

Controls which (if any) function is called by the model.
none means the model will not call a function and instead generates a message.
auto means the model can pick between generating a message or calling a function.
Specifying a particular function via {"name": "my_function"} forces the model to call that function.

none is the default when no functions are present. auto is the default if functions are present.
No
functions array Deprecated in favor of tools.

A list of functions the model may generate JSON inputs for.
No
logit_bias object Modify the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
No None
logprobs boolean Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message. No False
max_completion_tokens integer An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is currently only supported in o1 series models; support will be expanded to other models in a future API release. No
max_tokens integer The maximum number of tokens that can be generated in the chat completion.

The total length of input tokens and generated tokens is limited by the model's context length. Example Python code for counting tokens.
No
messages array A list of messages comprising the conversation so far. Example Python code. Yes
metadata object Developer-defined tags and values used for filtering completions in the stored completions dashboard. No
modalities ChatCompletionModalities Output types that you would like the model to generate for this request.
Most models are capable of generating text, which is the default:

["text"]

The gpt-4o-audio-preview model can also be used to generate audio. To
request that this model generate both text and audio responses, you can
use:

["text", "audio"]
No
n integer How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs. No 1
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
prediction PredictionContent Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content. No
presence_penalty number Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
No 0
reasoning_effort enum o1 models only

Constrains effort on reasoning for
reasoning models.

Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Possible values: low, medium, high
No
response_format ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which guarantees the model will match your supplied JSON schema.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
seed integer This feature is in Beta.
If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the backend.
No
stop string or array Up to 4 sequences where the API will stop generating further tokens.
No
store boolean Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products. No
stream boolean If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. Example Python code.
No False
stream_options chatCompletionStreamOptions Options for streaming response. Only set this when you set stream: true.
No None
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.
No 1
tool_choice chatCompletionToolChoiceOption Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present. No
tools array A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
No
top_logprobs integer An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used. No
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
user string A unique identifier representing your end-user, which can help to monitor and detect abuse.
No
user_security_context userSecurityContext User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud. No
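
The following sketch combines messages, a `json_schema` response format, and a user_security_context. The schema and identifiers are placeholder values; as the response_format description notes, when requesting JSON output you should also instruct the model to produce JSON in a system or user message.

```python
import json

# Hypothetical chat completion request body; values are placeholders.
body = {
    "messages": [
        {"role": "system", "content": "Extract the event as JSON."},
        {"role": "user", "content": "Team standup on Friday at 9am."},
    ],
    "response_format": {
        "type": "json_schema",  # Structured Outputs: output must match the schema
        "json_schema": {
            "name": "event",
            "schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "day": {"type": "string"},
                },
                "required": ["title", "day"],
            },
        },
    },
    "user_security_context": {
        "application_name": "contoso-chat",  # no sensitive personal information
        "end_user_id": "00000000-0000-0000-0000-000000000000",  # Entra user object ID
        "source_ip": "203.0.113.7",
    },
}
payload = json.dumps(body)
```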

userSecurityContext

User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud.

Name Type Description Required Default
application_name string The name of the application. Sensitive personal information should not be included in this field. No
end_user_id string This identifier is the Microsoft Entra ID (formerly Azure Active Directory) user object ID used to authenticate end-users within the generative AI application. Sensitive personal information should not be included in this field. No
end_user_tenant_id string The Microsoft 365 tenant ID the end user belongs to. It's required when the generative AI application is multitenant. No
source_ip string Captures the original client's IP address, accepting both IPv4 and IPv6 formats. No

chatCompletionFunctions

Name Type Description Required Default
description string A description of what the function does, used by the model to choose when and how to call the function. No
name string The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. Yes
parameters FunctionParameters The parameters the function accepts, described as a JSON Schema object. See the [guide](https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.
No

chatCompletionFunctionCallOption

Specifying a particular function via {"name": "my_function"} forces the model to call that function.

Name Type Description Required Default
name string The name of the function to call. Yes

chatCompletionFunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

No properties defined for this component.

chatCompletionRequestMessage

This component can be one of the following:

ChatCompletionRequestDeveloperMessage

Developer-provided instructions that the model should follow, regardless of messages sent by the user. With o1 models and newer, developer messages replace the previous system messages.

Name Type Description Required Default
content string or array The contents of the developer message. Yes
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No
role enum The role of the message's author, in this case developer.
Possible values: developer
Yes

chatCompletionRequestSystemMessage

Name Type Description Required Default
content string or array The contents of the system message. Yes
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No
role enum The role of the message's author, in this case system.
Possible values: system
Yes

chatCompletionRequestUserMessage

Name Type Description Required Default
content string or array The contents of the user message.
Yes
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No
role enum The role of the message's author, in this case user.
Possible values: user
Yes

chatCompletionRequestAssistantMessage

Name Type Description Required Default
content string or array The contents of the assistant message. Required unless tool_calls or function_call is specified.
No
function_call object Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. No
└─ arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. No
└─ name string The name of the function to call. No
name string An optional name for the participant. Provides the model information to differentiate between participants of the same role. No
refusal string The refusal message by the assistant. No
role enum The role of the message's author, in this case assistant.
Possible values: assistant
Yes
tool_calls chatCompletionMessageToolCalls The tool calls generated by the model, such as function calls. No

chatCompletionRequestToolMessage

Name Type Description Required Default
content string or array The contents of the tool message. Yes
role enum The role of the message's author, in this case tool.
Possible values: tool
Yes
tool_call_id string Tool call that this message is responding to. Yes

chatCompletionRequestFunctionMessage

Name Type Description Required Default
content string The contents of the function message. Yes
name string The name of the function to call. Yes
role enum The role of the message's author, in this case function.
Possible values: function
Yes
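To illustrate how the message types above combine, here is a sketch of a `messages` array containing a developer message, a user message, an assistant message carrying a tool call, and the matching tool message; the function name, arguments, and call ID are hypothetical:

```python
# A hypothetical chat completions messages array; "get_weather" and
# "call_abc123" are placeholder values, not real API outputs.
messages = [
    {"role": "developer", "content": "Answer concisely."},
    {"role": "user", "content": "What's the weather in Seattle?"},
    {
        "role": "assistant",
        "content": None,  # content may be omitted when tool_calls is present
        "tool_calls": [
            {
                "id": "call_abc123",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    # arguments arrive as a JSON-encoded string
                    "arguments": '{"location": "Seattle"}',
                },
            }
        ],
    },
    {
        # the tool message answers the tool call via tool_call_id
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": '{"temperature_c": 12}',
    },
]
```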

chatCompletionRequestDeveloperMessageContentPart

This component can be one of the following:

chatCompletionRequestSystemMessageContentPart

This component can be one of the following:

chatCompletionRequestUserMessageContentPart

This component can be one of the following:

chatCompletionRequestAssistantMessageContentPart

This component can be one of the following:

chatCompletionRequestToolMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText

Name Type Description Required Default
text string The text content. Yes
type enum The type of the content part.
Possible values: text
Yes

chatCompletionRequestMessageContentPartAudio

Name Type Description Required Default
input_audio object Yes
└─ data string Base64 encoded audio data. No
└─ format enum The format of the encoded audio data. Currently supports "wav" and "mp3".

Possible values: wav, mp3
No
type enum The type of the content part. Always input_audio.
Possible values: input_audio
Yes

chatCompletionRequestMessageContentPartImage

Name Type Description Required Default
image_url object Yes
└─ detail enum Specifies the detail level of the image. Learn more in the Vision guide.
Possible values: auto, low, high
No
└─ url string Either a URL of the image or the base64 encoded image data. No
type enum The type of the content part.
Possible values: image_url
Yes

chatCompletionRequestMessageContentPartRefusal

Name Type Description Required Default
refusal string The refusal message generated by the model. Yes
type enum The type of the content part.
Possible values: refusal
Yes
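A user message can mix the content part types above in a single `content` array. As a sketch, with a placeholder image URL and placeholder audio bytes:

```python
import base64

# Hypothetical example: the image URL and raw audio bytes are placeholders.
audio_b64 = base64.b64encode(b"...raw wav bytes...").decode("ascii")

user_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image and transcribe the audio."},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/photo.png", "detail": "auto"},
        },
        {
            "type": "input_audio",
            "input_audio": {"data": audio_b64, "format": "wav"},
        },
    ],
}
```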

azureChatExtensionConfiguration

A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Discriminator for azureChatExtensionConfiguration

This component uses the property type to discriminate between different types:

Name Type Description Required Default
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes

azureChatExtensionType

A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Property Value
Description A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Type string
Values azure_search
azure_cosmos_db
elasticsearch
mongo_db
pinecone

azureSearchChatExtensionConfiguration

A specific representation of configurable options for Azure Search when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
parameters azureSearchChatExtensionParameters Parameters for Azure Search when used as an Azure OpenAI chat extension. No
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes

azureSearchChatExtensionParameters

Parameters for Azure Search when used as an Azure OpenAI chat extension.

Name Type Description Required Default
allow_partial_result boolean If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified, or set to false, the request fails if any search query fails. No False
authentication onYourDataApiKeyAuthenticationOptions or onYourDataSystemAssignedManagedIdentityAuthenticationOptions or onYourDataUserAssignedManagedIdentityAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions Yes
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataIntegratedVectorizationSource No
endpoint string The absolute endpoint path for the Azure Search resource to use. Yes
fields_mapping azureSearchIndexFieldMappingOptions Optional settings to control how fields are processed when using a configured Azure Search resource. No
filter string Search filter. No
in_scope boolean Whether queries should be restricted to use of indexed data. No
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No
index_name string The name of the index to use as available in the referenced Azure Search resource. Yes
max_search_queries integer The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. No
query_type azureSearchQueryType The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension. No
semantic_configuration string The additional semantic configuration for the query. No
strictness integer The configured strictness of the search relevance filtering. Higher strictness increases precision but reduces the recall of the answer. No
top_n_documents integer The configured top number of documents to feature for the configured query. No
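Putting the parameters and field mappings together, a sketch of an `azure_search` entry as it might appear in a request's `data_sources` array; the endpoint, index name, and field names are placeholders:

```python
# Hypothetical azureSearchChatExtensionConfiguration; all names are placeholders.
azure_search_source = {
    "type": "azure_search",
    "parameters": {
        "endpoint": "https://YOUR_SEARCH_RESOURCE.search.windows.net",
        "index_name": "YOUR_INDEX",
        # any of the four supported authentication option shapes works here
        "authentication": {"type": "system_assigned_managed_identity"},
        "query_type": "vector_semantic_hybrid",
        "semantic_configuration": "default",
        "in_scope": True,
        "top_n_documents": 5,
        "strictness": 3,
        "fields_mapping": {
            "content_fields": ["content"],
            "title_field": "title",
            "url_field": "url",
            "vector_fields": ["contentVector"],
        },
    },
}
```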

azureSearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Search resource.

Name Type Description Required Default
content_fields array The names of index fields that should be treated as content. No
content_fields_separator string The separator pattern that content fields should use. No
filepath_field string The name of the index field to use as a filepath. No
image_vector_fields array The names of fields that represent image vector data. No
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
vector_fields array The names of fields that represent vector data. No

azureSearchQueryType

The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Property Value
Description The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.
Type string
Values simple
semantic
vector
vector_simple_hybrid
vector_semantic_hybrid

azureCosmosDBChatExtensionConfiguration

A specific representation of configurable options for Azure Cosmos DB when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
parameters azureCosmosDBChatExtensionParameters Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for
MongoDB vCore.
No
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes

azureCosmosDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore.

Name Type Description Required Default
allow_partial_result boolean If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified, or set to false, the request fails if any search query fails. No False
authentication onYourDataConnectionStringAuthenticationOptions The authentication options for Azure OpenAI On Your Data when using a connection string. Yes
container_name string The name of the Azure Cosmos DB resource container. Yes
database_name string The MongoDB vCore database name to use with Azure Cosmos DB. Yes
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource Yes
fields_mapping azureCosmosDBFieldMappingOptions Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource. Yes
in_scope boolean Whether queries should be restricted to use of indexed data. No
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No
index_name string The MongoDB vCore index name to use with Azure Cosmos DB. Yes
max_search_queries integer The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. No
strictness integer The configured strictness of the search relevance filtering. Higher strictness increases precision but reduces the recall of the answer. No
top_n_documents integer The configured top number of documents to feature for the configured query. No
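Note that `authentication`, `embedding_dependency`, and `fields_mapping` are all required for this extension type. A sketch, with every value a placeholder:

```python
# Hypothetical azureCosmosDBChatExtensionConfiguration; values are placeholders.
cosmos_source = {
    "type": "azure_cosmos_db",
    "parameters": {
        # Cosmos DB for MongoDB vCore only supports connection-string auth
        "authentication": {
            "type": "connection_string",
            "connection_string": "YOUR_CONNECTION_STRING",
        },
        "database_name": "YOUR_DB",
        "container_name": "YOUR_CONTAINER",
        "index_name": "YOUR_INDEX",
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "YOUR_EMBEDDING_DEPLOYMENT",
        },
        "fields_mapping": {
            "content_fields": ["content"],
            "vector_fields": ["contentVector"],
        },
    },
}
```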

azureCosmosDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource.

Name Type Description Required Default
content_fields array The names of index fields that should be treated as content. Yes
content_fields_separator string The separator pattern that content fields should use. No
filepath_field string The name of the index field to use as a filepath. No
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
vector_fields array The names of fields that represent vector data. Yes

elasticsearchChatExtensionConfiguration

A specific representation of configurable options for Elasticsearch when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
parameters elasticsearchChatExtensionParameters Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension. No
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes

elasticsearchChatExtensionParameters

Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension.

Name Type Description Required Default
allow_partial_result boolean If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified, or set to false, the request fails if any search query fails. No False
authentication onYourDataKeyAndKeyIdAuthenticationOptions or onYourDataEncodedApiKeyAuthenticationOptions Yes
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataModelIdVectorizationSource No
endpoint string The endpoint of Elasticsearch®. Yes
fields_mapping elasticsearchIndexFieldMappingOptions Optional settings to control how fields are processed when using a configured Elasticsearch® resource. No
in_scope boolean Whether queries should be restricted to use of indexed data. No
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No
index_name string The index name of Elasticsearch®. Yes
max_search_queries integer The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. No
query_type elasticsearchQueryType The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension. No
strictness integer The configured strictness of the search relevance filtering. Higher strictness increases precision but reduces the recall of the answer. No
top_n_documents integer The configured top number of documents to feature for the configured query. No
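As a sketch, an `elasticsearch` data source entry; the endpoint, index, credentials, and model ID are placeholders:

```python
# Hypothetical elasticsearchChatExtensionConfiguration; values are placeholders.
elasticsearch_source = {
    "type": "elasticsearch",
    "parameters": {
        "endpoint": "https://YOUR_CLUSTER.es.example.com:9200",
        "index_name": "YOUR_INDEX",
        # or {"type": "key_and_key_id", "key": ..., "key_id": ...}
        "authentication": {
            "type": "encoded_api_key",
            "encoded_api_key": "YOUR_ENCODED_KEY",
        },
        "query_type": "vector",
        # model_id vectorization is only supported by Elasticsearch
        "embedding_dependency": {"type": "model_id", "model_id": "YOUR_MODEL_ID"},
    },
}
```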

elasticsearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Elasticsearch® resource.

Name Type Description Required Default
content_fields array The names of index fields that should be treated as content. No
content_fields_separator string The separator pattern that content fields should use. No
filepath_field string The name of the index field to use as a filepath. No
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
vector_fields array The names of fields that represent vector data. No

elasticsearchQueryType

The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Property Value
Description The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.
Type string
Values simple
vector

mongoDBChatExtensionConfiguration

A specific representation of configurable options for Mongo DB when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
parameters mongoDBChatExtensionParameters Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB. No
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes

mongoDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB.

Name Type Description Required Default
allow_partial_result boolean If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified, or set to false, the request fails if any search query fails. No False
app_name string The name of the Mongo DB Application. Yes
authentication onYourDataUsernameAndPasswordAuthenticationOptions The authentication options for Azure OpenAI On Your Data when using a username and a password. Yes
collection_name string The name of the Mongo DB Collection. Yes
database_name string The name of the Mongo DB database. Yes
embedding_dependency onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource Yes
endpoint string The name of the Mongo DB cluster endpoint. Yes
fields_mapping mongoDBFieldMappingOptions Optional settings to control how fields are processed when using a configured Mongo DB resource. Yes
in_scope boolean Whether queries should be restricted to use of indexed data. No
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No
index_name string The name of the Mongo DB index. Yes
max_search_queries integer The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. No
strictness integer The configured strictness of the search relevance filtering. Higher strictness increases precision but reduces the recall of the answer. No
top_n_documents integer The configured top number of documents to feature for the configured query. No
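A sketch of a `mongo_db` data source entry; every value is a placeholder. Note that this extension type only supports username-and-password authentication:

```python
# Hypothetical mongoDBChatExtensionConfiguration; values are placeholders.
mongodb_source = {
    "type": "mongo_db",
    "parameters": {
        "endpoint": "YOUR_CLUSTER_ENDPOINT",
        "app_name": "YOUR_APP",
        "database_name": "YOUR_DB",
        "collection_name": "YOUR_COLLECTION",
        "index_name": "YOUR_INDEX",
        "authentication": {
            "type": "username_and_password",
            "username": "YOUR_USER",
            "password": "YOUR_PASSWORD",
        },
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "YOUR_EMBEDDING_DEPLOYMENT",
        },
        "fields_mapping": {
            "content_fields": ["content"],
            "vector_fields": ["contentVector"],
        },
    },
}
```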

mongoDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Mongo DB resource.

Name Type Description Required Default
content_fields array The names of index fields that should be treated as content. Yes
content_fields_separator string The separator pattern that content fields should use. No
filepath_field string The name of the index field to use as a filepath. No
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No
vector_fields array The names of fields that represent vector data. Yes

pineconeChatExtensionConfiguration

A specific representation of configurable options for Pinecone when using it as an Azure OpenAI chat extension.

Name Type Description Required Default
parameters pineconeChatExtensionParameters Parameters for configuring Azure OpenAI Pinecone chat extensions. No
type azureChatExtensionType A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat
completions request that should use Azure OpenAI chat extensions to augment the response behavior.
The use of this configuration is compatible only with Azure OpenAI.
Yes

pineconeChatExtensionParameters

Parameters for configuring Azure OpenAI Pinecone chat extensions.

Name Type Description Required Default
allow_partial_result boolean If set to true, the system allows partial search results to be used, and the request fails only if all of the queries fail. If not specified, or set to false, the request fails if any search query fails. No False
authentication onYourDataApiKeyAuthenticationOptions The authentication options for Azure OpenAI On Your Data when using an API key. Yes
embedding_dependency onYourDataDeploymentNameVectorizationSource The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource. Yes
environment string The environment name of Pinecone. Yes
fields_mapping pineconeFieldMappingOptions Optional settings to control how fields are processed when using a configured Pinecone resource. Yes
in_scope boolean Whether queries should be restricted to use of indexed data. No
include_contexts array The included properties of the output context. If not specified, the default value is citations and intent. No
index_name string The name of the Pinecone database index. Yes
max_search_queries integer The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system decides the number of queries to send. No
strictness integer The configured strictness of the search relevance filtering. Higher strictness increases precision but reduces the recall of the answer. No
top_n_documents integer The configured top number of documents to feature for the configured query. No
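A sketch of a `pinecone` data source entry; all values are placeholders. Pinecone only supports API-key authentication and a deployment-name embedding dependency:

```python
# Hypothetical pineconeChatExtensionConfiguration; values are placeholders.
pinecone_source = {
    "type": "pinecone",
    "parameters": {
        "environment": "YOUR_ENVIRONMENT",
        "index_name": "YOUR_INDEX",
        "authentication": {"type": "api_key", "key": "YOUR_PINECONE_KEY"},
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "YOUR_EMBEDDING_DEPLOYMENT",
        },
        "fields_mapping": {"content_fields": ["content"]},
    },
}
```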

pineconeFieldMappingOptions

Optional settings to control how fields are processed when using a configured Pinecone resource.

Name Type Description Required Default
content_fields array The names of index fields that should be treated as content. Yes
content_fields_separator string The separator pattern that content fields should use. No
filepath_field string The name of the index field to use as a filepath. No
title_field string The name of the index field to use as a title. No
url_field string The name of the index field to use as a URL. No

onYourDataAuthenticationOptions

The authentication options for Azure OpenAI On Your Data.

Discriminator for onYourDataAuthenticationOptions

This component uses the property type to discriminate between different types:

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataContextProperty

The context property.

Property Value
Description The context property.
Type string
Values citations
intent
all_retrieved_documents

onYourDataAuthenticationType

The authentication types supported with Azure OpenAI On Your Data.

Property Value
Description The authentication types supported with Azure OpenAI On Your Data.
Type string
Values api_key
connection_string
key_and_key_id
encoded_api_key
access_token
system_assigned_managed_identity
user_assigned_managed_identity
username_and_password

onYourDataApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an API key.

Name Type Description Required Default
key string The API key to use for authentication. No
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataConnectionStringAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a connection string.

Name Type Description Required Default
connection_string string The connection string to use for authentication. No
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataKeyAndKeyIdAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch key and key ID pair.

Name Type Description Required Default
key string The Elasticsearch key to use for authentication. No
key_id string The Elasticsearch key ID to use for authentication. No
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataEncodedApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch encoded API key.

Name Type Description Required Default
encoded_api_key string The Elasticsearch encoded API key to use for authentication. No
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataAccessTokenAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using access token.

Name Type Description Required Default
access_token string The access token to use for authentication. No
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataSystemAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a system-assigned managed identity.

Name Type Description Required Default
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataUserAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a user-assigned managed identity.

Name Type Description Required Default
managed_identity_resource_id string The resource ID of the user-assigned managed identity to use for authentication. No
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes

onYourDataUsernameAndPasswordAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a username and a password.

Name Type Description Required Default
password string The password to use for authentication. No
type onYourDataAuthenticationType The authentication types supported with Azure OpenAI On Your Data. Yes
username string The username to use for authentication. No
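Each of the authentication option shapes above is discriminated by its `type` value. A few sketches side by side; all keys, identities, and credentials are placeholders:

```python
# Hypothetical onYourData*AuthenticationOptions shapes; values are placeholders.
api_key_auth = {"type": "api_key", "key": "YOUR_KEY"}

# system-assigned managed identity needs no additional fields
system_mi_auth = {"type": "system_assigned_managed_identity"}

user_mi_auth = {
    "type": "user_assigned_managed_identity",
    "managed_identity_resource_id": "YOUR_MANAGED_IDENTITY_RESOURCE_ID",
}

username_password_auth = {
    "type": "username_and_password",
    "username": "YOUR_USER",
    "password": "YOUR_PASSWORD",
}
```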

onYourDataVectorizationSource

An abstract representation of a vectorization source for Azure OpenAI On Your Data with vector search.

Discriminator for onYourDataVectorizationSource

This component uses the property type to discriminate between different types:

Name Type Description Required Default
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes

onYourDataVectorizationSourceType

Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.

Property Value
Description Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Type string
Values endpoint
deployment_name
integrated
model_id

onYourDataEndpointVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a public Azure OpenAI endpoint call for embeddings.

Name Type Description Required Default
authentication onYourDataApiKeyAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions No
dimensions integer The number of dimensions the embeddings should have. Only supported in text-embedding-3 and later models. No
endpoint string Specifies the resource endpoint URL from which embeddings should be retrieved. It should be in the format of https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings. The api-version query parameter is not allowed. No
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes

onYourDataDeploymentNameVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource.

Name Type Description Required Default
deployment_name string Specifies the name of the model deployment to use for vectorization. This model deployment must be in the same Azure OpenAI resource, but On Your Data will use this model deployment via an internal call rather than a public one, which enables vector search even in private networks. No
dimensions integer The number of dimensions the embeddings should have. Only supported in text-embedding-3 and later models. No
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes

onYourDataIntegratedVectorizationSource

Represents the integrated vectorizer defined within the search resource.

Name Type Description Required Default
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes

onYourDataModelIdVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a search service model ID. Currently only supported by Elasticsearch®.

Name Type Description Required Default
model_id string Specifies the model ID to use for vectorization. This model ID must be defined in the search service. No
type onYourDataVectorizationSourceType Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with
vector search.
Yes
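The four vectorization source shapes above can be sketched as follows; endpoints, deployment names, and model IDs are placeholders:

```python
# Hypothetical onYourDataVectorizationSource shapes; values are placeholders.
endpoint_source = {
    "type": "endpoint",
    # no api-version query parameter is allowed on this URL
    "endpoint": "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings",
    "authentication": {"type": "api_key", "key": "YOUR_KEY"},
}

deployment_source = {
    "type": "deployment_name",
    "deployment_name": "YOUR_EMBEDDING_DEPLOYMENT",
    "dimensions": 1536,  # only supported by text-embedding-3 and later
}

integrated_source = {"type": "integrated"}

model_id_source = {"type": "model_id", "model_id": "YOUR_MODEL_ID"}  # Elasticsearch only
```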

azureChatExtensionsMessageContext

A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension.

Name Type Description Required Default
all_retrieved_documents array All the retrieved documents. No
citations array The data source retrieval result, used to generate the assistant message in the response. No
intent string The detected intent from the chat history, used to pass to the next turn to carry over the context. No

citation

Citation information for a chat completions response message.

Name Type Description Required Default
chunk_id string The chunk ID of the citation. No
content string The content of the citation. Yes
filepath string The file path of the citation. No
rerank_score number The rerank score of the retrieved document. No
title string The title of the citation. No
url string The URL of the citation. No

retrievedDocument

The retrieved document.

Name Type Description Required Default
chunk_id string The chunk ID of the citation. No
content string The content of the citation. Yes
data_source_index integer The index of the data source. No
filepath string The file path of the citation. No
filter_reason filterReason The filtering reason of the retrieved document. No
original_search_score number The original search score of the retrieved document. No
rerank_score number The rerank score of the retrieved document. No
search_queries array The search queries used to retrieve the document. No
title string The title of the citation. No
url string The URL of the citation. No

filterReason

The filtering reason of the retrieved document.

Property Value
Description The filtering reason of the retrieved document.
Type string
Values score
rerank

chatCompletionMessageToolCall

Name Type Description Required Default
function object The function that the model called. Yes
└─ arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. No
└─ name string The name of the function to call. No
id string The ID of the tool call. Yes
type toolCallType The type of the tool call, in this case function. Yes

toolCallType

The type of the tool call, in this case function.

Property Value
Description The type of the tool call, in this case function.
Type string
Values function

chatCompletionRequestMessageTool

Name Type Description Required Default
content string The contents of the message. No
tool_call_id string Tool call that this message is responding to. No

chatCompletionRequestMessageFunction

Name Type Description Required Default
content string The contents of the message. No
name string The contents of the message. No
role enum The role of the message's author, in this case function.
Possible values: function
No

createChatCompletionResponse

Represents a chat completion response returned by the model, based on the provided input.

Name Type Description Required Default
choices array A list of chat completion choices. Can be more than one if n is greater than 1. Yes
created integer The Unix timestamp (in seconds) of when the chat completion was created. Yes
id string A unique identifier for the chat completion. Yes
model string The model used for the chat completion. Yes
object enum The object type, which is always chat.completion.
Possible values: chat.completion
Yes
prompt_filter_results promptFilterResults Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders. No
system_fingerprint string This fingerprint represents the backend configuration that the model runs with.

Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
No
usage completionUsage Usage statistics for the completion request. No

createChatCompletionStreamResponse

Represents a streamed chunk of a chat completion response returned by the model, based on the provided input.

Name Type Description Required Default
choices array A list of chat completion choices. Can contain more than one element if n is greater than 1.
Yes
created integer The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp. Yes
id string A unique identifier for the chat completion. Each chunk has the same ID. Yes
model string The model used to generate the completion. Yes
object enum The object type, which is always chat.completion.chunk.
Possible values: chat.completion.chunk
Yes
system_fingerprint string This fingerprint represents the backend configuration that the model runs with.
Can be used in conjunction with the seed request parameter to understand when backend changes have been made that might impact determinism.
No

chatCompletionStreamResponseDelta

A chat completion delta generated by streamed model responses.

Name Type Description Required Default
content string The contents of the chunk message. No
function_call object Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. No
└─ arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. No
└─ name string The name of the function to call. No
refusal string The refusal message generated by the model. No
role enum The role of the author of this message.
Possible values: system, user, assistant, tool
No
tool_calls array The tool calls generated by the model, such as function calls. No

chatCompletionMessageToolCallChunk

Name Type Description Required Default
function object No
└─ arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. No
└─ name string The name of the function to call. No
id string The ID of the tool call. No
index integer Yes
type enum The type of the tool. Currently, only function is supported.
Possible values: function
No

chatCompletionStreamOptions

Options for streaming response. Only set this when you set stream: true.

Name Type Description Required Default
include_usage boolean If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value.
No
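With include_usage set, the chunk streamed just before the data: [DONE] message carries a populated usage field and an empty choices array. A sketch of consuming such a stream (the chunks below are simulated, not real service output):

```python
# Simulated stream chunks; a real stream arrives as server-sent events.
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}], "usage": None},
    {"choices": [{"delta": {"content": "lo"}}], "usage": None},
    # Final chunk before [DONE]: choices is empty, usage is populated.
    {"choices": [],
     "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11}},
]

text_parts, usage = [], None
for chunk in chunks:
    for choice in chunk["choices"]:
        text_parts.append(choice["delta"].get("content", ""))
    if chunk.get("usage") is not None:
        usage = chunk["usage"]

text = "".join(text_parts)
```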

chatCompletionChoiceLogProbs

Log probability information for the choice.

Name Type Description Required Default
content array A list of message content tokens with log probability information. Yes
refusal array A list of message refusal tokens with log probability information. No

chatCompletionTokenLogprob

Name Type Description Required Default
bytes array A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token. Yes
logprob number The log probability of this token. Yes
token string The token. Yes
top_logprobs array List of the most likely tokens and their log probability, at this token position. In rare cases, there may be fewer than the number of requested top_logprobs returned. Yes
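Because a single character can be split across multiple tokens, the bytes arrays must be concatenated before decoding, as the bytes description notes. A small sketch with hypothetical token data:

```python
# Two tokens whose UTF-8 bytes together spell "café"; logprobs are made up.
content = [
    {"token": "caf", "bytes": [99, 97, 102], "logprob": -0.1},
    {"token": "é", "bytes": [195, 169], "logprob": -0.3},
]

# Concatenate all byte values, then decode once as UTF-8.
raw = bytes(b for item in content for b in item["bytes"])
text = raw.decode("utf-8")

# Log probabilities of independent tokens add up.
total_logprob = sum(item["logprob"] for item in content)
```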

chatCompletionResponseMessage

A chat completion message generated by the model.

Name Type Description Required Default
audio object If the audio output modality is requested, this object contains data
about the audio response from the model.
No
└─ data string Base64 encoded audio bytes generated by the model, in the format
specified in the request.
No
└─ expires_at integer The Unix timestamp (in seconds) for when this audio response will
no longer be accessible on the server for use in multi-turn
conversations.
No
└─ id string Unique identifier for this audio response. No
└─ transcript string Transcript of the audio generated by the model. No
content string The contents of the message. Yes
context azureChatExtensionsMessageContext A representation of the additional context information available when Azure OpenAI chat extensions are involved
in the generation of a corresponding chat completions response. This context information is only populated when
using an Azure OpenAI request configured to use a matching extension.
No
function_call chatCompletionFunctionCall Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model. No
refusal string The refusal message generated by the model. Yes
role chatCompletionResponseMessageRole The role of the author of the response message. Yes
tool_calls array The tool calls generated by the model, such as function calls. No

chatCompletionResponseMessageRole

The role of the author of the response message.

Property Value
Description The role of the author of the response message.
Type string
Values assistant

chatCompletionToolChoiceOption

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

This component can be one of the following:

chatCompletionNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific function.

Name Type Description Required Default
function object Yes
└─ name string The name of the function to call. No
type enum The type of the tool. Currently, only function is supported.
Possible values: function
Yes
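The tool_choice field therefore accepts either one of three strings or a named-tool object. A small validator sketch covering both shapes (the function name is illustrative):

```python
def validate_tool_choice(choice):
    """Accept "none" | "auto" | "required" or a chatCompletionNamedToolChoice."""
    if choice in ("none", "auto", "required"):
        return choice
    if (
        isinstance(choice, dict)
        and choice.get("type") == "function"
        and choice.get("function", {}).get("name")
    ):
        return choice
    raise ValueError(f"invalid tool_choice: {choice!r}")

# Forcing a specific tool, as described above.
forced = validate_tool_choice({"type": "function", "function": {"name": "my_function"}})
```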

ParallelToolCalls

Whether to enable parallel function calling during tool use.

No properties defined for this component.

PredictionContent

Static predicted output content, such as the content of a text file that is being regenerated.

Name Type Description Required Default
content string or array The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly. Yes
type enum The type of the predicted content you want to provide. This type is currently always content.
Possible values: content
Yes
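A sketch of how predicted output content might appear in a chat completions request body, assuming the request-level field is named prediction as in the chat completions request schema (the messages and file text are placeholders):

```python
# Text being regenerated; most of it is expected to survive unchanged.
original_code = "def gretting():\n    return 'hello'\n"

request_body = {
    "messages": [
        {"role": "user", "content": "Rename the function to greeting."}
    ],
    # PredictionContent: type is always "content".
    "prediction": {"type": "content", "content": original_code},
}
```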

chatCompletionMessageToolCalls

The tool calls generated by the model, such as function calls.

No properties defined for this component.

ChatCompletionModalities

Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default:

["text"]

The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:

["text", "audio"]

No properties defined for this component.

chatCompletionFunctionCall

Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model.

Name Type Description Required Default
arguments string The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. Yes
name string The name of the function to call. Yes

completionUsage

Usage statistics for the completion request.

Name Type Description Required Default
completion_tokens integer Number of tokens in the generated completion. Yes
completion_tokens_details object Breakdown of tokens used in a completion. No
└─ accepted_prediction_tokens integer When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion. No
└─ audio_tokens integer Audio input tokens generated by the model. No
└─ reasoning_tokens integer Tokens generated by the model for reasoning. No
└─ rejected_prediction_tokens integer When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits. No
prompt_tokens integer Number of tokens in the prompt. Yes
prompt_tokens_details object Details of the prompt tokens. No
└─ audio_tokens integer Audio input tokens present in the prompt. No
└─ cached_tokens integer The number of cached prompt tokens. No
total_tokens integer Total number of tokens used in the request (prompt + completion). Yes
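The fields are related: total_tokens is the sum of prompt and completion tokens, and rejected prediction tokens are billed as completion tokens even though they never appear in the output. A sketch with illustrative numbers:

```python
usage = {
    "prompt_tokens": 120,
    "prompt_tokens_details": {"cached_tokens": 64, "audio_tokens": 0},
    "completion_tokens": 80,
    "completion_tokens_details": {
        "reasoning_tokens": 30,
        "accepted_prediction_tokens": 10,
        "rejected_prediction_tokens": 5,  # billed despite not appearing in output
        "audio_tokens": 0,
    },
    "total_tokens": 200,
}

# The invariant documented above: prompt + completion = total.
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
```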

chatCompletionTool

Name Type Description Required Default
function FunctionObject Yes
type enum The type of the tool. Currently, only function is supported.
Possible values: function
Yes

FunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide (https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.

No properties defined for this component.

FunctionObject

Name Type Description Required Default
description string A description of what the function does, used by the model to choose when and how to call the function. No
name string The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. Yes
parameters FunctionParameters The parameters the function accepts, described as a JSON Schema object. See the guide (https://learn.microsoft.com/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.
No
strict boolean Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. No False
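A sketch of a FunctionObject with a JSON Schema parameters block, plus a check of the documented name constraint (the function itself is hypothetical; with strict enabled, only a subset of JSON Schema is supported, so the schema here is kept simple):

```python
import re

get_weather = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "strict": True,
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city", "unit"],
        "additionalProperties": False,
    },
}

# Name must be a-z, A-Z, 0-9, underscores, or dashes, max 64 characters.
NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")
assert NAME_RE.match(get_weather["name"])
```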

ResponseFormatText

Name Type Description Required Default
type enum The type of response format being defined: text
Possible values: text
Yes

ResponseFormatJsonObject

Name Type Description Required Default
type enum The type of response format being defined: json_object
Possible values: json_object
Yes

ResponseFormatJsonSchemaSchema

The schema for the response format, described as a JSON Schema object.

No properties defined for this component.

ResponseFormatJsonSchema

Name Type Description Required Default
json_schema object Yes
└─ description string A description of what the response format is for, used by the model to determine how to respond in the format. No
└─ name string The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. No
└─ schema ResponseFormatJsonSchemaSchema The schema for the response format, described as a JSON Schema object. No
└─ strict boolean Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. No False
type enum The type of response format being defined: json_schema
Possible values: json_schema
Yes
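A sketch of a json_schema response format; the schema name and fields are illustrative:

```python
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "movie_review",  # hypothetical schema name
        "description": "A structured review of a film.",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "rating": {"type": "integer"},
            },
            "required": ["title", "rating"],
            "additionalProperties": False,
        },
    },
}
```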

chatCompletionChoiceCommon

Name Type Description Required Default
finish_reason string No
index integer No

createTranslationRequest

Translation request.

Name Type Description Required Default
file string The audio file to translate. Yes
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0

audioResponse

Translation or transcription response when response_format was json

Name Type Description Required Default
text string Translated or transcribed text. Yes

audioVerboseResponse

Translation or transcription response when response_format was verbose_json

Name Type Description Required Default
duration number Duration. No
language string Language. No
segments array No
task string Type of audio task. No
text string Translated or transcribed text. Yes
words array No

audioResponseFormat

Defines the format of the output.

Property Value
Description Defines the format of the output.
Type string
Values json
text
srt
verbose_json
vtt

createTranscriptionRequest

Transcription request.

Name Type Description Required Default
file string The audio file object to transcribe. Yes
language string The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. No
prompt string An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language. No
response_format audioResponseFormat Defines the format of the output. No
temperature number The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. No 0
timestamp_granularities[] array The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word, or segment. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency. No ['segment']
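Word-level timestamps require the verbose_json response format. A sketch of the multipart form fields for such a request, with the audio file itself omitted:

```python
# Form fields for a transcription with word and segment timestamps.
# The audio itself would be sent as the multipart "file" part.
fields = {
    "language": "en",                   # ISO-639-1; improves accuracy and latency
    "response_format": "verbose_json",  # required for timestamp granularities
    "timestamp_granularities[]": ["word", "segment"],
    "temperature": 0,
}

assert fields["response_format"] == "verbose_json", \
    "timestamp granularities require verbose_json"
```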

audioSegment

Transcription or translation segment.

Name Type Description Required Default
avg_logprob number Average log probability. No
compression_ratio number Compression ratio. No
end number Segment end offset. No
id integer Segment identifier. No
no_speech_prob number Probability of 'no speech'. No
seek number Offset of the segment. No
start number Segment start offset. No
temperature number Temperature. No
text string Segment text. No
tokens array Tokens of the text. No

audioWord

Transcription or translation word.

Name Type Description Required Default
end number Word end offset. No
start number Word start offset. No
word string The word. No

createSpeechRequest

Speech request.

Name Type Description Required Default
input string The text to synthesize audio for. The maximum length is 4096 characters. Yes
response_format enum The format to synthesize the audio in.
Possible values: mp3, opus, aac, flac, wav, pcm
No
speed number The speed of the synthesized audio. Select a value from 0.25 to 4.0. 1.0 is the default. No 1.0
voice enum The voice to use for speech synthesis.
Possible values: alloy, echo, fable, onyx, nova, shimmer
Yes
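A sketch of a speech request body, checked against the documented limits:

```python
speech_request = {
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
}

# Documented constraints: input at most 4096 characters, speed in [0.25, 4.0].
assert len(speech_request["input"]) <= 4096
assert 0.25 <= speech_request["speed"] <= 4.0
```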

imageQuality

The quality of the image that will be generated.

Property Value
Description The quality of the image that will be generated.
Type string
Default standard
Values standard
hd

imagesResponseFormat

The format in which the generated images are returned.

Property Value
Description The format in which the generated images are returned.
Type string
Default url
Values url
b64_json

imageSize

The size of the generated images.

Property Value
Description The size of the generated images.
Type string
Default 1024x1024
Values 256x256
512x512
1792x1024
1024x1792
1024x1024

imageStyle

The style of the generated images.

Property Value
Description The style of the generated images.
Type string
Default vivid
Values vivid
natural

imageGenerationsRequest

Name Type Description Required Default
n integer The number of images to generate. No 1
prompt string A text description of the desired image(s). The maximum length is 4000 characters. Yes
quality imageQuality The quality of the image that will be generated. No standard
response_format imagesResponseFormat The format in which the generated images are returned. No url
size imageSize The size of the generated images. No 1024x1024
style imageStyle The style of the generated images. No vivid
user string A unique identifier representing your end-user, which can help to monitor and detect abuse. No
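A sketch of an image generation request body, validated against the enums above (the prompt is illustrative):

```python
ALLOWED_SIZES = {"256x256", "512x512", "1792x1024", "1024x1792", "1024x1024"}

image_request = {
    "prompt": "A watercolor painting of a lighthouse at dusk",
    "n": 1,
    "size": "1024x1024",
    "quality": "hd",
    "style": "natural",
    "response_format": "b64_json",
}

# Documented constraints: size from the imageSize enum, prompt <= 4000 chars.
assert image_request["size"] in ALLOWED_SIZES
assert len(image_request["prompt"]) <= 4000
```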

generateImagesResponse

Name Type Description Required Default
created integer The Unix timestamp when the operation was created. Yes
data array The result data of the operation, if successful Yes

imageResult

The image URL or encoded image if successful, and an error otherwise.

Name Type Description Required Default
b64_json string The base64 encoded image No
content_filter_results dalleContentFilterResults Information about the content filtering results. No
prompt_filter_results dalleFilterResults Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, high, a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Also includes information about jailbreak content and profanity, whether it has been detected and whether it has been filtered, and about the customer block list, including its ID, if it has filtered content. No
revised_prompt string The prompt that was used to generate the image, if there was any revision to the prompt. No
url string The image URL. No

line

A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.

Name Type Description Required Default
spans array An array of spans that represent detected objects and their bounding box information. Yes
text string The text content of the line. Yes

span

A span object that represents a detected object and its bounding box information.

Name Type Description Required Default
length integer The length of the span in characters, measured in Unicode codepoints. Yes
offset integer The character offset within the text where the span begins. This offset is defined as the position of the first character of the span, counting from the start of the text as Unicode codepoints. Yes
polygon array An array of objects representing points in the polygon that encloses the detected object. Yes
text string The text content of the span that represents the detected object. Yes

runCompletionUsage

Usage statistics related to the run. This value will be null if the run is not in a terminal state (for example, while it is in_progress or queued).

Name Type Description Required Default
completion_tokens integer Number of completion tokens used over the course of the run. Yes
prompt_tokens integer Number of prompt tokens used over the course of the run. Yes
total_tokens integer Total number of tokens used (prompt + completion). Yes

runStepCompletionUsage

Usage statistics related to the run step. This value will be null while the run step's status is in_progress.

Name Type Description Required Default
completion_tokens integer Number of completion tokens used over the course of the run step. Yes
prompt_tokens integer Number of prompt tokens used over the course of the run step. Yes
total_tokens integer Total number of tokens used (prompt + completion). Yes

assistantsApiResponseFormatOption

Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

This component can be one of the following:

assistantsApiResponseFormat

An object describing the expected output of the model. If set to json_object, only function type tools are allowed to be passed to the Run. If set to text, the model can return text or any value needed.

Name Type Description Required Default
type string Must be one of text or json_object. No text

type Enum: AssistantsApiResponseFormat

Value Description
text
json_object
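Per the JSON-mode note above, the instruction to produce JSON must also appear in a system or user message, or the model may stream whitespace until it hits the token limit. A sketch (the messages are illustrative):

```python
# JSON mode: pair the response format with an explicit instruction to emit JSON.
response_format = {"type": "json_object"}
messages = [
    {"role": "system",
     "content": "You are a helpful assistant. Reply only with valid JSON."},
    {"role": "user", "content": "List three primary colors."},
]

# Guard against the documented failure mode: no JSON instruction in any message.
assert any("JSON" in m["content"] for m in messages)
```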

assistantObject

Represents an assistant that can call the model and use tools.

Name Type Description Required Default
created_at integer The Unix timestamp (in seconds) for when the assistant was created. Yes
description string The description of the assistant. The maximum length is 512 characters.
Yes
id string The identifier, which can be referenced in API endpoints. Yes
instructions string The system instructions that the assistant uses. The maximum length is 256,000 characters.
Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes
model string ID of the model to use. You can use the list models API to see all of your available models. Yes
name string The name of the assistant. The maximum length is 256 characters.
Yes
object string The object type, which is always assistant. Yes
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function.
Yes []
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1

object Enum: AssistantObjectType

Value Description
assistant The object type, which is always assistant

createAssistantRequest

Name Type Description Required Default
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 256,000 characters.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
model string Yes
name string The name of the assistant. The maximum length is 256 characters.
No
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
└─ vector_stores array A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function.
No []
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1

modifyAssistantRequest

Name Type Description Required Default
description string The description of the assistant. The maximum length is 512 characters.
No
instructions string The system instructions that the assistant uses. The maximum length is 32768 characters.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
model string No
name string The name of the assistant. The maximum length is 256 characters.
No
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array Overrides the list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array Overrides the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
tools array A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, retrieval, or function.
No []
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
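As a hedged sketch, the fields above can be combined into a modify-assistant request body like the following. The assistant name, deployment name (`gpt-4o`), and metadata values are placeholders, not values taken from this reference.

```python
# Sketch of a modify-assistant request body using JSON mode for response_format.
modify_assistant_body = {
    "model": "gpt-4o",            # placeholder deployment name
    "name": "data-extractor",     # max 256 characters
    "temperature": 0.2,
    "response_format": {"type": "json_object"},
    # Per the Important note above: JSON mode also requires instructing the
    # model to produce JSON via a system or user message.
    "instructions": "Extract the requested fields and reply only in JSON.",
    "metadata": {"team": "docs"},  # up to 16 pairs; keys <= 64 chars, values <= 512
}
```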

deleteAssistantResponse

Name Type Description Required Default
deleted boolean Yes
id string Yes
object string Yes

object Enum: DeleteAssistantResponseState

Value Description
assistant.deleted

listAssistantsResponse

Name Type Description Required Default
data array Yes
first_id string Yes
has_more boolean Yes
last_id string Yes
object string Yes

assistantToolsCode

Name Type Description Required Default
type string The type of tool being defined: code_interpreter Yes

type Enum: assistantToolsCodeType

Value Description
code_interpreter

assistantToolsFileSearch

Name Type Description Required Default
file_search object Overrides for the file search tool. No
└─ max_num_results integer The maximum number of results the file search tool should output. The default is 20 for gpt-4* models and 5 for gpt-3.5-turbo. This number should be between 1 and 50 inclusive.

Note that the file search tool may output fewer than max_num_results results.
No
type string The type of tool being defined: file_search Yes

type Enum: assistantToolsFileSearchType

Value Description
file_search

assistantToolsFileSearchTypeOnly

Name Type Description Required Default
type string The type of tool being defined: file_search Yes

type Enum: assistantToolsFileSearchType

Value Description
file_search

assistantToolsFunction

Name Type Description Required Default
function object The function definition. Yes
└─ description string A description of what the function does, used by the model to choose when and how to call the function. No
└─ name string The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. No
└─ parameters chatCompletionFunctionParameters The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format. No
type string The type of tool being defined: function Yes

type Enum: assistantToolsFunction

Value Description
function

truncationObject

Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

Name Type Description Required Default
last_messages integer The number of most recent messages from the thread to include when constructing the context for the run. No
type string The truncation strategy to use for the thread. The default is auto. If set to last_messages, the thread will be truncated to the n most recent messages in the thread. When set to auto, messages in the middle of the thread will be dropped to fit the context length of the model, max_prompt_tokens. Yes

type Enum: TruncationType

Value Description
auto
last_messages
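The two truncation_strategy shapes described above can be sketched as plain request-body fragments (the message count here is an arbitrary example):

```python
# "auto" drops middle-of-thread messages to fit the model's context length.
auto_truncation = {"type": "auto"}

# "last_messages" keeps only the n most recent messages in the thread.
last_messages_truncation = {"type": "last_messages", "last_messages": 10}
```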

assistantsApiToolChoiceOption

Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

This component can be one of the following:

assistantsNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific tool.

Name Type Description Required Default
function object No
└─ name string The name of the function to call. No
type string The type of the tool. If type is function, the function name must be set. Yes

type Enum: AssistantsNamedToolChoiceType

Value Description
function
code_interpreter
file_search
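The accepted tool_choice forms described above, shown as plain literals (the function name is a placeholder):

```python
# String forms:
tool_choice_none = "none"   # never call a tool; generate a message instead
tool_choice_auto = "auto"   # default; model chooses between message and tool

# Named-tool forms (assistantsNamedToolChoice):
tool_choice_file_search = {"type": "file_search"}
tool_choice_function = {"type": "function", "function": {"name": "my_function"}}
```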

runObject

Represents an execution run on a thread.

Name Type Description Required Default
assistant_id string The ID of the assistant used for execution of this run. Yes
cancelled_at integer The Unix timestamp (in seconds) for when the run was cancelled. Yes
completed_at integer The Unix timestamp (in seconds) for when the run was completed. Yes
created_at integer The Unix timestamp (in seconds) for when the run was created. Yes
expires_at integer The Unix timestamp (in seconds) for when the run will expire. Yes
failed_at integer The Unix timestamp (in seconds) for when the run failed. Yes
id string The identifier, which can be referenced in API endpoints. Yes
incomplete_details object Details on why the run is incomplete. Will be null if the run is not incomplete. Yes
└─ reason string The reason why the run is incomplete. This will point to which specific token limit was reached over the course of the run. No
instructions string The instructions that the assistant used for this run. Yes
last_error object The last error associated with this run. Will be null if there are no errors. Yes
└─ code string One of server_error or rate_limit_exceeded. No
└─ message string A human-readable description of the error. No
max_completion_tokens integer The maximum number of completion tokens specified to have been used over the course of the run.
Yes
max_prompt_tokens integer The maximum number of prompt tokens specified to have been used over the course of the run.
Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes
model string The model that the assistant used for this run. Yes
object string The object type, which is always thread.run. Yes
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
required_action object Details on the action required to continue the run. Will be null if no action is required. Yes
└─ submit_tool_outputs object Details on the tool outputs needed for this run to continue. No
└─ tool_calls array A list of the relevant tool calls. No
└─ type enum For now, this is always submit_tool_outputs.
Possible values: submit_tool_outputs
No
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
Yes
started_at integer The Unix timestamp (in seconds) for when the run was started. Yes
status string The status of the run, which can be either queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired. Yes
temperature number The sampling temperature used for this run. If not set, defaults to 1. No
thread_id string The ID of the thread that was executed on as a part of this run. Yes
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model will not call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
Yes
tools array The list of tools that the assistant used for this run. Yes []
top_p number The nucleus sampling value used for this run. If not set, defaults to 1. No
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. Yes
usage runCompletionUsage Usage statistics related to the run. This value will be null if the run is not in a terminal state (i.e. in_progress, queued, etc.). Yes

object Enum: runObjectType

Value Description
thread.run The run object type which is always thread.run

status Enum: RunObjectStatus

Value Description
queued The queued state
in_progress The in_progress state
requires_action The requires_action state
cancelling The cancelling state
cancelled The cancelled state
failed The failed state
completed The completed state
expired The expired state
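The status values above imply a simple polling pattern: retrieve the run until it reaches a terminal state. The sketch below assumes a caller-supplied `fetch_run` callable (for example, a wrapper around GET `.../runs/{run_id}`); it is a placeholder, not an SDK function.

```python
import time

TERMINAL_STATUSES = {"cancelled", "failed", "completed", "expired"}

def wait_for_run(fetch_run, interval_s=1.0, max_polls=50):
    """Poll a run until it reaches a terminal status.

    A production loop would also break on "requires_action" to submit
    tool outputs before resuming the wait.
    """
    for _ in range(max_polls):
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        time.sleep(interval_s)
    raise TimeoutError("run did not reach a terminal state")
```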

createRunRequest

Name Type Description Required Default
additional_instructions string Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions. No
additional_messages array Adds additional messages to the thread before creating the run. No
assistant_id string The ID of the assistant to use to execute this run. Yes
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
model string The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model will not call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No
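Putting the fields above together, a createRunRequest body might look like the following sketch. Only assistant_id is required; the other fields shown override per-run behavior, and all IDs and values are placeholders.

```python
create_run_body = {
    "assistant_id": "asst_abc123",  # required; placeholder ID
    "instructions": "Answer as tersely as possible.",          # overrides system message
    "additional_instructions": "Address the user politely.",    # appended, not overriding
    "max_prompt_tokens": 2000,
    "max_completion_tokens": 500,
    "truncation_strategy": {"type": "last_messages", "last_messages": 10},
    "parallel_tool_calls": True,
    "stream": False,
}
```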

listRunsResponse

Name Type Description Required Default
data array Yes
first_id string Yes
has_more boolean Yes
last_id string Yes
object string Yes

modifyRunRequest

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

submitToolOutputsRunRequest

Name Type Description Required Default
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.
No
tool_outputs array A list of tools for which the outputs are being submitted. Yes
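A minimal submitToolOutputsRunRequest body, as a sketch: each tool_call_id must match an id from the run's required_action.submit_tool_outputs.tool_calls array. The ID and output value here are placeholders.

```python
submit_outputs_body = {
    "tool_outputs": [
        {
            "tool_call_id": "call_abc123",       # from required_action on the run
            "output": '{"temperature": 21}',     # your function's result, as a string
        }
    ],
    "stream": False,
}
```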

runToolCallObject

Tool call objects

Name Type Description Required Default
function object The function definition. Yes
└─ arguments string The arguments that the model expects you to pass to the function. No
└─ name string The name of the function. No
id string The ID of the tool call. This ID must be referenced when you submit the tool outputs using the submit tool outputs to run endpoint. Yes
type string The type of tool call the output is required for. For now, this is always function. Yes

type Enum: RunToolCallObjectType

Value Description
function

createThreadAndRunRequest

Name Type Description Required Default
assistant_id string The ID of the assistant to use to execute this run. Yes
instructions string Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. No
max_completion_tokens integer The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
max_prompt_tokens integer The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info.
No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
model string The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. No
parallel_tool_calls ParallelToolCalls Whether to enable parallel function calling during tool use. No True
response_format assistantsApiResponseFormatOption Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
No
stream boolean If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.
No
stream_options chatCompletionStreamOptions Options for streaming response. Only set this when you set stream: true.
No None
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
No 1
thread createThreadRequest No
tool_choice assistantsApiToolChoiceOption Controls which (if any) tool is called by the model.
none means the model will not call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling a tool.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
No
tool_resources object A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
No
tools array Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. No
top_p number An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.
No 1
truncation_strategy truncationObject Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run. No
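As a sketch, createThreadAndRunRequest combines an inline thread (a createThreadRequest) with the run settings above in a single body. The assistant and vector store IDs are placeholders.

```python
thread_and_run_body = {
    "assistant_id": "asst_abc123",  # required; placeholder ID
    "thread": {
        "messages": [
            {"role": "user", "content": "Summarize the attached file."}
        ]
    },
    "tool_resources": {
        # A maximum of 1 vector store can be attached.
        "file_search": {"vector_store_ids": ["vs_abc123"]}
    },
}
```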

threadObject

Represents a thread that contains messages.

Name Type Description Required Default
created_at integer The Unix timestamp (in seconds) for when the thread was created. Yes
id string The identifier, which can be referenced in API endpoints. Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes
object string The object type, which is always thread. Yes
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
Yes
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.
No

object Enum: ThreadObjectType

Value Description
thread The type of thread object which is always thread

createThreadRequest

Name Type Description Required Default
messages array A list of messages to start the thread with. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.
No
└─ vector_stores array A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of 1 vector store attached to the thread.
No
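A createThreadRequest body, sketched with a starter message and code_interpreter files. The file ID is a placeholder, and the limits in the comments come from the table above.

```python
create_thread_body = {
    "messages": [
        {"role": "user", "content": "Plot the data in the attached CSV."}
    ],
    "tool_resources": {
        # A maximum of 20 files can be associated with code_interpreter.
        "code_interpreter": {"file_ids": ["assistant-abc123"]}
    },
    "metadata": {"session": "demo"},  # up to 16 key-value pairs
}
```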

modifyThreadRequest

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
tool_resources object A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs.
No
└─ code_interpreter object No
└─ file_ids array A list of File IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
No []
└─ file_search object No
└─ vector_store_ids array The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.
No

deleteThreadResponse

Name Type Description Required Default
deleted boolean Yes
id string Yes
object string Yes

object Enum: DeleteThreadResponseObjectState

Value Description
thread.deleted The delete thread response object state which is thread.deleted

listThreadsResponse

Name Type Description Required Default
data array Yes
first_id string Yes
has_more boolean Yes
last_id string Yes
object string Yes

messageObject

Represents a message within a thread.

Name Type Description Required Default
assistant_id string If applicable, the ID of the assistant that authored this message. Yes
attachments array A list of files attached to the message, and the tools they were added to. Yes
completed_at integer The Unix timestamp (in seconds) for when the message was completed. Yes
content array The content of the message in an array of text and/or images. Yes
created_at integer The Unix timestamp (in seconds) for when the message was created. Yes
id string The identifier, which can be referenced in API endpoints. Yes
incomplete_at integer The Unix timestamp (in seconds) for when the message was marked as incomplete. Yes
incomplete_details object On an incomplete message, details about why the message is incomplete. Yes
└─ reason string The reason the message is incomplete. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes
object string The object type, which is always thread.message. Yes
role string The entity that produced the message. One of user or assistant. Yes
run_id string If applicable, the ID of the run associated with the authoring of this message. Yes
status string The status of the message, which can be either in_progress, incomplete, or completed. Yes
thread_id string The thread ID that this message belongs to. Yes

object Enum: MessageObjectType

Value Description
thread.message The message object type which is thread.message

status Enum: MessageObjectStatus

Value Description
in_progress
incomplete
completed

role Enum: MessageObjectRole

Value Description
user
assistant

messageDeltaObject

Represents a message delta, i.e. any fields that changed on a message during streaming.

Name Type Description Required Default
delta object The delta containing the fields that have changed on the Message. Yes
└─ content array The content of the message in an array of text and/or images. No
└─ role string The entity that produced the message. One of user or assistant. No
id string The identifier of the message, which can be referenced in API endpoints. Yes
object string The object type, which is always thread.message.delta. Yes

object Enum: MessageDeltaObjectType

Value Description
thread.message.delta

createMessageRequest

Name Type Description Required Default
attachments array A list of files attached to the message, and the tools they should be added to. No
content string The content of the message. Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
role string The role of the entity that is creating the message. Allowed values include:
- user: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages.
- assistant: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.
Yes

role Enum: CreateMessageRequestRole

Value Description
user
assistant
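A createMessageRequest body, sketched: role and content are the required fields, and the attachment ties a placeholder file ID to the file_search tool.

```python
create_message_body = {
    "role": "user",  # must be "user" or "assistant"
    "content": "What does section 2 of the document say?",
    "attachments": [
        {"file_id": "assistant-abc123", "tools": [{"type": "file_search"}]}
    ],
}
```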

modifyMessageRequest

Name Type Description Required Default
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No

deleteMessageResponse

Name Type Description Required Default
deleted boolean Yes
id string Yes
object string Yes

object Enum: DeleteMessageResponseObject

Value Description
thread.message.deleted The delete message response object state

listMessagesResponse

Name Type Description Required Default
data array Yes
first_id string Yes
has_more boolean Yes
last_id string Yes
object string Yes

messageContentImageFileObject

References an image File in the content of a message.

Name Type Description Required Default
image_file object Yes
└─ file_id string The File ID of the image in the message content. No
type string Always image_file. Yes

type Enum: MessageContentImageFileObjectType

Value Description
image_file The message content image file type

messageContentTextObject

The text content that is part of a message.

Name Type Description Required Default
text object Yes
└─ annotations array No
└─ value string The data that makes up the text. No
type string Always text. Yes

type Enum: messageContentTextObjectType

Value Description
text The message content text Object type

messageContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "retrieval" tool to search files.

Name Type Description Required Default
end_index integer Yes
file_citation object Yes
└─ file_id string The ID of the specific File the citation is from. No
start_index integer Yes
text string The text in the message content that needs to be replaced. Yes
type string Always file_citation. Yes

type Enum: FileCitationObjectType

Value Description
file_citation The file citation object type

messageContentTextAnnotationsFilePathObject

A URL for a file that's generated when the assistant uses the code_interpreter tool to generate a file.

Name Type Description Required Default
end_index integer Yes
file_path object Yes
└─ file_id string The ID of the file that was generated. No
start_index integer Yes
text string The text in the message content that needs to be replaced. Yes
type string Always file_path. Yes

type Enum: FilePathObjectType

Value Description
file_path The file path object type
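Both annotation shapes above carry the exact text span to replace plus a file_citation or file_path payload, which suggests a simple rendering helper. This is an illustrative sketch, not part of the API; the example inputs in the usage are hypothetical.

```python
def resolve_annotations(text_value, annotations):
    """Replace annotated spans in a message's text with footnote markers.

    Each annotation dict has a "text" field to substitute, plus either a
    "file_citation" or "file_path" object containing a file_id.
    """
    footnotes = []
    for i, ann in enumerate(annotations):
        text_value = text_value.replace(ann["text"], f"[{i}]")
        source = ann.get("file_citation") or ann.get("file_path") or {}
        footnotes.append(f"[{i}] {source.get('file_id', 'unknown')}")
    return text_value, footnotes
```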

messageDeltaContentImageFileObject

References an image File in the content of a message.

Name Type Description Required Default
image_file object No
└─ file_id string The File ID of the image in the message content. No
index integer The index of the content part in the message. Yes
type string Always image_file. Yes

type Enum: MessageDeltaContentImageFileObjectType

Value Description
image_file

messageDeltaContentTextObject

The text content that is part of a message.

Name Type Description Required Default
index integer The index of the content part in the message. Yes
text object No
└─ annotations array No
└─ value string The data that makes up the text. No
type string Always text. Yes

type Enum: MessageDeltaContentTextObjectType

Value Description
text

messageDeltaContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.

Name Type Description Required Default
end_index integer No
file_citation object No
└─ file_id string The ID of the specific File the citation is from. No
└─ quote string The specific quote in the file. No
index integer The index of the annotation in the text content part. Yes
start_index integer No
text string The text in the message content that needs to be replaced. No
type string Always file_citation. Yes

type Enum: MessageDeltaContentTextAnnotationsFileCitationObjectType

Value Description
file_citation

messageDeltaContentTextAnnotationsFilePathObject

A URL for a file that's generated when the assistant uses the code_interpreter tool to generate a file.

Name Type Description Required Default
end_index integer No
file_path object No
└─ file_id string The ID of the file that was generated. No
index integer The index of the annotation in the text content part. Yes
start_index integer No
text string The text in the message content that needs to be replaced. No
type string Always file_path. Yes

type Enum: MessageDeltaContentTextAnnotationsFilePathObjectType

Value Description
file_path

runStepObject

Represents a step in execution of a run.

Name Type Description Required Default
assistant_id string The ID of the assistant associated with the run step. Yes
cancelled_at integer The Unix timestamp (in seconds) for when the run step was cancelled. Yes
completed_at integer The Unix timestamp (in seconds) for when the run step completed. Yes
created_at integer The Unix timestamp (in seconds) for when the run step was created. Yes
expired_at integer The Unix timestamp (in seconds) for when the run step expired. A step is considered expired if the parent run is expired. Yes
failed_at integer The Unix timestamp (in seconds) for when the run step failed. Yes
id string The identifier of the run step, which can be referenced in API endpoints. Yes
last_error object The last error associated with this run step. Will be null if there are no errors. Yes
└─ code string One of server_error or rate_limit_exceeded. No
└─ message string A human-readable description of the error. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes
object string The object type, which is always assistant.run.step. Yes
run_id string The ID of the run that this run step is a part of. Yes
status string The status of the run, which can be either in_progress, cancelled, failed, completed, or expired. Yes
step_details runStepDetailsMessageCreationObject or runStepDetailsToolCallsObject The details of the run step. Yes
thread_id string The ID of the thread that was run. Yes
type string The type of run step, which can be either message_creation or tool_calls. Yes

object Enum: RunStepObjectType

Value Description
assistant.run.step The object type, which is always assistant.run.step

type Enum: RunStepObjectType

Value Description
message_creation The message_creation run step
tool_calls The tool_calls run step

status Enum: RunStepObjectStatus

Value Description
in_progress The in_progress run status
cancelled The cancelled run status
failed The failed run status
completed The completed run status
expired The expired run status

runStepDeltaObject

Represents a run step delta, i.e., any changed fields on a run step during streaming.

Name Type Description Required Default
delta object The delta containing the fields that have changed on the run step. Yes
└─ step_details runStepDeltaStepDetailsMessageCreationObject or runStepDeltaStepDetailsToolCallsObject The details of the run step. No
id string The identifier of the run step, which can be referenced in API endpoints. Yes
object string The object type, which is always thread.run.step.delta. Yes

object Enum: RunStepDeltaObjectType

Value Description
thread.run.step.delta

listRunStepsResponse

Name Type Description Required Default
data array Yes
first_id string Yes
has_more boolean Yes
last_id string Yes
object string Yes

runStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name Type Description Required Default
message_creation object Yes
└─ message_id string The ID of the message that was created by this run step. No
type string Always message_creation. Yes

type Enum: RunStepDetailsMessageCreationObjectType

Value Description
message_creation

runStepDeltaStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name Type Description Required Default
message_creation object No
└─ message_id string The ID of the message that was created by this run step. No
type string Always message_creation. Yes

type Enum: RunStepDeltaStepDetailsMessageCreationObjectType

Value Description
message_creation

runStepDetailsToolCallsObject

Details of the tool call.

Name Type Description Required Default
tool_calls array An array of tool calls the run step was involved in. These can be associated with one of three types of tools: code_interpreter, file_search, or function.
Yes
type string Always tool_calls. Yes

type Enum: RunStepDetailsToolCallsObjectType

Value Description
tool_calls

runStepDeltaStepDetailsToolCallsObject

Details of the tool call.

Name Type Description Required Default
tool_calls array An array of tool calls the run step was involved in. These can be associated with one of three types of tools: code_interpreter, file_search or function.
No
type string Always tool_calls. Yes

type Enum: RunStepDeltaStepDetailsToolCallsObjectType

Value Description
tool_calls

runStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name Type Description Required Default
code_interpreter object The Code Interpreter tool call definition. Yes
└─ input string The input to the Code Interpreter tool call. No
└─ outputs array The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (logs) or images (image). Each of these is represented by a different object type. No
id string The ID of the tool call. Yes
type string The type of tool call. This is always going to be code_interpreter for this type of tool call. Yes

type Enum: RunStepDetailsToolCallsCodeObjectType

Value Description
code_interpreter

runStepDeltaStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name Type Description Required Default
code_interpreter object The Code Interpreter tool call definition. No
└─ input string The input to the Code Interpreter tool call. No
└─ outputs array The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (logs) or images (image). Each of these is represented by a different object type. No
id string The ID of the tool call. No
index integer The index of the tool call in the tool calls array. Yes
type string The type of tool call. This is always going to be code_interpreter for this type of tool call. Yes

type Enum: RunStepDeltaStepDetailsToolCallsCodeObjectType

Value Description
code_interpreter

runStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name Type Description Required Default
logs string The text output from the Code Interpreter tool call. Yes
type string Always logs. Yes

type Enum: RunStepDetailsToolCallsCodeOutputLogsObjectType

Value Description
logs

runStepDeltaStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name Type Description Required Default
index integer The index of the output in the outputs array. Yes
logs string The text output from the Code Interpreter tool call. No
type string Always logs. Yes

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputLogsObjectType

Value Description
logs

runStepDetailsToolCallsCodeOutputImageObject

Name Type Description Required Default
image object Yes
└─ file_id string The File ID of the image. No
type string Always image. Yes

type Enum: RunStepDetailsToolCallsCodeOutputImageObjectType

Value Description
image

runStepDeltaStepDetailsToolCallsCodeOutputImageObject

Name Type Description Required Default
image object No
└─ file_id string The file ID of the image. No
index integer The index of the output in the outputs array. Yes
type string Always image. Yes

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputImageObjectType

Value Description
image

runStepDetailsToolCallsFileSearchObject

Name Type Description Required Default
file_search object For now, this is always going to be an empty object. Yes
└─ results array The results of the file search. No
id string The ID of the tool call object. Yes
type string The type of tool call. This is always going to be file_search for this type of tool call. Yes

type Enum: RunStepDetailsToolCallsFileSearchObjectType

Value Description
file_search

runStepDetailsToolCallsFileSearchResultObject

A result instance of the file search.

Name Type Description Required Default
content array The content of the result that was found. The content is only included if requested via the include query parameter. No
file_id string The ID of the file that result was found in. Yes
file_name string The name of the file that result was found in. Yes
score number The score of the result, a floating-point number between 0 and 1. Yes

runStepDeltaStepDetailsToolCallsFileSearchObject

Name Type Description Required Default
file_search object For now, this is always going to be an empty object. Yes
id string The ID of the tool call object. No
index integer The index of the tool call in the tool calls array. Yes
type string The type of tool call. This is always going to be file_search for this type of tool call. Yes

type Enum: RunStepDeltaStepDetailsToolCallsFileSearchObjectType

Value Description
file_search

runStepDetailsToolCallsFunctionObject

Name Type Description Required Default
function object The definition of the function that was called. Yes
└─ arguments string The arguments passed to the function. No
└─ name string The name of the function. No
└─ output string The output of the function. This will be null if the outputs have not been submitted yet. No
id string The ID of the tool call object. Yes
type string The type of tool call. This is always going to be function for this type of tool call. Yes

type Enum: RunStepDetailsToolCallsFunctionObjectType

Value Description
function

runStepDeltaStepDetailsToolCallsFunctionObject

Name Type Description Required Default
function object The definition of the function that was called. No
└─ arguments string The arguments passed to the function. No
└─ name string The name of the function. No
└─ output string The output of the function. This will be null if the outputs have not been submitted yet. No
id string The ID of the tool call object. No
index integer The index of the tool call in the tool calls array. Yes
type string The type of tool call. This is always going to be function for this type of tool call. Yes

type Enum: RunStepDeltaStepDetailsToolCallsFunctionObjectType

Value Description
function

vectorStoreExpirationAfter

The expiration policy for a vector store.

Name Type Description Required Default
anchor string Anchor timestamp after which the expiration policy applies. Supported anchors: last_active_at. Yes
days integer The number of days after the anchor time that the vector store will expire. Yes

anchor Enum: VectorStoreExpirationAfterAnchor

Value Description
last_active_at The anchor timestamp after which the expiration policy applies.
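
As a sketch of how the policy fields combine: with the last_active_at anchor, the store expires days after the anchor timestamp. The helper below is illustrative, not part of the API; the field names match the table above.

```python
# Illustrative helper: compute a vector store's expiry time from a
# vectorStoreExpirationAfter policy (anchor = last_active_at).
def compute_expires_at(last_active_at: int, days: int) -> int:
    """Return the Unix timestamp (seconds) at which the store expires.

    `last_active_at` is the anchor timestamp; `days` is the number of
    days after the anchor, as documented above.
    """
    return last_active_at + days * 86_400  # 86,400 seconds per day

# Example: a store last active at t=1_700_000_000 with a 7-day policy.
print(compute_expires_at(1_700_000_000, 7))  # 1700604800
```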

vectorStoreObject

A vector store is a collection of processed files that can be used by the file_search tool.

Name Type Description Required Default
created_at integer The Unix timestamp (in seconds) for when the vector store was created. Yes
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
expires_at integer The Unix timestamp (in seconds) for when the vector store will expire. No
file_counts object Yes
└─ cancelled integer The number of files that were cancelled. No
└─ completed integer The number of files that have been successfully processed. No
└─ failed integer The number of files that have failed to process. No
└─ in_progress integer The number of files that are currently being processed. No
└─ total integer The total number of files. No
id string The identifier, which can be referenced in API endpoints. Yes
last_active_at integer The Unix timestamp (in seconds) for when the vector store was last active. Yes
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
Yes
name string The name of the vector store. Yes
object enum The object type, which is always vector_store.
Possible values: vector_store
Yes
status string The status of the vector store, which can be either expired, in_progress, or completed. A status of completed indicates that the vector store is ready for use. Yes
usage_bytes integer The total number of bytes used by the files in the vector store. Yes

status Enum: VectorStoreObjectStatus

Value Description
expired
in_progress
completed

createVectorStoreRequest

Name Type Description Required Default
chunking_strategy autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. Only applicable if file_ids is non-empty. No
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
file_ids array A list of file IDs that the vector store should use. Useful for tools like file_search that can access files. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
name string The name of the vector store. No
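
A minimal sketch of assembling a createVectorStoreRequest body from the optional fields in the table above; the name and file ID are placeholders, and the helper itself is illustrative.

```python
import json

def build_create_vector_store_request(name, file_ids, expire_days=None):
    """Assemble a createVectorStoreRequest body using only fields
    documented above. Keys that are not set are omitted, since every
    field in this request is optional."""
    body = {"name": name, "file_ids": file_ids}
    if expire_days is not None:
        # vectorStoreExpirationAfter: last_active_at is the only
        # supported anchor.
        body["expires_after"] = {"anchor": "last_active_at",
                                 "days": expire_days}
    return body

# Placeholder name and file ID for illustration.
payload = build_create_vector_store_request(
    "support-docs", ["assistant-file-abc123"], expire_days=7
)
print(json.dumps(payload, indent=2))
```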

updateVectorStoreRequest

Name Type Description Required Default
expires_after vectorStoreExpirationAfter The expiration policy for a vector store. No
metadata object Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
No
name string The name of the vector store. No

listVectorStoresResponse

Name Type Description Required Default
data array Yes
first_id string Yes
has_more boolean Yes
last_id string Yes
object string Yes

deleteVectorStoreResponse

Name Type Description Required Default
deleted boolean Yes
id string Yes
object string Yes

object Enum: DeleteVectorStoreResponseObject

Value Description
vector_store.deleted The delete vector store response object state

vectorStoreFileObject

A list of files attached to a vector store.

Name Type Description Required Default
chunking_strategy autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. Only applicable if file_ids is non-empty. No
created_at integer The Unix timestamp (in seconds) for when the vector store file was created. Yes
id string The identifier, which can be referenced in API endpoints. Yes
last_error object The last error associated with this vector store file. Will be null if there are no errors. Yes
└─ code string One of server_error or invalid_file or unsupported_file. No
└─ message string A human-readable description of the error. No
object string The object type, which is always vector_store.file. Yes
status string The status of the vector store file, which can be either in_progress, completed, cancelled, or failed. The status completed indicates that the vector store file is ready for use. Yes
usage_bytes integer The total vector store usage in bytes. Note that this may be different from the original file size. Yes
vector_store_id string The ID of the vector store that the file is attached to. Yes

object Enum: VectorStoreFileObjectType

Value Description
vector_store.file

status Enum: VectorStoreFileObjectStatus

Value Description
in_progress
completed
cancelled
failed

otherChunkingStrategyResponseParam

This is returned when the chunking strategy is unknown. Typically, this is because the file was indexed before the chunking_strategy concept was introduced in the API.

Name Type Description Required Default
type string Always other. Yes

type Enum: OtherChunkingStrategyResponseParamType

Value Description
other

staticChunkingStrategyResponseParam

Name Type Description Required Default
static staticChunkingStrategy Yes
type string Always static. Yes

type Enum: StaticChunkingStrategyResponseParamType

Value Description
static

staticChunkingStrategy

Name Type Description Required Default
chunk_overlap_tokens integer The number of tokens that overlap between chunks. The default value is 400.

Note that the overlap must not exceed half of max_chunk_size_tokens.
Yes
max_chunk_size_tokens integer The maximum number of tokens in each chunk. The default value is 800. The minimum value is 100 and the maximum value is 4096. Yes
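
The bounds above (max_chunk_size_tokens between 100 and 4096, and an overlap of at most half the chunk size) can be checked client-side before submitting a static chunking strategy. This validator is illustrative, not part of the API:

```python
def validate_static_chunking(max_chunk_size_tokens: int,
                             chunk_overlap_tokens: int) -> None:
    """Raise ValueError if the values violate the documented bounds
    for staticChunkingStrategy."""
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be 100-4096")
    if chunk_overlap_tokens > max_chunk_size_tokens // 2:
        raise ValueError(
            "chunk_overlap_tokens must not exceed half of "
            "max_chunk_size_tokens")

# The documented defaults (800 / 400) sit exactly at the limit.
validate_static_chunking(800, 400)
```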

autoChunkingStrategyRequestParam

The default strategy. This strategy currently uses a max_chunk_size_tokens of 800 and chunk_overlap_tokens of 400.

Name Type Description Required Default
type enum Always auto.
Possible values: auto
Yes

staticChunkingStrategyRequestParam

Name Type Description Required Default
static staticChunkingStrategy Yes
type enum Always static.
Possible values: static
Yes

chunkingStrategyRequestParam

The chunking strategy used to chunk the file(s). If not set, will use the auto strategy.

This component can be one of the following:

createVectorStoreFileRequest

Name Type Description Required Default
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No
file_id string A File ID that the vector store should use. Useful for tools like file_search that can access files. Yes

listVectorStoreFilesResponse

Name Type Description Required Default
data array Yes
first_id string Yes
has_more boolean Yes
last_id string Yes
object string Yes

deleteVectorStoreFileResponse

Name Type Description Required Default
deleted boolean Yes
id string Yes
object string Yes

object Enum: DeleteVectorStoreFileResponseObject

Value Description
vector_store.file.deleted

vectorStoreFileBatchObject

A batch of files attached to a vector store.

Name Type Description Required Default
created_at integer The Unix timestamp (in seconds) for when the vector store files batch was created. Yes
file_counts object Yes
└─ cancelled integer The number of files that were cancelled. No
└─ completed integer The number of files that have been processed. No
└─ failed integer The number of files that have failed to process. No
└─ in_progress integer The number of files that are currently being processed. No
└─ total integer The total number of files. No
id string The identifier, which can be referenced in API endpoints. Yes
object string The object type, which is always vector_store.file_batch. Yes
status string The status of the vector store files batch, which can be either in_progress, completed, cancelled or failed. Yes
vector_store_id string The ID of the vector store that the File is attached to. Yes

object Enum: VectorStoreFileBatchObjectType

Value Description
vector_store.files_batch

status Enum: VectorStoreFileBatchObjectStatus

Value Description
in_progress
completed
cancelled
failed

createVectorStoreFileBatchRequest

Name Type Description Required Default
chunking_strategy chunkingStrategyRequestParam The chunking strategy used to chunk the file(s). If not set, will use the auto strategy. No
file_ids array A list of File IDs that the vector store should use. Useful for tools like file_search that can access files. Yes

assistantStreamEvent

Represents an event emitted when streaming a Run.

Each event in a server-sent events stream has an event and data property:

event: thread.created
data: {"id": "thread_123", "object": "thread", ...}

We emit events whenever a new object is created, transitions to a new state, or is being streamed in parts (deltas). For example, we emit thread.run.created when a new run is created, thread.run.completed when a run completes, and so on. When an Assistant chooses to create a message during a run, we emit a thread.message.created event, a thread.message.in_progress event, many thread.message.delta events, and finally a thread.message.completed event.

We may add additional events over time, so we recommend handling unknown events gracefully in your code.
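
A minimal sketch of parsing the event/data lines of such a stream into (event, payload) pairs; real clients should also tolerate unknown event names, as recommended above. The parser below is illustrative and assumes one data line per event, as in the thread.created example.

```python
import json

def parse_assistant_stream(lines):
    """Yield (event_name, payload_dict) pairs from server-sent event
    lines shaped like the example above. Unknown event names are
    passed through rather than rejected."""
    event = None
    for line in lines:
        if line.startswith("event: "):
            event = line[len("event: "):].strip()
        elif line.startswith("data: ") and event is not None:
            yield event, json.loads(line[len("data: "):])
            event = None  # reset for the next event block

stream = [
    'event: thread.created',
    'data: {"id": "thread_123", "object": "thread"}',
]
for name, payload in parse_assistant_stream(stream):
    print(name, payload["id"])  # thread.created thread_123
```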

This component can be one of the following:

threadStreamEvent

This component can be one of the following:

thread.created

Occurs when a new thread is created.

Name Type Description Required Default
data threadObject Represents a thread that contains messages. Yes
event string Yes

Data: threadObject

Event Enum: ThreadStreamEventEnum

Value Description
thread.created The thread created event

runStreamEvent

This component can be one of the following:

thread.run.created

Occurs when a new run is created.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventCreated

Value Description
thread.run.created

thread.run.queued

Occurs when a run moves to a queued status.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventQueued

Value Description
thread.run.queued

thread.run.in_progress

Occurs when a run moves to an in_progress status.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventInProgress

Value Description
thread.run.in_progress

thread.run.requires_action

Occurs when a run moves to a requires_action status.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventRequiresAction

Value Description
thread.run.requires_action

thread.run.completed

Occurs when a run is completed.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventCompleted

Value Description
thread.run.completed

thread.run.failed

Occurs when a run fails.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventFailed

Value Description
thread.run.failed

thread.run.cancelling

Occurs when a run moves to a cancelling status.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventCancelling

Value Description
thread.run.cancelling

thread.run.cancelled

Occurs when a run is cancelled.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventCancelled

Value Description
thread.run.cancelled

thread.run.expired

Occurs when a run expires.

Name Type Description Required Default
data runObject Represents an execution run on a thread. Yes
event string Yes

Data: runObject

Event Enum: RunStreamEventExpired

Value Description
thread.run.expired

runStepStreamEvent

This component can be one of the following:

thread.run.step.created

Occurs when a run step is created.

Name Type Description Required Default
data runStepObject Represents a step in execution of a run.
Yes
event string Yes

Data: runStepObject

Event Enum: RunStepStreamEventCreated

Value Description
thread.run.step.created

thread.run.step.in_progress

Occurs when a run step moves to an in_progress state.

Name Type Description Required Default
data runStepObject Represents a step in execution of a run.
Yes
event string Yes

Data: runStepObject

Event Enum: RunStepStreamEventInProgress

Value Description
thread.run.step.in_progress

thread.run.step.delta

Occurs when parts of a run step are being streamed.

Name Type Description Required Default
data runStepDeltaObject Represents a run step delta, i.e., any changed fields on a run step during streaming.
Yes
event string Yes

Data: runStepDeltaObject

Event Enum: RunStepStreamEventDelta

Value Description
thread.run.step.delta

thread.run.step.completed

Occurs when a run step is completed.

Name Type Description Required Default
data runStepObject Represents a step in execution of a run.
Yes
event string Yes

Data: runStepObject

Event Enum: RunStepStreamEventCompleted

Value Description
thread.run.step.completed

thread.run.step.failed

Occurs when a run step fails.

Name Type Description Required Default
data runStepObject Represents a step in execution of a run.
Yes
event string Yes

Data: runStepObject

Event Enum: RunStepStreamEventFailed

Value Description
thread.run.step.failed

thread.run.step.cancelled

Occurs when a run step is cancelled.

Name Type Description Required Default
data runStepObject Represents a step in execution of a run.
Yes
event string Yes

Data: runStepObject

Event Enum: RunStepStreamEventCancelled

Value Description
thread.run.step.cancelled

thread.run.step.expired

Occurs when a run step expires.

Name Type Description Required Default
data runStepObject Represents a step in execution of a run.
Yes
event string Yes

Data: runStepObject

Event Enum: RunStepStreamEventExpired

Value Description
thread.run.step.expired

messageStreamEvent

This component can be one of the following:

thread.message.created

Occurs when a message is created.

Name Type Description Required Default
data messageObject Represents a message within a thread. Yes
event string Yes

Data: messageObject

Event Enum: MessageStreamEventCreated

Value Description
thread.message.created

thread.message.in_progress

Occurs when a message moves to an in_progress state.

Name Type Description Required Default
data messageObject Represents a message within a thread. Yes
event string Yes

Data: messageObject

Event Enum: MessageStreamEventInProgress

Value Description
thread.message.in_progress

thread.message.delta

Occurs when parts of a message are being streamed.

Name Type Description Required Default
data messageDeltaObject Represents a message delta, i.e., any changed fields on a message during streaming.
Yes
event string Yes

Data: messageDeltaObject

Event Enum: MessageStreamEventDelta

Value Description
thread.message.delta

thread.message.completed

Occurs when a message is completed.

Name Type Description Required Default
data messageObject Represents a message within a thread. Yes
event string Yes

Data: messageObject

Event Enum: MessageStreamEventCompleted

Value Description
thread.message.completed

thread.message.incomplete

Occurs when a message ends before it is completed.

Name Type Description Required Default
data messageObject Represents a message within a thread. Yes
event string Yes

Data: messageObject

Event Enum: MessageStreamEventIncomplete

Value Description
thread.message.incomplete

Annotation

This component can be one of the following:

Click

A click action.

Name Type Description Required Default
button enum Indicates which mouse button was pressed during the click. One of left, right, wheel, back, or forward.

Possible values: left, right, wheel, back, forward
Yes
type enum Specifies the event type. For a click action, this property is
always set to click.

Possible values: click
Yes
x integer The x-coordinate where the click occurred.
Yes
y integer The y-coordinate where the click occurred.
Yes

CodeInterpreterFileOutput

The output of a code interpreter tool call that is a file.

Name Type Description Required Default
files array Yes
type enum The type of the code interpreter file output. Always files.

Possible values: files
Yes

CodeInterpreterTextOutput

The output of a code interpreter tool call that is text.

Name Type Description Required Default
logs string The logs of the code interpreter tool call.
Yes
type enum The type of the code interpreter text output. Always logs.

Possible values: logs
Yes

CodeInterpreterTool

A tool that runs code.

Name Type Description Required Default
file_ids array The IDs of the files to run the code on.
Yes
type enum The type of the code interpreter tool. Always code_interpreter.

Possible values: code_interpreter
Yes

CodeInterpreterToolCall

A tool call to run code.

Name Type Description Required Default
code string The code to run.
Yes
id string The unique ID of the code interpreter tool call.
Yes
results array The results of the code interpreter tool call.
Yes
status enum The status of the code interpreter tool call.

Possible values: in_progress, interpreting, completed
Yes
type enum The type of the code interpreter tool call. Always code_interpreter_call.

Possible values: code_interpreter_call
Yes

CodeInterpreterToolOutput

This component can be one of the following:

ComparisonFilter

A filter used to compare a specified attribute key to a given value using a defined comparison operation.

Name Type Description Required Default
key string The key to compare against the value. Yes
type enum Specifies the comparison operator: eq, ne, gt, gte, lt, lte.
- eq: equals
- ne: not equal
- gt: greater than
- gte: greater than or equal
- lt: less than
- lte: less than or equal

Possible values: eq, ne, gt, gte, lt, lte
Yes
value string or number or boolean The value to compare against the attribute key; supports string, number, or boolean types. Yes

CompoundFilter

Combine multiple filters using and or or.

Name Type Description Required Default
filters array Array of filters to combine. Items can be ComparisonFilter or CompoundFilter. Yes
type enum Type of operation: and or or.
Possible values: and, or
Yes
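
For example, a CompoundFilter that matches items whose category attribute equals "blog" and whose date attribute is greater than a cutoff could be assembled as below. The attribute keys are placeholders, and the helpers are illustrative; only the dict shapes follow the ComparisonFilter and CompoundFilter tables above.

```python
def eq(key, value):
    """ComparisonFilter using the eq operator."""
    return {"type": "eq", "key": key, "value": value}

def gt(key, value):
    """ComparisonFilter using the gt operator."""
    return {"type": "gt", "key": key, "value": value}

def and_(*filters):
    """CompoundFilter combining sub-filters with `and`."""
    return {"type": "and", "filters": list(filters)}

# Placeholder attribute keys for illustration.
f = and_(eq("category", "blog"), gt("date", 1_700_000_000))
print(f["type"], len(f["filters"]))  # and 2
```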

ComputerAction

This component can be one of the following:

ComputerScreenshotImage

A computer screenshot image used with the computer use tool.

Name Type Description Required Default
file_id string The identifier of an uploaded file that contains the screenshot. No
image_url string The URL of the screenshot image. No
type enum Specifies the event type. For a computer screenshot, this property is
always set to computer_screenshot.

Possible values: computer_screenshot
Yes

ComputerTool

A tool that controls a virtual computer.

Name Type Description Required Default
display_height number The height of the computer display.
Yes
display_width number The width of the computer display.
Yes
environment enum The type of computer environment to control.

Possible values: mac, windows, ubuntu, browser
Yes
type enum The type of the computer use tool. Always computer_use_preview.

Possible values: computer_use_preview
Yes

ComputerToolCall

A tool call to a computer use tool.

Name Type Description Required Default
action ComputerAction Yes
call_id string An identifier used when responding to the tool call with output.
Yes
id string The unique ID of the computer call. Yes
pending_safety_checks array The pending safety checks for the computer call.
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
Yes
type enum The type of the computer call. Always computer_call.
Possible values: computer_call
Yes

ComputerToolCallOutput

The output of a computer tool call.

Name Type Description Required Default
acknowledged_safety_checks array The safety checks reported by the API that have been acknowledged by the
developer.
No
call_id string The ID of the computer tool call that produced the output.
Yes
id string The ID of the computer tool call output.
No
output ComputerScreenshotImage A computer screenshot image used with the computer use tool.
Yes
status enum The status of the message input. One of in_progress, completed, or
incomplete. Populated when input items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the computer tool call output. Always computer_call_output.

Possible values: computer_call_output
Yes

ComputerToolCallOutputResource

Name Type Description Required Default
acknowledged_safety_checks array The safety checks reported by the API that have been acknowledged by the
developer.
No
call_id string The ID of the computer tool call that produced the output.
Yes
id string The unique ID of the computer call tool output.
Yes
output ComputerScreenshotImage A computer screenshot image used with the computer use tool.
Yes
status enum The status of the message input. One of in_progress, completed, or
incomplete. Populated when input items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the computer tool call output. Always computer_call_output.

Possible values: computer_call_output
Yes

ComputerToolCallSafetyCheck

A pending safety check for the computer call.

Name Type Description Required Default
code string The type of the pending safety check. Yes
id string The ID of the pending safety check. Yes
message string Details about the pending safety check. Yes

Content

Multi-modal input and output contents.

This component can be one of the following:

Coordinate

An x/y coordinate pair, e.g. { x: 100, y: 200 }.

Name Type Description Required Default
x integer The x-coordinate.
Yes
y integer The y-coordinate.
Yes

CreateModelResponseProperties

Name Type Description Required Default
metadata Metadata Set of 16 key-value pairs that can be attached to an object. This can be
useful for storing additional information about the object in a structured
format, and querying for objects via API or the dashboard.

Keys are strings with a maximum length of 64 characters. Values are strings
with a maximum length of 512 characters.
No
model string Model used to generate the responses. No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling,
where the model considers the results of the tokens with top_p probability
mass. So 0.1 means only the tokens comprising the top 10% probability mass
are considered.

We generally recommend altering this or temperature but not both.
No 1
user string A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. No

createResponse

Name Type Description Required Default
include array {"$ref": "#/components/schemas/includable/description"} No
input string or array Text, image, or file inputs to the model, used to generate a response.

Learn more:
- Text inputs and outputs
- Image inputs
- File inputs
- Conversation state
- function calling
Yes
instructions string Inserts a system (or developer) message as the first item in the model's context.

When used along with previous_response_id, the instructions from a previous
response will not be carried over to the next response. This makes it simple
to swap out system (or developer) messages in new responses.
No
max_output_tokens integer An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.
No
parallel_tool_calls boolean Whether to allow the model to run tool calls in parallel.
No True
previous_response_id string The unique ID of the previous response to the model. Use this to
create multi-turn conversations. Learn more about
Conversation state.
No
reasoning Reasoning o-series models only

Configuration options for
reasoning models.
No
store boolean Whether to store the generated model response for later retrieval via
API.
No True
stream boolean If set to true, the model response data will be streamed to the client
as it is generated using server-sent events.
No False
text object Configuration options for a text response from the model. Can be plain
text or structured JSON data. Learn more:
- Text inputs and outputs
- Structured Outputs
No
└─ format TextResponseFormatConfiguration An object specifying the format that the model must output.

Configuring { "type": "json_schema" } enables Structured Outputs,
which ensures the model will match your supplied JSON schema.

The default format is { "type": "text" } with no additional options.

Not recommended for gpt-4o and newer models:

Setting to { "type": "json_object" } enables the older JSON mode, which
ensures the message the model generates is valid JSON. Using json_schema
is preferred for models that support it.
No
tool_choice ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction How the model should select which tool (or tools) to use when generating
a response. See the tools parameter to see how to specify which tools
the model can call.
No
tools array An array of tools the model may call while generating a response. You
can specify which tool to use by setting the tool_choice parameter.

The two categories of tools you can provide the model are:

- Built-in tools: Tools that are provided by OpenAI that extend the
model's capabilities, such as file search.
- Function calls (custom tools): Functions that are defined by you,
enabling the model to call your own code. Learn more about
function calling.
No
truncation enum The truncation strategy to use for the model response.
- auto: If the context of this response and previous ones exceeds
the model's context window size, the model will truncate the
response to fit the context window by dropping input items in the
middle of the conversation.
- disabled (default): If a model response will exceed the context window
size for a model, the request will fail with a 400 error.

Possible values: auto, disabled
No
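Putting the fields above together, a minimal createResponse request body can be sketched as a plain dictionary. This is an illustration built from the table above; the deployment name is a placeholder, not a real value.

```python
# A minimal createResponse request body. Only "input" is required; the
# other fields show common options with their documented defaults.
request_body = {
    "model": "my-gpt-4o-deployment",   # placeholder deployment name
    "input": "Write a one-sentence summary of the Responses API.",
    "instructions": "You are a concise technical writer.",
    "max_output_tokens": 256,
    "temperature": 1,                  # default; alter this or top_p, not both
    "parallel_tool_calls": True,       # default
    "truncation": "disabled",          # default; "auto" drops middle items instead
    "store": True,                     # default; allows later retrieval via API
}

# The schema allows "input" to be a string or an array of input items.
assert isinstance(request_body["input"], (str, list))
assert request_body["truncation"] in ("auto", "disabled")
```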

DoubleClick

A double click action.

Name Type Description Required Default
type enum Specifies the event type. For a double click action, this property is
always set to double_click.

Possible values: double_click
Yes
x integer The x-coordinate where the double click occurred.
Yes
y integer The y-coordinate where the double click occurred.
Yes

Drag

A drag action.

Name Type Description Required Default
path array An array of coordinates representing the path of the drag action. Coordinates appear as an array
of objects, for example [{ x: 100, y: 200 }, { x: 200, y: 300 }].
Yes
type enum Specifies the event type. For a drag action, this property is
always set to drag.

Possible values: drag
Yes
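For illustration, a drag action item matching this schema might look like the following (the coordinates are hypothetical):

```python
# A drag action as described above: a type discriminator plus a path of
# coordinate pairs, each an {x, y} object.
drag_action = {
    "type": "drag",
    "path": [
        {"x": 100, "y": 200},
        {"x": 200, "y": 300},
    ],
}

# Every point on the path must carry both coordinates.
assert drag_action["type"] == "drag"
assert all({"x", "y"} <= set(point) for point in drag_action["path"])
```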

EasyInputMessage

A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.

Name Type Description Required Default
content string or InputMessageContentList Text, image, or audio input to the model, used to generate a response.
Can also contain previous assistant responses.
Yes
role enum The role of the message input. One of user, assistant, system, or
developer.

Possible values: user, assistant, system, developer
Yes
type enum The type of the message input. Always message.

Possible values: message
No
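As a sketch of the instruction hierarchy described above, an input array of EasyInputMessage items might look like this (the message content is illustrative):

```python
# EasyInputMessage items: content can be a plain string or a content
# list, and role encodes the instruction-following hierarchy (developer
# and system instructions take precedence over user instructions).
messages = [
    {"role": "developer", "content": "Answer in formal English."},
    {"role": "user", "content": "What is nucleus sampling?"},
]

for message in messages:
    assert message["role"] in ("user", "assistant", "system", "developer")
```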

FileCitation

A citation to a file.

Name Type Description Required Default
file_id string The ID of the file.
Yes
index integer The index of the file in the list of files.
Yes
type enum The type of the file citation. Always file_citation.

Possible values: file_citation
Yes

FilePath

A path to a file.

Name Type Description Required Default
file_id string The ID of the file.
Yes
index integer The index of the file in the list of files.
Yes
type enum The type of the file path. Always file_path.

Possible values: file_path
Yes

FileSearchRanker

The ranker to use for the file search. If not specified, the auto ranker is used.

Property Value
Description The ranker to use for the file search. If not specified, the auto ranker is used.
Type string
Values auto
default_2024_08_21

FileSearchTool

A tool that searches for relevant content from uploaded files.

Name Type Description Required Default
filters ComparisonFilter or CompoundFilter A filter to apply based on file attributes. No
max_num_results integer The maximum number of results to return. This number should be between 1
and 50 inclusive.
No
ranking_options object Ranking options for search. No
└─ ranker enum The ranker to use for the file search.
Possible values: auto, default-2024-11-15
No
└─ score_threshold number The score threshold for the file search, a number between 0 and 1.
Numbers closer to 1 will attempt to return only the most relevant
results, but may return fewer results.
No 0
type enum The type of the file search tool. Always file_search.

Possible values: file_search
Yes
vector_store_ids array The IDs of the vector stores to search.
Yes
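A file_search entry for the tools array can be sketched from the table above; the vector store ID is a placeholder:

```python
# A file_search tool entry. "type" and "vector_store_ids" are required;
# the remaining fields show the documented optional knobs.
file_search_tool = {
    "type": "file_search",
    "vector_store_ids": ["vs_placeholder123"],  # hypothetical ID
    "max_num_results": 20,  # must be between 1 and 50 inclusive
    "ranking_options": {
        "ranker": "auto",
        # 0..1; closer to 1 returns only the most relevant (and fewer) results
        "score_threshold": 0.5,
    },
}

assert 1 <= file_search_tool["max_num_results"] <= 50
assert 0 <= file_search_tool["ranking_options"]["score_threshold"] <= 1
```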

FileSearchToolCall

The results of a file search tool call.

Name Type Description Required Default
id string The unique ID of the file search tool call.
Yes
queries array The queries used to search for files.
Yes
results array The results of the file search tool call.
No
status enum The status of the file search tool call. One of in_progress,
searching, completed, incomplete, or failed.

Possible values: in_progress, searching, completed, incomplete, failed
Yes
type enum The type of the file search tool call. Always file_search_call.

Possible values: file_search_call
Yes

FunctionTool

Defines a function in your own code the model can choose to call. Learn more about function calling.

Name Type Description Required Default
description string A description of the function. Used by the model to determine whether
or not to call the function.
No
name string The name of the function to call.
Yes
parameters object A JSON schema object describing the parameters of the function.
Yes
strict boolean Whether to enforce strict parameter validation. Default true.
Yes
type enum The type of the function tool. Always function.

Possible values: function
Yes
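A FunctionTool definition pairs a name with a JSON schema for its parameters. The sketch below uses a hypothetical get_weather function purely for illustration:

```python
# A FunctionTool definition: "name", "parameters", "strict", and "type"
# are required; "description" helps the model decide when to call it.
get_weather_tool = {  # hypothetical function, for illustration only
    "type": "function",
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "strict": True,  # enforce strict parameter validation
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
        "additionalProperties": False,
    },
}

assert get_weather_tool["type"] == "function"
assert "city" in get_weather_tool["parameters"]["required"]
```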

FunctionToolCall

A tool call to run a function.

Name Type Description Required Default
arguments string A JSON string of the arguments to pass to the function.
Yes
call_id string The unique ID of the function tool call generated by the model.
Yes
id string The unique ID of the function tool call.
Yes
name string The name of the function to run.
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the function tool call. Always function_call.

Possible values: function_call
Yes

FunctionToolCallOutput

The output of a function tool call.

Name Type Description Required Default
call_id string The unique ID of the function tool call generated by the model.
Yes
id string The unique ID of the function tool call output. Populated when this item
is returned via API.
No
output string A JSON string of the output of the function tool call.
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the function tool call output. Always function_call_output.

Possible values: function_call_output
Yes
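To answer a function tool call, echo back the model-generated call_id and serialize your function's result as a JSON string in output. The call_id and result below are placeholders; in practice the call_id comes from the model's function_call output item.

```python
import json

# Build a function_call_output item as described above.
result = {"temperature_c": 18, "condition": "cloudy"}  # hypothetical result
function_output_item = {
    "type": "function_call_output",
    "call_id": "call_placeholder123",  # must match the model's call_id
    "output": json.dumps(result),      # output is a JSON *string*
}

# The output round-trips back to the original result.
assert json.loads(function_output_item["output"])["temperature_c"] == 18
```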

FunctionToolCallOutputResource

Name Type Description Required Default
call_id string The unique ID of the function tool call generated by the model.
Yes
id string The unique ID of the function call tool output.
Yes
output string A JSON string of the output of the function tool call.
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the function tool call output. Always function_call_output.

Possible values: function_call_output
Yes

includable

Specify additional output data to include in the model response. Currently supported values are:

  • file_search_call.results: Include the search results of the file search tool call.
  • message.input_image.image_url: Include image urls from the input message.
  • computer_call_output.output.image_url: Include image urls from the computer call output.
Property Value
Description Specify additional output data to include in the model response. Currently
supported values are:
- file_search_call.results: Include the search results of
the file search tool call.
- message.input_image.image_url: Include image urls from the input message.
- computer_call_output.output.image_url: Include image urls from the computer call output.
Type string
Values file_search_call.results
message.input_image.image_url
computer_call_output.output.image_url
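Since include accepts only the values listed above, a client can validate the array before sending the request. A minimal sketch:

```python
# The supported includable values, copied from the list above.
ALLOWED_INCLUDES = {
    "file_search_call.results",
    "message.input_image.image_url",
    "computer_call_output.output.image_url",
}

requested = ["file_search_call.results"]

# Reject unknown values before issuing the request.
assert set(requested) <= ALLOWED_INCLUDES
```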

InputAudio

An audio input to the model.

Name Type Description Required Default
data string Base64-encoded audio data.
Yes
format enum The format of the audio data. Currently supported formats are mp3 and
wav.

Possible values: mp3, wav
Yes
type enum The type of the input item. Always input_audio.

Possible values: input_audio
Yes

InputContent

This component can be one of the following:

InputFile

A file input to the model.

Name Type Description Required Default
file_data string The content of the file to be sent to the model.
No
file_id string The ID of the file to be sent to the model.
No
filename string The name of the file to be sent to the model.
No
type enum The type of the input item. Always input_file.

Possible values: input_file
Yes

InputImage

An image input to the model.

Name Type Description Required Default
detail enum The detail level of the image to be sent to the model. One of high,
low, or auto. Defaults to auto.

Possible values: high, low, auto
Yes
file_id string The ID of the file to be sent to the model.
No
image_url string The URL of the image to be sent to the model. A fully qualified URL or
base64 encoded image in a data URL.
No
type enum The type of the input item. Always input_image.

Possible values: input_image
Yes
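An input_image item supplies either a file_id or an image_url, where image_url may be a fully qualified URL or a base64-encoded data URL. A sketch with a placeholder URL:

```python
# An input_image item per the table above. "detail" defaults to "auto";
# "high" and "low" trade fidelity against token cost.
image_item = {
    "type": "input_image",
    "detail": "auto",
    # A fully qualified URL or a base64 data URL is accepted; this one
    # is a placeholder, not a real image.
    "image_url": "https://example.com/screenshot.png",
}

assert image_item["detail"] in ("high", "low", "auto")
```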

InputItem

Discriminator for InputItem

This component uses the property type to discriminate between different types:

This component can be one of the following:

InputMessage

A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role.

Name Type Description Required Default
content InputMessageContentList A list of one or many input items to the model, containing different content
types.
Yes
role enum The role of the message input. One of user, system, or developer.

Possible values: user, system, developer
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the message input. Always set to message.

Possible values: message
No

InputMessageContentList

A list of one or many input items to the model, containing different content types.

No properties defined for this component.

InputMessageResource

Name Type Description Required Default
content InputMessageContentList A list of one or many input items to the model, containing different content
types.
Yes
id string The unique ID of the message input.
Yes
role enum The role of the message input. One of user, system, or developer.

Possible values: user, system, developer
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the message input. Always set to message.

Possible values: message
No

InputText

A text input to the model.

Name Type Description Required Default
text string The text input to the model.
Yes
type enum The type of the input item. Always input_text.

Possible values: input_text
Yes

Item

Content item used to generate a response.

This component can be one of the following:

ItemReference

An internal identifier for an item to reference.

Name Type Description Required Default
id string The ID of the item to reference.
Yes
type enum The type of item to reference. Always item_reference.

Possible values: item_reference
Yes

ItemResource

Content item used to generate a response.

This component can be one of the following:

KeyPress

A collection of keypresses the model would like to perform.

Name Type Description Required Default
keys array The combination of keys the model is requesting to be pressed. This is an
array of strings, each representing a key.
Yes
type enum Specifies the event type. For a keypress action, this property is
always set to keypress.

Possible values: keypress
Yes

Metadata

Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.

Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.

No properties defined for this component.

ModelResponseProperties

Name Type Description Required Default
metadata Metadata Set of 16 key-value pairs that can be attached to an object. This can be
useful for storing additional information about the object in a structured
format, and querying for objects via API or the dashboard.

Keys are strings with a maximum length of 64 characters. Values are strings
with a maximum length of 512 characters.
No
model string Model used to generate the responses. No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
No 1
top_p number An alternative to sampling with temperature, called nucleus sampling,
where the model considers the results of the tokens with top_p probability
mass. So 0.1 means only the tokens comprising the top 10% probability mass
are considered.

We generally recommend altering this or temperature but not both.
No 1
user string A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. No

Move

A mouse move action.

Name Type Description Required Default
type enum Specifies the event type. For a move action, this property is
always set to move.

Possible values: move
Yes
x integer The x-coordinate to move to.
Yes
y integer The y-coordinate to move to.
Yes

OutputAudio

An audio output from the model.

Name Type Description Required Default
data string Base64-encoded audio data from the model.
Yes
transcript string The transcript of the audio data from the model.
Yes
type enum The type of the output audio. Always output_audio.

Possible values: output_audio
Yes

OutputContent

This component can be one of the following:

OutputItem

Name Type Description Required Default
action ComputerAction Yes
arguments string A JSON string of the arguments to pass to the function.
Yes
call_id string An identifier used when responding to the tool call with output.
Yes
content array Reasoning text contents.
Yes
id string The unique identifier of the reasoning content.
Yes
name string The name of the function to run.
Yes
pending_safety_checks array The pending safety checks for the computer call.
Yes
queries array The queries used to search for files.
Yes
results array The results of the file search tool call.
No
role enum The role of the output message. Always assistant.

Possible values: assistant
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
Yes
type enum The type of the object. Always reasoning.

Possible values: reasoning
Yes

OutputMessage

An output message from the model.

Name Type Description Required Default
content array The content of the output message.
Yes
id string The unique ID of the output message.
Yes
role enum The role of the output message. Always assistant.

Possible values: assistant
Yes
status enum The status of the message input. One of in_progress, completed, or
incomplete. Populated when input items are returned via API.

Possible values: in_progress, completed, incomplete
Yes
type enum The type of the output message. Always message.

Possible values: message
Yes

OutputText

A text output from the model.

Name Type Description Required Default
annotations array The annotations of the text output.
Yes
text string The text output from the model.
Yes
type enum The type of the output text. Always output_text.

Possible values: output_text
Yes

Reasoning

o-series models only

Configuration options for reasoning models.

Name Type Description Required Default
effort ReasoningEffort o-series models only

Constrains effort on reasoning for
reasoning models.
Currently supported values are low, medium, and high. Reducing
reasoning effort can result in faster responses and fewer tokens used
on reasoning in a response.
Yes medium
generate_summary enum o-series models only

A summary of the reasoning performed by the model. This can be
useful for debugging and understanding the model's reasoning process.
One of concise or detailed.

Possible values: concise, detailed
No
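A reasoning configuration for an o-series model can be sketched directly from the table above:

```python
# Reasoning configuration: "effort" defaults to "medium"; lower effort
# gives faster responses with fewer reasoning tokens. "generate_summary"
# optionally returns a summary of the model's reasoning.
reasoning_config = {
    "effort": "low",                 # one of "low", "medium", "high"
    "generate_summary": "concise",   # or "detailed"
}

assert reasoning_config["effort"] in ("low", "medium", "high")
```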

ReasoningItem

A description of the chain of thought used by a reasoning model while generating a response.

Name Type Description Required Default
content array Reasoning text contents.
Yes
id string The unique identifier of the reasoning content.
Yes
status enum The status of the item. One of in_progress, completed, or
incomplete. Populated when items are returned via API.

Possible values: in_progress, completed, incomplete
No
type enum The type of the object. Always reasoning.

Possible values: reasoning
Yes

Refusal

A refusal from the model.

Name Type Description Required Default
refusal string The refusal explanation from the model.
Yes
type enum The type of the refusal. Always refusal.

Possible values: refusal
Yes

response

Name Type Description Required Default
created_at number Unix timestamp (in seconds) of when this Response was created.
Yes
error ResponseError An error object returned when the model fails to generate a Response.
Yes
id string Unique identifier for this Response.
Yes
incomplete_details object Details about why the response is incomplete.
Yes
└─ reason enum The reason why the response is incomplete.
Possible values: max_output_tokens, content_filter
No
instructions string Inserts a system (or developer) message as the first item in the model's context.

When used along with previous_response_id, the instructions from a previous
response will not be carried over to the next response. This makes it simple
to swap out system (or developer) messages in new responses.
Yes
max_output_tokens integer An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.
No
metadata Metadata Set of 16 key-value pairs that can be attached to an object. This can be
useful for storing additional information about the object in a structured
format, and querying for objects via API or the dashboard.

Keys are strings with a maximum length of 64 characters. Values are strings
with a maximum length of 512 characters.
Yes
model string Model used to generate the responses. Yes
object enum The object type of this resource - always set to response.

Possible values: response
Yes
output array An array of content items generated by the model.

- The length and order of items in the output array is dependent
on the model's response.
- Rather than accessing the first item in the output array and
assuming it's an assistant message with the content generated by
the model, you might consider using the output_text property where
supported in SDKs.
Yes
output_text string SDK-only convenience property that contains the aggregated text output
from all output_text items in the output array, if any are present.
Supported in the Python and JavaScript SDKs.
No
parallel_tool_calls boolean Whether to allow the model to run tool calls in parallel.
Yes True
previous_response_id string The unique ID of the previous response to the model. Use this to
create multi-turn conversations. Learn more about
Conversation state.
No
reasoning Reasoning o-series models only

Configuration options for
reasoning models.
No
status enum The status of the response generation. One of completed, failed,
in_progress, or incomplete.

Possible values: completed, failed, in_progress, incomplete
No
temperature number What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Yes 1
text object Configuration options for a text response from the model. Can be plain
text or structured JSON data. Learn more:
- Text inputs and outputs
- Structured Outputs
No
└─ format TextResponseFormatConfiguration An object specifying the format that the model must output.

Configuring { "type": "json_schema" } enables Structured Outputs,
which ensures the model will match your supplied JSON schema.

The default format is { "type": "text" } with no additional options.

Not recommended for gpt-4o and newer models:

Setting to { "type": "json_object" } enables the older JSON mode, which
ensures the message the model generates is valid JSON. Using json_schema
is preferred for models that support it.
No
tool_choice ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction How the model should select which tool (or tools) to use when generating
a response. See the tools parameter to see how to specify which tools
the model can call.
Yes
tools array An array of tools the model may call while generating a response. You
can specify which tool to use by setting the tool_choice parameter.

The two categories of tools you can provide the model are:

- Built-in tools: Tools that are provided by OpenAI that extend the
model's capabilities, such as file search.
- Function calls (custom tools): Functions that are defined by you,
enabling the model to call your own code. Learn more about
function calling.
Yes
top_p number An alternative to sampling with temperature, called nucleus sampling,
where the model considers the results of the tokens with top_p probability
mass. So 0.1 means only the tokens comprising the top 10% probability mass
are considered.

We generally recommend altering this or temperature but not both.
Yes 1
truncation enum The truncation strategy to use for the model response.
- auto: If the context of this response and previous ones exceeds
the model's context window size, the model will truncate the
response to fit the context window by dropping input items in the
middle of the conversation.
- disabled (default): If a model response will exceed the context window
size for a model, the request will fail with a 400 error.

Possible values: auto, disabled
No
usage ResponseUsage Represents token usage details including input tokens, output tokens,
a breakdown of output tokens, and the total tokens used.
No
user string A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. No
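As the output field notes, the first item in the output array is not guaranteed to be the assistant message. When the SDK's output_text convenience property is unavailable, the text can be collected explicitly. The response dictionary below is a stub with the shape described above, standing in for a deserialized API reply:

```python
# Stub response with the documented shape (values are placeholders).
response = {
    "id": "resp_placeholder123",
    "object": "response",
    "status": "completed",
    "output": [
        {"type": "reasoning", "id": "rs_1", "content": [], "status": "completed"},
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "id": "msg_1",
            "content": [{"type": "output_text", "text": "Hello!", "annotations": []}],
        },
    ],
}

# Aggregate every output_text part from every message item, regardless
# of where the message appears in the output array.
text = "".join(
    part["text"]
    for item in response["output"]
    if item.get("type") == "message"
    for part in item.get("content", [])
    if part.get("type") == "output_text"
)
assert text == "Hello!"
```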

ResponseAudioDeltaEvent

Emitted when there is a partial audio response.

Name Type Description Required Default
delta string A chunk of Base64 encoded response audio bytes.
Yes
type enum The type of the event. Always response.audio.delta.

Possible values: response.audio.delta
Yes

ResponseAudioDoneEvent

Emitted when the audio response is complete.

Name Type Description Required Default
type enum The type of the event. Always response.audio.done.

Possible values: response.audio.done
Yes

ResponseAudioTranscriptDeltaEvent

Emitted when there is a partial transcript of audio.

Name Type Description Required Default
delta string The partial transcript of the audio response.
Yes
type enum The type of the event. Always response.audio.transcript.delta.

Possible values: response.audio.transcript.delta
Yes

ResponseAudioTranscriptDoneEvent

Emitted when the full audio transcript is completed.

Name Type Description Required Default
type enum The type of the event. Always response.audio.transcript.done.

Possible values: response.audio.transcript.done
Yes

ResponseCodeInterpreterCallCodeDeltaEvent

Emitted when a partial code snippet is added by the code interpreter.

Name Type Description Required Default
delta string The partial code snippet added by the code interpreter.
Yes
output_index integer The index of the output item in which the code interpreter call is in progress.
Yes
type enum The type of the event. Always response.code_interpreter_call.code.delta.

Possible values: response.code_interpreter_call.code.delta
Yes

ResponseCodeInterpreterCallCodeDoneEvent

Emitted when code snippet output is finalized by the code interpreter.

Name Type Description Required Default
code string The final code snippet output by the code interpreter.
Yes
output_index integer The index of the output item in which the code interpreter call is in progress.
Yes
type enum The type of the event. Always response.code_interpreter_call.code.done.

Possible values: response.code_interpreter_call.code.done
Yes

ResponseCodeInterpreterCallCompletedEvent

Emitted when the code interpreter call is completed.

Name Type Description Required Default
code_interpreter_call CodeInterpreterToolCall A tool call to run code.
Yes
output_index integer The index of the output item in which the code interpreter call is in progress.
Yes
type enum The type of the event. Always response.code_interpreter_call.completed.

Possible values: response.code_interpreter_call.completed
Yes

ResponseCodeInterpreterCallInProgressEvent

Emitted when a code interpreter call is in progress.

Name Type Description Required Default
code_interpreter_call CodeInterpreterToolCall A tool call to run code.
Yes
output_index integer The index of the output item in which the code interpreter call is in progress.
Yes
type enum The type of the event. Always response.code_interpreter_call.in_progress.

Possible values: response.code_interpreter_call.in_progress
Yes

ResponseCodeInterpreterCallInterpretingEvent

Emitted when the code interpreter is actively interpreting the code snippet.

Name Type Description Required Default
code_interpreter_call CodeInterpreterToolCall A tool call to run code.
Yes
output_index integer The index of the output item in which the code interpreter call is in progress.
Yes
type enum The type of the event. Always response.code_interpreter_call.interpreting.

Possible values: response.code_interpreter_call.interpreting
Yes

ResponseCompletedEvent

Emitted when the model response is complete.

Name Type Description Required Default
response response Yes
type enum The type of the event. Always response.completed.

Possible values: response.completed
Yes

ResponseContentPartAddedEvent

Emitted when a new content part is added.

Name Type Description Required Default
content_index integer The index of the content part that was added.
Yes
item_id string The ID of the output item that the content part was added to.
Yes
output_index integer The index of the output item that the content part was added to.
Yes
part OutputContent Yes
type enum The type of the event. Always response.content_part.added.

Possible values: response.content_part.added
Yes

ResponseContentPartDoneEvent

Emitted when a content part is done.

Name Type Description Required Default
content_index integer The index of the content part that is done.
Yes
item_id string The ID of the output item that the content part was added to.
Yes
output_index integer The index of the output item that the content part was added to.
Yes
part OutputContent Yes
type enum The type of the event. Always response.content_part.done.

Possible values: response.content_part.done
Yes

ResponseCreatedEvent

An event that is emitted when a response is created.

Name Type Description Required Default
response response Yes
type enum The type of the event. Always response.created.

Possible values: response.created
Yes

ResponseError

An error object returned when the model fails to generate a Response.

Name Type Description Required Default
code ResponseErrorCode The error code for the response.
Yes
message string A human-readable description of the error.
Yes

ResponseErrorCode

The error code for the response.

Property Value
Description The error code for the response.
Type string
Values server_error
rate_limit_exceeded
invalid_prompt
vector_store_timeout
invalid_image
invalid_image_format
invalid_base64_image
invalid_image_url
image_too_large
image_too_small
image_parse_error
image_content_policy_violation
invalid_image_mode
image_file_too_large
unsupported_image_media_type
empty_image_file
failed_to_download_image
image_file_not_found
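When handling a ResponseError, a client will often want to distinguish codes that may succeed on retry from those that indicate a bad request. The grouping below is an illustrative assumption, not part of the API contract:

```python
# Hypothetical helper: classify which ResponseErrorCode values may be
# worth retrying. The grouping is an assumption for illustration only.
RETRYABLE_CODES = {"server_error", "rate_limit_exceeded", "vector_store_timeout"}

def is_retryable(code: str) -> bool:
    """Return True if the error code suggests a retry could succeed."""
    return code in RETRYABLE_CODES

retry_rate_limit = is_retryable("rate_limit_exceeded")
retry_bad_image = is_retryable("invalid_image")
```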

ResponseErrorEvent

Emitted when an error occurs.

Name Type Description Required Default
code string The error code.
Yes
message string The error message.
Yes
param string The error parameter.
Yes
type enum The type of the event. Always error.

Possible values: error
Yes

ResponseFailedEvent

An event that is emitted when a response fails.

Name Type Description Required Default
response response Yes
type enum The type of the event. Always response.failed.

Possible values: response.failed
Yes

ResponseFileSearchCallCompletedEvent

Emitted when a file search call is completed (results found).

Name Type Description Required Default
item_id string The ID of the output item that initiated the file search call.
Yes
output_index integer The index of the output item that initiated the file search call.
Yes
type enum The type of the event. Always response.file_search_call.completed.

Possible values: response.file_search_call.completed
Yes

ResponseFileSearchCallInProgressEvent

Emitted when a file search call is initiated.

Name Type Description Required Default
item_id string The ID of the output item that initiated the file search call.
Yes
output_index integer The index of the output item that initiated the file search call.
Yes
type enum The type of the event. Always response.file_search_call.in_progress.

Possible values: response.file_search_call.in_progress
Yes

ResponseFileSearchCallSearchingEvent

Emitted when a file search is currently searching.

Name Type Description Required Default
item_id string The ID of the output item that initiated the file search call.
Yes
output_index integer The index of the output item that the file search call is searching.
Yes
type enum The type of the event. Always response.file_search_call.searching.

Possible values: response.file_search_call.searching
Yes

ResponseFunctionCallArgumentsDeltaEvent

Emitted when there is a partial function-call arguments delta.

Name Type Description Required Default
delta string The function-call arguments delta that is added.
Yes
item_id string The ID of the output item that the function-call arguments delta is added to.
Yes
output_index integer The index of the output item that the function-call arguments delta is added to.
Yes
type enum The type of the event. Always response.function_call_arguments.delta.

Possible values: response.function_call_arguments.delta
Yes

ResponseFunctionCallArgumentsDoneEvent

Emitted when function-call arguments are finalized.

Name Type Description Required Default
arguments string The function-call arguments. Yes
item_id string The ID of the item. Yes
output_index integer The index of the output item. Yes
type enum The type of the event. Always response.function_call_arguments.done.
Possible values: response.function_call_arguments.done
Yes

ResponseInProgressEvent

Emitted when the response is in progress.

Name Type Description Required Default
response response Yes
type enum The type of the event. Always response.in_progress.

Possible values: response.in_progress
Yes

ResponseIncompleteEvent

An event that is emitted when a response finishes as incomplete.

Name Type Description Required Default
response response Yes
type enum The type of the event. Always response.incomplete.

Possible values: response.incomplete
Yes

responseItemList

A list of Response items.

Name Type Description Required Default
data array A list of items used to generate this response. Yes
first_id string The ID of the first item in the list. Yes
has_more boolean Whether there are more items available. Yes
last_id string The ID of the last item in the list. Yes
object enum The type of object returned, must be list.
Possible values: list
Yes

ResponseModalities

Output types that you would like the model to generate. Most models are capable of generating text, which is the default:

["text"]

The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:

["text", "audio"]

No properties defined for this component.
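A request that asks an audio-capable model for both output types simply lists them in the modalities array. The payload below is a sketch; the message content and other fields are placeholders:

```python
# Sketch of a request body asking gpt-4o-audio-preview for both text and
# audio output. Values other than "modalities" are illustrative placeholders.
payload = {
    "model": "gpt-4o-audio-preview",
    "modalities": ["text", "audio"],
    "messages": [{"role": "user", "content": "Say hello."}],
}
```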

ResponseModalitiesTextOnly

Output types that you would like the model to generate. Most models are capable of generating text, which is the default:

["text"]

This API will soon support other output modalities, including audio and images.

No properties defined for this component.

ResponseOutputItemAddedEvent

Emitted when a new output item is added.

Name Type Description Required Default
item OutputItem Yes
output_index integer The index of the output item that was added.
Yes
type enum The type of the event. Always response.output_item.added.

Possible values: response.output_item.added
Yes

ResponseOutputItemDoneEvent

Emitted when an output item is marked done.

Name Type Description Required Default
item OutputItem Yes
output_index integer The index of the output item that was marked done.
Yes
type enum The type of the event. Always response.output_item.done.

Possible values: response.output_item.done
Yes

ResponseProperties

Name Type Description Required Default
instructions string Inserts a system (or developer) message as the first item in the model's context.

When used along with previous_response_id, the instructions from a previous
response will not be carried over to the next response. This makes it simple
to swap out system (or developer) messages in new responses.
No
max_output_tokens integer An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.
No
previous_response_id string The unique ID of the previous response to the model. Use this to
create multi-turn conversations. Learn more about
Conversation state.
No
reasoning Reasoning o-series models only

Configuration options for
reasoning models.
No
text object Configuration options for a text response from the model. Can be plain
text or structured JSON data. Learn more:
- Text inputs and outputs
- Structured Outputs
No
└─ format TextResponseFormatConfiguration An object specifying the format that the model must output.

Configuring { "type": "json_schema" } enables Structured Outputs,
which ensures the model will match your supplied JSON schema.

The default format is { "type": "text" } with no additional options.

Not recommended for gpt-4o and newer models:

Setting to { "type": "json_object" } enables the older JSON mode, which
ensures the message the model generates is valid JSON. Using json_schema
is preferred for models that support it.
No
tool_choice ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction How the model should select which tool (or tools) to use when generating
a response. See the tools parameter to see how to specify which tools
the model can call.
No
tools array An array of tools the model may call while generating a response. You
can specify which tool to use by setting the tool_choice parameter.

The two categories of tools you can provide the model are:

- Built-in tools: Tools that are provided by OpenAI that extend the
model's capabilities, such as file search.
- Function calls (custom tools): Functions that are defined by you,
enabling the model to call your own code. Learn more about
function calling.
No
truncation enum The truncation strategy to use for the model response.
- auto: If the context of this response and previous ones exceeds
the model's context window size, the model will truncate the
response to fit the context window by dropping input items in the
middle of the conversation.
- disabled (default): If a model response will exceed the context window
size for a model, the request will fail with a 400 error.

Possible values: auto, disabled
No
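The ResponseProperties fields above combine into a request body like the following sketch, which chains a new turn onto a prior response and caps output tokens. The ID and text values are placeholders, not verified values:

```python
# Minimal sketch of a Responses API request body using fields from the
# ResponseProperties table. All values are illustrative placeholders.
request_body = {
    "previous_response_id": "resp_123",           # ID returned by the prior turn
    "instructions": "You are a terse assistant.",  # replaces, not inherits, prior instructions
    "max_output_tokens": 256,
    "truncation": "auto",                          # drop mid-conversation items if context overflows
    "input": "Continue the story.",
}
```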

ResponseRefusalDeltaEvent

Emitted when there is a partial refusal text.

Name Type Description Required Default
content_index integer The index of the content part that the refusal text is added to.
Yes
delta string The refusal text that is added.
Yes
item_id string The ID of the output item that the refusal text is added to.
Yes
output_index integer The index of the output item that the refusal text is added to.
Yes
type enum The type of the event. Always response.refusal.delta.

Possible values: response.refusal.delta
Yes

ResponseRefusalDoneEvent

Emitted when refusal text is finalized.

Name Type Description Required Default
content_index integer The index of the content part that the refusal text is finalized.
Yes
item_id string The ID of the output item that the refusal text is finalized.
Yes
output_index integer The index of the output item that the refusal text is finalized.
Yes
refusal string The refusal text that is finalized.
Yes
type enum The type of the event. Always response.refusal.done.

Possible values: response.refusal.done
Yes

responseStreamEvent

Name Type Description Required Default
annotation Annotation Yes
annotation_index integer The index of the annotation that was added.
Yes
arguments string The function-call arguments. Yes
code string The error code.
Yes
code_interpreter_call CodeInterpreterToolCall A tool call to run code.
Yes
content_index integer The index of the content part that the text content is finalized.
Yes
delta string The text delta that was added.
Yes
item OutputItem The output item that was marked done.
Yes
item_id string The ID of the output item that the text content is finalized.
Yes
message string The error message.
Yes
output_index integer The index of the output item that the text content is finalized.
Yes
param string The error parameter.
Yes
part OutputContent The content part that is done.
Yes
refusal string The refusal text that is finalized.
Yes
response response The response that was incomplete.
Yes
text string The text content that is finalized.
Yes
type enum The type of the event. Always response.output_text.done.

Possible values: response.output_text.done
Yes

ResponseTextAnnotationDeltaEvent

Emitted when a text annotation is added.

Name Type Description Required Default
annotation Annotation Yes
annotation_index integer The index of the annotation that was added.
Yes
content_index integer The index of the content part that the text annotation was added to.
Yes
item_id string The ID of the output item that the text annotation was added to.
Yes
output_index integer The index of the output item that the text annotation was added to.
Yes
type enum The type of the event. Always response.output_text.annotation.added.

Possible values: response.output_text.annotation.added
Yes

ResponseTextDeltaEvent

Emitted when there is an additional text delta.

Name Type Description Required Default
content_index integer The index of the content part that the text delta was added to.
Yes
delta string The text delta that was added.
Yes
item_id string The ID of the output item that the text delta was added to.
Yes
output_index integer The index of the output item that the text delta was added to.
Yes
type enum The type of the event. Always response.output_text.delta.

Possible values: response.output_text.delta
Yes

ResponseTextDoneEvent

Emitted when text content is finalized.

Name Type Description Required Default
content_index integer The index of the content part that the text content is finalized.
Yes
item_id string The ID of the output item that the text content is finalized.
Yes
output_index integer The index of the output item that the text content is finalized.
Yes
text string The text content that is finalized.
Yes
type enum The type of the event. Always response.output_text.done.

Possible values: response.output_text.done
Yes
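A streaming client typically dispatches on the event's type field, buffering response.output_text.delta fragments until the matching response.output_text.done event supplies the final text. The sketch below uses fabricated sample events:

```python
# Sketch: accumulate text deltas from a stream of Responses API events.
# The sample events are fabricated for illustration.
def accumulate_text(events):
    buffer = []
    for event in events:
        if event["type"] == "response.output_text.delta":
            buffer.append(event["delta"])
        elif event["type"] == "response.output_text.done":
            return event["text"]  # the done event carries the finalized text
    return "".join(buffer)

sample = [
    {"type": "response.output_text.delta", "delta": "Hel"},
    {"type": "response.output_text.delta", "delta": "lo"},
    {"type": "response.output_text.done", "text": "Hello"},
]
final_text = accumulate_text(sample)
```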

ResponseUsage

Represents token usage details including input tokens, output tokens, a breakdown of output tokens, and the total tokens used.

Name Type Description Required Default
input_tokens integer The number of input tokens. Yes
output_tokens integer The number of output tokens. Yes
output_tokens_details object A detailed breakdown of the output tokens. Yes
└─ reasoning_tokens integer The number of reasoning tokens. No
total_tokens integer The total number of tokens used. Yes
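In a ResponseUsage object, total_tokens is typically the sum of input and output tokens, with reasoning tokens counted inside output_tokens. A consistency check might look like this sketch (the numbers are made up):

```python
# Sketch of ResponseUsage arithmetic; the values are fabricated.
usage = {
    "input_tokens": 120,
    "output_tokens": 80,
    "output_tokens_details": {"reasoning_tokens": 30},  # included in output_tokens
    "total_tokens": 200,
}
consistent = usage["total_tokens"] == usage["input_tokens"] + usage["output_tokens"]
```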

Screenshot

A screenshot action.

Name Type Description Required Default
type enum Specifies the event type. For a screenshot action, this property is
always set to screenshot.

Possible values: screenshot
Yes

Scroll

A scroll action.

Name Type Description Required Default
scroll_x integer The horizontal scroll distance.
Yes
scroll_y integer The vertical scroll distance.
Yes
type enum Specifies the event type. For a scroll action, this property is
always set to scroll.

Possible values: scroll
Yes
x integer The x-coordinate where the scroll occurred.
Yes
y integer The y-coordinate where the scroll occurred.
Yes

StopConfiguration

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

This component can be one of the following:

TextResponseFormatConfiguration

An object specifying the format that the model must output.

Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema.

The default format is { "type": "text" } with no additional options.

Not recommended for gpt-4o and newer models:

Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.

This component can be one of the following:

TextResponseFormatJsonSchema

JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.

Name Type Description Required Default
description string A description of what the response format is for, used by the model to
determine how to respond in the format.
No
name string The name of the response format. Must be a-z, A-Z, 0-9, or contain
underscores and dashes, with a maximum length of 64.
No
schema ResponseFormatJsonSchemaSchema The schema for the response format, described as a JSON Schema object. Yes
strict boolean Whether to enable strict schema adherence when generating the output.
If set to true, the model will always follow the exact schema defined
in the schema field. Only a subset of JSON Schema is supported when
strict is true.
No False
type enum The type of response format being defined. Always json_schema.
Possible values: json_schema
Yes
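Putting the fields above together, a json_schema response format with strict adherence looks like the following sketch. The schema itself is a made-up example:

```python
# Sketch of a TextResponseFormatJsonSchema object enabling Structured
# Outputs with strict schema adherence. The schema is illustrative.
response_format = {
    "type": "json_schema",
    "name": "weather_report",   # a-z, A-Z, 0-9, underscores, dashes; max 64 chars
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}, "temp_c": {"type": "number"}},
        "required": ["city", "temp_c"],
        "additionalProperties": False,
    },
}
```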

Tool

This component can be one of the following:

ToolChoiceFunction

Use this option to force the model to call a specific function.

Name Type Description Required Default
name string The name of the function to call. Yes
type enum For function calling, the type is always function.
Possible values: function
Yes

ToolChoiceOptions

Controls which (if any) tool is called by the model.

none means the model will not call any tool and instead generates a message.

auto means the model can pick between generating a message or calling one or more tools.

required means the model must call one or more tools.

Property Value
Description Controls which (if any) tool is called by the model.

none means the model will not call any tool and instead generates a message.

auto means the model can pick between generating a message or calling one or
more tools.

required means the model must call one or more tools.
Type string
Values none
auto
required

ToolChoiceTypes

Indicates that the model should use a built-in tool to generate a response.

Name Type Description Required Default
type enum The type of hosted tool the model should use.

Allowed values are:
- file_search
- computer_use_preview

Possible values: file_search, computer_use_preview
Yes
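The three shapes a tool_choice value can take, per ToolChoiceOptions, ToolChoiceTypes, and ToolChoiceFunction above, can be sketched as follows (the function name is a placeholder):

```python
# The three tool_choice shapes; "get_weather" is an illustrative name.
tool_choice_auto = "auto"                                     # model decides (ToolChoiceOptions)
tool_choice_builtin = {"type": "file_search"}                 # force a hosted tool (ToolChoiceTypes)
tool_choice_fn = {"type": "function", "name": "get_weather"}  # force a function (ToolChoiceFunction)
```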

Type

An action to type in text.

Name Type Description Required Default
text string The text to type.
Yes
type enum Specifies the event type. For a type action, this property is
always set to type.

Possible values: type
Yes

UpdateVectorStoreFileAttributesRequest

Name Type Description Required Default
attributes VectorStoreFileAttributes Set of 16 key-value pairs that can be attached to an object. This can be
useful for storing additional information about the object in a structured
format, and querying for objects via API or the dashboard. Keys are strings
with a maximum length of 64 characters. Values are strings with a maximum
length of 512 characters, booleans, or numbers.
Yes

UrlCitation

A citation for a web resource used to generate a model response.

Name Type Description Required Default
end_index integer The index of the last character of the URL citation in the message.
Yes
start_index integer The index of the first character of the URL citation in the message.
Yes
title string The title of the web resource.
Yes
type enum The type of the URL citation. Always url_citation.

Possible values: url_citation
Yes
url string The URL of the web resource.
Yes
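The start_index and end_index of a UrlCitation locate the cited span inside the message text, so slicing the message with them recovers the cited text. A sketch with fabricated values:

```python
# Sketch: locate a UrlCitation's span inside the message text.
# The message, indices, and URL are fabricated for illustration.
message_text = "See the Azure docs for details."
citation = {
    "type": "url_citation",
    "start_index": 8,
    "end_index": 18,
    "title": "Azure docs",
    "url": "https://learn.microsoft.com",
}
cited_span = message_text[citation["start_index"]:citation["end_index"]]
```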

VectorStoreFileAttributes

Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.

No properties defined for this component.
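The documented limits (at most 16 pairs, string keys up to 64 characters, values that are strings up to 512 characters, booleans, or numbers) can be checked client-side before upload. This validator is an illustrative sketch, not part of the API:

```python
# Sketch: validate a dict against the documented
# VectorStoreFileAttributes limits before sending it.
def valid_attributes(attrs: dict) -> bool:
    if len(attrs) > 16:
        return False
    for key, value in attrs.items():
        if not isinstance(key, str) or len(key) > 64:
            return False
        if isinstance(value, str):
            if len(value) > 512:
                return False
        elif not isinstance(value, (bool, int, float)):
            return False
    return True

ok = valid_attributes({"author": "jdoe", "page_count": 12, "draft": False})
too_long = valid_attributes({"k" * 65: "x"})  # key exceeds 64 characters
```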

VectorStoreFileContentResponse

Represents the parsed content of a vector store file.

Name Type Description Required Default
data array Parsed content of the file. Yes
has_more boolean Indicates if there are more content pages to fetch. Yes
next_page string The token for the next page, if any. Yes
object enum The object type, which is always vector_store.file_content.page
Possible values: vector_store.file_content.page
Yes

VectorStoreSearchRequest

Name Type Description Required Default
filters ComparisonFilter or CompoundFilter A filter to apply based on file attributes. No
max_num_results integer The maximum number of results to return. This number should be between 1 and 50 inclusive. No 10
query string or array A query string for a search Yes
ranking_options object Ranking options for search. No
└─ ranker enum
Possible values: auto, default-2024-11-15
No
└─ score_threshold number No 0
rewrite_query boolean Whether to rewrite the natural language query for vector search. No False
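Combining the fields above, a search request body might look like the following sketch; the query and threshold are placeholders:

```python
# Sketch of a VectorStoreSearchRequest body; values are illustrative.
search_request = {
    "query": "quarterly revenue",
    "max_num_results": 5,            # between 1 and 50 inclusive; defaults to 10
    "rewrite_query": True,           # let the service rewrite the query for vector search
    "ranking_options": {"ranker": "auto", "score_threshold": 0.5},
}
```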

VectorStoreSearchResultContentObject

Name Type Description Required Default
text string The text content returned from search. Yes
type enum The type of content.
Possible values: text
Yes

VectorStoreSearchResultItem

Name Type Description Required Default
attributes VectorStoreFileAttributes Set of 16 key-value pairs that can be attached to an object. This can be
useful for storing additional information about the object in a structured
format, and querying for objects via API or the dashboard. Keys are strings
with a maximum length of 64 characters. Values are strings with a maximum
length of 512 characters, booleans, or numbers.
Yes
content array Content chunks from the file. Yes
file_id string The ID of the vector store file. Yes
filename string The name of the vector store file. Yes
score number The similarity score for the result. Yes

VectorStoreSearchResultsPage

Name Type Description Required Default
data array The list of search result items. Yes
has_more boolean Indicates if there are more results to fetch. Yes
next_page string The token for the next page, if any. Yes
object enum The object type, which is always vector_store.search_results.page
Possible values: vector_store.search_results.page
Yes
search_query array Yes

Wait

A wait action.

Name Type Description Required Default
type enum Specifies the event type. For a wait action, this property is
always set to wait.

Possible values: wait
Yes

ReasoningEffort

o-series models only

Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

Property Value
Description o-series models only

Constrains effort on reasoning for
reasoning models.
Currently supported values are low, medium, and high. Reducing
reasoning effort can result in faster responses and fewer tokens used
on reasoning in a response.
Type string
Default medium
Values low
medium
high
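In a request body, reasoning effort is set inside the reasoning configuration object for o-series models. The exact field name ("effort") is assumed here for illustration:

```python
# Sketch of a request fragment setting reasoning effort; "effort" as the
# field name inside the reasoning object is an assumption.
reasoning_config = {"reasoning": {"effort": "low"}}  # "medium" is the documented default
```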

errorEvent

Emitted when an error occurs. This can happen due to an internal server error or a timeout.

Name Type Description Required Default
data error Yes
event string Yes

event Enum: ErrorEventEnum

Value Description
error

doneEvent

Occurs when a stream ends.

Name Type Description Required Default
data string Yes
event string Yes

event Enum: DoneEventEnum

Value Description
done

data Enum: DoneEventDataEnum

Value Description
[DONE]
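Because the stream terminates with a literal data: [DONE] line, a server-sent-events reader can detect the end of the stream with a simple comparison, as in this sketch:

```python
# Sketch: detect the terminal SSE line of a streaming response.
def is_stream_done(sse_line: str) -> bool:
    return sse_line.strip() == "data: [DONE]"

done = is_stream_done("data: [DONE]")
not_done = is_stream_done('data: {"type": "response.completed"}')
```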

Next steps

Learn about Models and fine-tuning with the REST API, and learn more about the underlying models that power Azure OpenAI.