Azure AI Foundry Inference (Preview)

This connector is built on a first-party Swagger specification for Azure AI Foundry models. It currently supports only the Chat Completions endpoint.
This connector is available in the following products and regions:
Service | Class | Regions |
---|---|---|
Logic Apps | Standard | All Logic Apps regions except the following: Azure Government regions, Azure China regions, US Department of Defense (DoD) |
Contact | |
---|---|
Name | Microsoft |
URL | Microsoft LogicApps Support |

Connector Metadata | |
---|---|
Publisher | Microsoft |
Creating a connection
The connector supports the following authentication types:
Name | Description | Applicable regions | Shareable |
---|---|---|---|
Default | Parameters for creating a connection. | All regions | Shareable |
Default
Applicable: All regions
Parameters for creating a connection.
This is a shareable connection. If the power app is shared with another user, the connection is shared as well. For more information, see Connectors overview for canvas apps - Power Apps | Microsoft Docs.
Name | Type | Description | Required |
---|---|---|---|
API Key | securestring | The API key for this model inference endpoint. | True |
Target Uri | string | The inference endpoint for the Foundry model. | True |
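
As a rough illustration of what these two connection values correspond to at the HTTP level, here is a minimal sketch. The URI shape and the `api-key` header name are assumptions (some Azure-hosted endpoints expect `Authorization: Bearer <key>` instead), so match them to your Foundry deployment.

```python
import requests

# Minimal sketch, not the connector's internals: how the connection
# parameters above might map onto a raw HTTP call. The "api-key" header
# is an assumption -- some endpoints expect "Authorization: Bearer <key>".
target_uri = "https://<your-endpoint>/chat/completions"  # "Target Uri" parameter
headers = {
    "api-key": "<your-api-key>",   # "API Key" parameter (keep it secret)
    "Content-Type": "application/json",
}
session = requests.Session()
session.headers.update(headers)
```
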
Throttling Limits
Name | Calls | Renewal Period |
---|---|---|
API calls per connection | 5000 | 60 seconds |
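
If you call the endpoint directly rather than through the connector, you may want client-side pacing to stay under this limit. Below is an illustrative sketch that spaces calls to at most 5000 per 60-second period; the service enforces the real limit server-side.

```python
import time

# Illustrative client-side pacer for the documented throttle of
# 5000 calls per connection per 60-second renewal period.
class Pacer:
    def __init__(self, max_calls: int = 5000, period_s: float = 60.0):
        # Space calls evenly: 60 s / 5000 calls = 12 ms between calls.
        self.min_interval = period_s / max_calls
        self.last = 0.0

    def wait(self) -> None:
        now = time.monotonic()
        sleep_for = self.last + self.min_interval - now
        if sleep_for > 0:
            time.sleep(sleep_for)
        self.last = time.monotonic()

pacer = Pacer()
pacer.wait()  # call before each request
```
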
Actions
Action | Description |
---|---|
Create a chat completion | Generates a completion for a conversation, based on the provided messages and model configuration. |
Create a chat completion
Generates a completion for a conversation, based on the provided messages and model configuration.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
API version | api-version | | string | The version of the API to use for this model's chat completions endpoint. |
role | role | | string | The role of the sender of the message (e.g., 'user', 'assistant'). |
content | content | | object | The content of the message. |
temperature | temperature | | float | The sampling temperature to use, between 0 and 1. Higher values make the output more random. |
top_p | top_p | | float | The top-p sampling parameter, between 0 and 1. |
max_tokens | max_tokens | | integer | The maximum number of tokens to generate in the response. |
model | model | | string | The model deployment name. |
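
To make the table concrete, here is a hedged sketch of a request assembled from these parameters. The placeholder values, the `api-key` header, and passing `api-version` as a query string are assumptions based on common Azure endpoint conventions; verify them against your deployment.

```python
import requests

# Illustrative request built from the parameter table above. Placeholders
# and the api-version/header conventions are assumptions, not guarantees.
target_uri = "https://<your-endpoint>/chat/completions"
payload = {
    "model": "<deployment-name>",  # model: the model deployment name
    "messages": [
        {"role": "user", "content": "Summarize this ticket in one line."}
    ],
    "temperature": 0.2,   # 0-1; higher values make output more random
    "top_p": 0.9,         # top-p sampling parameter, 0-1
    "max_tokens": 256,    # cap on tokens generated in the response
}

resp = requests.post(
    target_uri,
    params={"api-version": "<api-version>"},  # API version parameter
    headers={"api-key": "<your-api-key>", "Content-Type": "application/json"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
```
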
Returns
Body: ChatCompletionResponse
Definitions
Choice
Name | Path | Type | Description |
---|---|---|---|
content_filter_results | content_filter_results | object | Results from the content filter applied to the response. |
finish_reason | finish_reason | string | The reason the model stopped generating further tokens. Possible values include 'stop', 'length', 'content_filter', etc. |
index | index | integer | The index of this choice within the generated set of completions. |
logprobs | logprobs | string | Log probabilities associated with each token in the response (if requested). |
content | message.content | string | The content of the generated message in the conversation. This is the response to the user's natural-language query. |
refusal | message.refusal | string | If the model refuses to generate a message, this field describes the refusal. |
role | message.role | string | The role of the sender of the message (e.g., 'user', 'assistant'). |
PromptFilterResult
Name | Path | Type | Description |
---|---|---|---|
prompt_index | prompt_index | integer | The index of the prompt in the original input. |
content_filter_results | content_filter_results | object | The content filter metadata applied to the prompt. |
CompletionTokensDetails
Details about the token usage for completion.
Name | Path | Type | Description |
---|---|---|---|
accepted_prediction_tokens | accepted_prediction_tokens | integer | The number of tokens accepted as valid predictions for the response. |
reasoning_tokens | reasoning_tokens | integer | The number of tokens used for the model's reasoning process. |
rejected_prediction_tokens | rejected_prediction_tokens | integer | The number of tokens rejected during the prediction process. |
PromptTokensDetails
Details about the tokens used in the prompt.
Name | Path | Type | Description |
---|---|---|---|
cached_tokens | cached_tokens | integer | The number of tokens that were cached and reused for the prompt. |
Usage
Token usage details for the request, including both prompt and completion tokens.
Name | Path | Type | Description |
---|---|---|---|
completion_tokens | completion_tokens | integer | The number of tokens consumed by the completion. |
completion_tokens_details | completion_tokens_details | CompletionTokensDetails | Details about the token usage for completion. |
prompt_tokens | prompt_tokens | integer | The number of tokens consumed by the prompt. |
prompt_tokens_details | prompt_tokens_details | PromptTokensDetails | Details about the tokens used in the prompt. |
total_tokens | total_tokens | integer | The total number of tokens consumed by the entire request (prompt + completion). |
ChatCompletionResponse
Name | Path | Type | Description |
---|---|---|---|
choices | choices | array of Choice | The list of generated completions for the given prompt. |
id | id | string | A unique identifier for the chat completion request. |
model | model | string | The model used for generating the chat completion. |
prompt_filter_results | prompt_filter_results | array of PromptFilterResult | The content filter results for each prompt in the request. |
usage | usage | Usage | Token usage details for the request, including both prompt and completion tokens. |
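
Putting the definitions together, the sketch below walks a payload shaped like ChatCompletionResponse (all literal values are invented for illustration) and reads the fields most callers need.

```python
# A sample payload shaped like ChatCompletionResponse above; the values
# are invented for illustration only.
data = {
    "id": "chatcmpl-123",
    "model": "<deployment-name>",
    "choices": [{
        "index": 0,
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "Hello!", "refusal": None},
    }],
    "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
}

first = data["choices"][0]            # Choice
answer = first["message"]["content"]  # message.content: the generated reply
finish = first["finish_reason"]       # 'stop', 'length', 'content_filter', ...
usage = data["usage"]                 # Usage

# total_tokens is the sum of prompt and completion tokens.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
print(f"{answer!r} (finish_reason={finish}, total_tokens={usage['total_tokens']})")
```
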