Azure AI Foundry Inference (Preview)

This is a first-party Swagger specification for Azure AI Foundry models, specifically designed for Chat Completion tasks. It currently supports only the Chat Completions endpoint.

This connector is available in the following products and regions:

| Service | Class | Regions |
|---------|-------|---------|
| Logic Apps | Standard | All Logic Apps regions except: Azure Government regions, Azure China regions, US Department of Defense (DoD) |
Contact

Name: Microsoft
URL: Microsoft LogicApps Support

Connector Metadata

Publisher: Microsoft

Creating a connection

The connector supports the following authentication types:

| Name | Description | Regions | Shareable |
|------|-------------|---------|-----------|
| Default | Parameters for creating connection. | All regions | Yes |

Default

Applicable: All regions

Parameters for creating connection.

This is a shareable connection. If the Power App is shared with another user, the connection is shared as well. For more information, see Connectors overview for canvas apps - Power Apps | Microsoft Docs.

| Name | Type | Description | Required |
|------|------|-------------|----------|
| API Key | securestring | The API Key for this Model Inference Endpoint | True |
| Target Uri | string | Specify the inference endpoint for the Foundry model | True |
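As a sketch of how the two connection parameters are used, the snippet below builds an authenticated HTTP request against the inference endpoint. The URL and header scheme shown are illustrative assumptions, not values from this document; some deployments expect an `api-key` header rather than `Authorization: Bearer`, so check your endpoint's documentation.

```python
import json
from urllib import request

# Illustrative placeholders -- substitute the Target Uri and API Key
# you supplied when creating the connection.
TARGET_URI = "https://example.services.ai.azure.com/models/chat/completions"
API_KEY = "<your-api-key>"

def build_request(payload: dict) -> request.Request:
    """Build a POST request carrying the API Key connection parameter.

    Assumes bearer-style auth; some deployments use an 'api-key'
    header instead.
    """
    return request.Request(
        TARGET_URI,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )
```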

Throttling Limits

| Name | Calls | Renewal Period |
|------|-------|----------------|
| API calls per connection | 5000 | 60 seconds |
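The limit above can also be guarded client-side to avoid hitting the service's throttle. The sliding-window limiter below is an illustrative sketch, not part of the connector; the service enforces the real limit regardless.

```python
import time
from collections import deque

class ConnectionThrottle:
    """Client-side guard for the documented limit:
    5000 calls per 60 seconds per connection."""

    def __init__(self, max_calls: int = 5000, period: float = 60.0):
        self.max_calls = max_calls
        self.period = period
        self.calls: deque = deque()  # monotonic timestamps of recent calls

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the renewal period.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Wait until the oldest call leaves the window.
            time.sleep(self.period - (now - self.calls[0]))
        self.calls.append(time.monotonic())
```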

Actions

Create a chat completion

Generates a completion for a conversation, based on the provided messages and model configuration.

Parameters

| Name | Key | Required | Type | Description |
|------|-----|----------|------|-------------|
| API version | api-version | | string | The version of the API to use for this model chat completions endpoint. |
| role | role | | string | The role of the sender of the message (e.g., 'user', 'assistant'). |
| content | content | | object | The content of the message. |
| temperature | temperature | | float | The sampling temperature to use, between 0 and 1. Higher values make the output more random. |
| top_p | top_p | | float | The top-p sampling parameter, between 0 and 1. |
| max_tokens | max_tokens | | integer | The maximum number of tokens to generate in the response. |
| model | model | | string | Model Deployment Name. |
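Putting the parameters above together, a request might look like the sketch below. The endpoint URL, deployment name, and api-version value are illustrative placeholders, not values from this document.

```python
# The api-version parameter travels as a query string appended to the
# Target Uri; the placeholder below is illustrative only.
API_VERSION = "<api-version>"
url = (
    "https://example.services.ai.azure.com/models/chat/completions"
    f"?api-version={API_VERSION}"
)

# Request body fields match the parameter table above; values are examples.
payload = {
    "model": "my-deployment",  # Model Deployment Name (illustrative)
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the benefits of caching."},
    ],
    "temperature": 0.7,   # 0-1; higher values make output more random
    "top_p": 0.95,        # nucleus sampling cutoff, 0-1
    "max_tokens": 256,    # cap on generated tokens
}
```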

Returns

Body: ChatCompletionResponse

Definitions

Choice

| Name | Path | Type | Description |
|------|------|------|-------------|
| content_filter_results | content_filter_results | object | Results from the content filter applied to the response. |
| finish_reason | finish_reason | string | The reason the model stopped generating further tokens. Possible values include 'stop', 'length', 'content_filter', etc. |
| index | index | integer | The index of this choice within the generated set of completions. |
| logprobs | logprobs | string | Log probabilities associated with each token in the response (if requested). |
| content | message.content | string | The content of the generated message in the conversation. This is the response to the user's natural-language query. |
| refusal | message.refusal | string | If the model refuses to generate a message, this field describes the refusal. |
| role | message.role | string | The role of the sender of the message (e.g., 'user', 'assistant'). |

PromptFilterResult

| Name | Path | Type | Description |
|------|------|------|-------------|
| prompt_index | prompt_index | integer | The index of the prompt in the original input. |
| content_filter_results | content_filter_results | object | The content filter metadata applied to the prompt. |

CompletionTokensDetails

Details about the token usage for completion.

| Name | Path | Type | Description |
|------|------|------|-------------|
| accepted_prediction_tokens | accepted_prediction_tokens | integer | The number of tokens accepted as valid predictions for the response. |
| reasoning_tokens | reasoning_tokens | integer | The number of tokens used for the model's reasoning process. |
| rejected_prediction_tokens | rejected_prediction_tokens | integer | The number of tokens rejected during the prediction process. |

PromptTokensDetails

Details about the tokens used in the prompt.

| Name | Path | Type | Description |
|------|------|------|-------------|
| cached_tokens | cached_tokens | integer | The number of tokens that were cached and reused for the prompt. |

Usage

Token usage details for the request, including both prompt and completion tokens.

| Name | Path | Type | Description |
|------|------|------|-------------|
| completion_tokens | completion_tokens | integer | The number of tokens consumed by the completion. |
| completion_tokens_details | completion_tokens_details | CompletionTokensDetails | Details about the token usage for completion. |
| prompt_tokens | prompt_tokens | integer | The number of tokens consumed by the prompt. |
| prompt_tokens_details | prompt_tokens_details | PromptTokensDetails | Details about the tokens used in the prompt. |
| total_tokens | total_tokens | integer | The total number of tokens consumed by the entire request (prompt + completion). |

ChatCompletionResponse

| Name | Path | Type | Description |
|------|------|------|-------------|
| choices | choices | array of Choice | The list of generated completions for the given prompt. |
| id | id | string | A unique identifier for the chat completion request. |
| model | model | string | The model used for generating the chat completion. |
| prompt_filter_results | prompt_filter_results | array of PromptFilterResult | The content filter results for each prompt in the request. |
| usage | usage | Usage | Token usage details for the request, including both prompt and completion tokens. |
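To show how the ChatCompletionResponse fields fit together, here is a hand-built sample response (every value is invented for illustration, not taken from a real call) and a small helper that extracts the assistant's reply:

```python
# Sample shaped per the ChatCompletionResponse, Choice, and Usage
# definitions above; all values are illustrative.
sample = {
    "id": "chatcmpl-123",
    "model": "my-deployment",
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "logprobs": None,
            "message": {"role": "assistant", "content": "Hello!", "refusal": None},
        }
    ],
    "prompt_filter_results": [],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}

def first_reply(response: dict) -> str:
    """Return the assistant text from the first choice.

    A finish_reason of 'length' means the reply was cut off at
    max_tokens and may be incomplete.
    """
    choice = response["choices"][0]
    if choice["finish_reason"] == "length":
        print("warning: reply truncated at max_tokens")
    return choice["message"]["content"]
```

Here `first_reply(sample)` returns the message.content of the first choice, which is the field holding the response to the user's query.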