Tutorial: Generate images using serverless GPUs in Azure Container Apps

In this article, you learn how to create a container app that uses serverless GPUs to power an AI application.

With serverless GPUs, you have direct access to GPU compute resources without having to do manual infrastructure configuration such as installing drivers. All you have to do is deploy your AI model's image.

In this tutorial, you:

  • Create a new container app and environment
  • Configure the environment to use serverless GPUs
  • Deploy your app to Azure Container Apps
  • Use the new serverless GPU-enabled application
  • Enable artifact streaming to reduce GPU cold start

Prerequisites

  • Azure account: You need an Azure account with an active subscription. If you don't have one, you can create one for free.
  • Access to serverless GPUs: Access to GPUs is available only after you request GPU quotas. You can submit your GPU quota request via a customer support case.
  • Azure CLI: Install the Azure CLI or upgrade to the latest version.
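
If you plan to follow the Azure CLI steps later in this tutorial, the following sketch shows one way to confirm your CLI setup first. The containerapp extension and the Microsoft.App provider registration are assumptions about a typical fresh setup; both might already be in place on your machine.

# Confirm the Azure CLI is installed and up to date.
az --version
az upgrade

# Sign in to your Azure subscription.
az login

# The Container Apps commands come from the containerapp extension;
# --upgrade also refreshes an existing install.
az extension add --name containerapp --upgrade

# Register the resource provider used by Azure Container Apps, if it isn't already registered.
az provider register --namespace Microsoft.App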

Create your container app

  1. Go to the Azure portal and search for and select Container Apps.

  2. Select Create and then select Container App.

  3. In the Basics window, enter the following values into each section.

    Under Project details, enter the following values:

    • Subscription: Select your Azure subscription.
    • Resource group: Select Create new and enter my-gpu-demo-group.
    • Container app name: Enter my-gpu-demo-app.
    • Deployment source: Select Container image.

    Under Container Apps environment, enter the following values:

    • Region: Select Sweden Central. For more supported regions, refer to Using serverless GPUs in Azure.
    • Container Apps environment: Select Create new.

    In the Create Container Apps environment window, enter the following values:

    • Environment name: Enter my-gpu-demo-env.

    Select Create.

    Select Next: Container >.

  4. In the Container window, enter the following values:

    • Name: Enter my-gpu-demo-container.
    • Image source: Select Docker Hub or other registries.
    • Image type: Select public.
    • Registry login server: Enter mcr.microsoft.com.
    • Image and tag: Enter k8se/gpu-quickstart:latest.
    • Workload profile: Select Consumption - Up to 4 vCPUs, 8 GiB memory.
    • GPU: Select the checkbox.
    • GPU Type: Select Consumption-GPU-NC8as-T4 - Up to 8 vCPUs, 56 GiB memory, and select the link to add the profile to your environment.

    Select Next: Ingress >.

  5. In the Ingress window, enter the following values:

    • Ingress: Select the Enabled checkbox.
    • Ingress traffic: Select the Accepting traffic from anywhere radio button.
    • Target port: Enter 80.
  6. Select Review + create.

  7. Select Create.

  8. Wait a few moments for the deployment to complete and then select Go to resource.

    This process can take up to five minutes to complete.

Use your GPU app

From the Overview window, select the Application Url link to open the web app front end in your browser and use the GPU application.

Note

  • To achieve the best performance of your GPU apps, follow the steps to improve cold start for your serverless GPUs.
  • When there are multiple containers in your application, the first container gets access to the GPU.

Create environment variables

Define the following environment variables. Before you run these commands, replace the <PLACEHOLDERS> with your own values.

RESOURCE_GROUP="<RESOURCE_GROUP>"
ENVIRONMENT_NAME="<ENVIRONMENT_NAME>"
LOCATION="swedencentral"
CONTAINER_APP_NAME="<CONTAINER_APP_NAME>"
CONTAINER_IMAGE="mcr.microsoft.com/k8se/gpu-quickstart:latest"
WORKLOAD_PROFILE_NAME="NC8as-T4"
WORKLOAD_PROFILE_TYPE="Consumption-GPU-NC8as-T4"
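
Optionally, before you create the environment, you can check that the GPU workload profile type is supported in your chosen region. This is a minimal sketch; it assumes the list-supported command is available in your version of the containerapp CLI extension.

# List the workload profile types supported in the region.
# Look for Consumption-GPU-NC8as-T4 in the output.
az containerapp env workload-profile list-supported \
  --location $LOCATION \
  --output table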

Create your container app

  1. Create the resource group to contain the resources you create in this tutorial. This command should output Succeeded.

    az group create \
      --name $RESOURCE_GROUP \
      --location $LOCATION \
      --query "properties.provisioningState"
    
  2. Create a Container Apps environment to host your container app. This command should output Succeeded.

    az containerapp env create \
      --name $ENVIRONMENT_NAME \
      --resource-group $RESOURCE_GROUP \
      --location "$LOCATION" \
      --query "properties.provisioningState"
    
  3. Add a workload profile to your environment.

    az containerapp env workload-profile add \
      --name $ENVIRONMENT_NAME \
      --resource-group $RESOURCE_GROUP \
      --workload-profile-name $WORKLOAD_PROFILE_NAME \
      --workload-profile-type $WORKLOAD_PROFILE_TYPE
    
  4. Create your container app.

    az containerapp create \
      --name $CONTAINER_APP_NAME \
      --resource-group $RESOURCE_GROUP \
      --environment $ENVIRONMENT_NAME \
      --image $CONTAINER_IMAGE \
      --target-port 80 \
      --ingress external \
      --cpu 8.0 \
      --memory 56.0Gi \
      --workload-profile-name $WORKLOAD_PROFILE_NAME \
      --query properties.configuration.ingress.fqdn
    

    This command outputs the application URL for your container app.
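
Optionally, before you open the app, you can confirm that the GPU workload profile is attached to the environment and that the app finished provisioning. The following verification sketch reuses the variables defined earlier.

# Confirm the NC8as-T4 profile is attached to the environment.
az containerapp env workload-profile list \
  --name $ENVIRONMENT_NAME \
  --resource-group $RESOURCE_GROUP \
  --output table

# Confirm the container app provisioned successfully (expects "Succeeded").
az containerapp show \
  --name $CONTAINER_APP_NAME \
  --resource-group $RESOURCE_GROUP \
  --query "properties.provisioningState"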

Use your GPU app

Open the application URL for your container app in your browser. Note that it can take up to five minutes for the container app to start up.

The Azure Container Apps with Serverless GPUs application lets you enter a prompt to generate an image. You can also simply select Generate Image to use the default prompt. In the next step, you view the results of the GPU processing.
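
If you prefer to confirm from the command line that the front end is reachable before opening it in a browser, the following sketch fetches the app's FQDN into a hypothetical APP_FQDN variable and sends a simple request. The first request can be slow while the GPU replica cold starts.

# Fetch the app's FQDN (the same value the create command printed).
APP_FQDN=$(az containerapp show \
  --name $CONTAINER_APP_NAME \
  --resource-group $RESOURCE_GROUP \
  --query "properties.configuration.ingress.fqdn" \
  --output tsv)

# A basic reachability check; a slow or delayed response is expected during cold start.
curl -I "https://$APP_FQDN"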

Note

  • To achieve the best performance of your GPU apps, follow the steps to improve cold start for your serverless GPUs.
  • When there are multiple containers in your application, the first container gets access to the GPU.

Monitor your GPU

Once you generate an image, use the following steps to view the results of the GPU processing:

  1. Open your container app in the Azure portal.

  2. From the Monitoring section, select Console.

  3. Select your replica.

  4. Select your container.

  5. Select Reconnect.

  6. In the Choose start up command window, select /bin/bash, and select Connect.

  7. Once the shell is set up, enter the command nvidia-smi to review the status and output of your GPU.
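
If you want to watch GPU activity while an image generates, rather than taking a single snapshot, nvidia-smi can poll on an interval. The following is a sketch that uses standard nvidia-smi query options; adjust the interval to suit.

# Poll GPU utilization and memory every 2 seconds while the app generates an image.
nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used,memory.total \
  --format=csv -l 2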

Clean up resources

The resources created in this tutorial incur charges on your Azure bill.

If you aren't going to use these resources long-term, follow these steps to remove everything created in this tutorial.

  1. In the Azure portal, search for and select Resource Groups.

  2. Select my-gpu-demo-group.

  3. Select Delete resource group.

  4. In the confirmation box, enter my-gpu-demo-group.

  5. Select Delete.

To delete the resource group and all its resources with the Azure CLI, run the following command instead.

az group delete --name $RESOURCE_GROUP
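
If you prefer to skip the confirmation prompt and return immediately while deletion continues in the background, the same command accepts the --yes and --no-wait flags.

# Delete the resource group without prompting and without waiting for completion.
az group delete --name $RESOURCE_GROUP --yes --no-wait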

Next steps