Tutorial: Generate images using serverless GPUs in Azure Container Apps
In this article, you learn how to create a container app that uses serverless GPUs to power an AI application.
With serverless GPUs, you have direct access to GPU compute resources without having to do manual infrastructure configuration such as installing drivers. All you have to do is deploy your AI model's image.
In this tutorial you:
- Create a new container app and environment
- Configure the environment to use serverless GPUs
- Deploy your app to Azure Container Apps
- Use the new serverless GPU-enabled application
- Enable artifact streaming to reduce GPU cold start
Prerequisites
Resource | Description |
---|---|
Azure account | You need an Azure account with an active subscription. If you don't have one, you can create one for free. |
Access to serverless GPUs | Access to GPUs is only available after you request GPU quotas. You can submit your GPU quota request via a customer support case. |
Azure CLI | Install the Azure CLI or upgrade to the latest version. |
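The CLI steps later in this tutorial use the containerapp command group, which ships as an Azure CLI extension. As a setup sketch you might run before starting (the extension upgrade and provider registration are assumptions about your environment; skip anything you already have in place):

```bash
# Check the installed Azure CLI version
az version

# Install or upgrade the Container Apps CLI extension
az extension add --name containerapp --upgrade

# Register the resource provider used by Azure Container Apps
az provider register --namespace Microsoft.App
```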
Create your container app
Go to the Azure portal and search for and select Container Apps.
Select Create and then select Container App.
In the Basics window, enter the following values into each section.
Under Project details, enter the following values:

Setting | Value |
---|---|
Subscription | Select your Azure subscription. |
Resource group | Select Create new and enter my-gpu-demo-group. |
Container app name | Enter my-gpu-demo-app. |
Deployment source | Select Container image. |

Under Container Apps environment, enter the following values:
Setting | Value |
---|---|
Region | Select Sweden Central. For more supported regions, refer to Using serverless GPUs in Azure. |
Container Apps environment | Select Create new. |

In the Create Container Apps environment window, enter the following values:

Setting | Value |
---|---|
Environment name | Enter my-gpu-demo-env. |

Select Create.
Select Next: Container >.
In the Container window, enter the following values:
Setting | Value |
---|---|
Name | Enter my-gpu-demo-container. |
Image source | Select Docker Hub or other registries. |
Image type | Select Public. |
Registry login server | Enter mcr.microsoft.com. |
Image and tag | Enter k8se/gpu-quickstart:latest. |
Workload profile | Select Consumption - Up to 4 vCPUs, 8 GiB memory. |
GPU | Select the checkbox. |
GPU Type | Select Consumption-GPU-NC8as-T4 - Up to 8 vCPUs, 56 GiB memory and select the link to add the profile to your environment. |

Select Next: Ingress >.
In the Ingress window, enter the following values:
Setting | Value |
---|---|
Ingress | Select the Enabled checkbox. |
Ingress traffic | Select the Accepting traffic from anywhere radio button. |
Target port | Enter 80. |

Select Review + create.
Select Create.
Wait a few moments for the deployment to complete and then select Go to resource.
This process can take up to five minutes to complete.
Use your GPU app
From the Overview window, select the Application Url link to open the web app front end in your browser and use the GPU application.
Note
- To achieve the best performance of your GPU apps, follow the steps to improve cold start for your serverless GPUs.
- When there are multiple containers in your application, the first container gets access to the GPU.
Create environment variables
Define the following environment variables. Before running these commands, replace the <PLACEHOLDERS> with your values.
```bash
RESOURCE_GROUP="<RESOURCE_GROUP>"
ENVIRONMENT_NAME="<ENVIRONMENT_NAME>"
LOCATION="swedencentral"
CONTAINER_APP_NAME="<CONTAINER_APP_NAME>"
CONTAINER_IMAGE="mcr.microsoft.com/k8se/gpu-quickstart:latest"
WORKLOAD_PROFILE_NAME="NC8as-T4"
WORKLOAD_PROFILE_TYPE="Consumption-GPU-NC8as-T4"
```
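Serverless GPU workload profiles are only available in certain regions. If you want to confirm that your chosen location supports the Consumption-GPU-NC8as-T4 profile before creating anything, one way is to list the supported profile types (a sketch; the list-supported subcommand assumes a recent containerapp CLI extension):

```bash
# List the workload profile types supported in the chosen region,
# including any Consumption GPU profiles available there
az containerapp env workload-profile list-supported \
  --location $LOCATION \
  --output table
```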
Create your container app
Create the resource group to contain the resources you create in this tutorial. This command should output `Succeeded`.

```bash
az group create \
  --name $RESOURCE_GROUP \
  --location $LOCATION \
  --query "properties.provisioningState"
```
Create a Container Apps environment to host your container app. This command should output `Succeeded`.

```bash
az containerapp env create \
  --name $ENVIRONMENT_NAME \
  --resource-group $RESOURCE_GROUP \
  --location "$LOCATION" \
  --query "properties.provisioningState"
```
Add a workload profile to your environment.
```bash
az containerapp env workload-profile add \
  --name $ENVIRONMENT_NAME \
  --resource-group $RESOURCE_GROUP \
  --workload-profile-name $WORKLOAD_PROFILE_NAME \
  --workload-profile-type $WORKLOAD_PROFILE_TYPE
```
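If you want to confirm the GPU profile was added before creating the app, you can list the profiles attached to the environment (a quick verification sketch, not part of the original steps):

```bash
# List the workload profiles currently attached to the environment
az containerapp env workload-profile list \
  --name $ENVIRONMENT_NAME \
  --resource-group $RESOURCE_GROUP \
  --output table
```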
Create your container app.
```bash
az containerapp create \
  --name $CONTAINER_APP_NAME \
  --resource-group $RESOURCE_GROUP \
  --environment $ENVIRONMENT_NAME \
  --image $CONTAINER_IMAGE \
  --target-port 80 \
  --ingress external \
  --cpu 8.0 \
  --memory 56.0Gi \
  --workload-profile-name $WORKLOAD_PROFILE_NAME \
  --query properties.configuration.ingress.fqdn
```
This command outputs the application URL for your container app.
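If you missed the output or want to retrieve the URL again, you can query the app's fully qualified domain name and check that it responds. This is a convenience sketch; it assumes the app finished provisioning and that curl is available:

```bash
# Look up the app's FQDN and store it in a shell variable
FQDN=$(az containerapp show \
  --name $CONTAINER_APP_NAME \
  --resource-group $RESOURCE_GROUP \
  --query properties.configuration.ingress.fqdn \
  --output tsv)

# Request the front end; the first request can be slow while the GPU replica cold starts
curl -I "https://$FQDN"
```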
Use your GPU app
Open the application URL for your container app in your browser. Note that it can take up to five minutes for the container app to start up.
The Azure Container Apps with Serverless GPUs application lets you enter a prompt to generate an image. You can also select Generate Image to use the default prompt. In the next step, you view the results of the GPU processing.
Note
- To achieve the best performance of your GPU apps, follow the steps to improve cold start for your serverless GPUs.
- When there are multiple containers in your application, the first container gets access to the GPU.
Monitor your GPU
Once you generate an image, use the following steps to view the results of the GPU processing:
Open your container app in the Azure portal.
From the Monitoring section, select Console.
Select your replica.
Select your container.
Select Reconnect.
In the Choose start up command window, select /bin/bash, and select Connect.
Once the shell is set up, enter the command nvidia-smi to review the status and output of your GPU.
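To watch utilization change while an image is being generated, nvidia-smi can also refresh its report on an interval instead of printing a single snapshot (a small sketch of the two options):

```bash
# Print a one-time snapshot of GPU status
nvidia-smi

# Or refresh the report every 2 seconds while images are generated (Ctrl+C to stop)
nvidia-smi -l 2
```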
Clean up resources
The resources created in this tutorial have an effect on your Azure bill.
If you aren't going to use these services long-term, use the steps to remove everything created in this tutorial.
In the Azure portal, search for and select Resource Groups.
Select my-gpu-demo-group.
Select Delete resource group.
In the confirmation box, enter my-gpu-demo-group.
Select Delete.
Run the following command.
```bash
az group delete --name $RESOURCE_GROUP
```
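By default, az group delete prompts for confirmation and waits for the deletion to finish. If you prefer a non-interactive variant, the command also accepts flags to skip the prompt and return immediately (a sketch, assuming $RESOURCE_GROUP is still set in your shell):

```bash
# Skip the confirmation prompt and return without waiting; deletion continues in the background
az group delete --name $RESOURCE_GROUP --yes --no-wait

# Later, confirm the group is gone (prints "false" once deletion completes)
az group exists --name $RESOURCE_GROUP
```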