To get started with online container copy for Azure Cosmos DB for NoSQL API accounts, register for the Online container copy (NoSQL) preview feature flag in Preview Features in the Azure portal. Once the registration is complete, the preview is effective for all NoSQL API accounts in the subscription.
All write operations on the source container are charged double the RUs in order to preserve both the previous and current versions of changed items in the container. This increased RU charge is subject to change in the future.
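If you prefer to script the registration, here is a hedged sketch using the azure-mgmt-resource package. The subscription ID is a placeholder, and the internal feature name is an assumption; confirm the exact flag name under Preview Features in the Azure portal.

```python
# Hedged sketch: registering the preview feature flag from Python rather than
# the portal. Requires azure-identity and azure-mgmt-resource.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import FeatureClient

client = FeatureClient(DefaultAzureCredential(), "<subscription-id>")

# Registration applies to all NoSQL API accounts in the subscription.
client.features.register(
    resource_provider_namespace="Microsoft.DocumentDB",
    feature_name="OnlineContainerCopy",  # hypothetical name; verify in the portal
)
```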
Copy a container's data
Create the target Azure Cosmos DB container by using the settings that you want (partition key, throughput granularity, request units, unique key, and so on). A sketch of this step appears after the steps.
Create the online copy job. Once all documents have been copied, stop updates on the source container, and then call the completion API to mark the job as completed.
Resume the operations by appropriately pointing the application or client to the source or target container as intended.
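The container-creation step can be scripted with the azure-cosmos Python SDK. This is a minimal sketch; the endpoint, key, and all names are placeholders, and the settings shown are examples rather than recommendations.

```python
# Minimal sketch: create the target container with the settings you want the
# copied data to land in. Endpoint, key, and names are placeholders.
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<primary-key>")
database = client.get_database_client("<database>")

# Partition key, throughput, and unique keys must be chosen up front;
# the copy job doesn't change them afterwards.
target = database.create_container(
    id="<target-container>",
    partition_key=PartitionKey(path="/<partitionKeyField>"),
    offer_throughput=8000,  # see the tip later in this article: >= 2x the source
    unique_key_policy={"uniqueKeys": [{"paths": ["/<uniqueField>"]}]},
)
```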
How does container copy work?
The platform allocates server-side compute instances for the destination Azure Cosmos DB account to run the container copy jobs.
A single job is executed across all instances at any time.
The online copy jobs use the all versions and deletes change feed mode to copy the data and to replicate incremental changes from the source container to the destination container (see the conceptual sketch below).
Once the job is completed, the platform de-allocates these instances after 15 minutes of inactivity.
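The replication loop itself is internal to the platform, but conceptually it behaves like the sketch below. The two helper functions are hypothetical stand-ins for the service's internal change feed reader and writer; they are not a public API.

```python
# Conceptual sketch only: what the online copy job's replication loop does with
# the all versions and deletes change feed. Both helpers are hypothetical.
def read_change_feed_batch(continuation):
    """Hypothetical: return (changes, next_continuation) from the source
    container's all versions and deletes change feed."""
    return [], continuation

def apply_to_destination(change):
    """Hypothetical: upsert the changed item into the destination, or delete it
    if the change records a deletion -- this is how deletes replicate online."""

continuation = None
while True:
    changes, continuation = read_change_feed_batch(continuation)
    for change in changes:
        apply_to_destination(change)
    if not changes:
        break  # caught up; the real job keeps tailing until the completion API is called
```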
You can also perform offline container copy jobs to copy data within the same Azure Cosmos DB for NoSQL account.
Create the target Azure Cosmos DB container by using the settings that you want (partition key, throughput granularity, request units, unique key, and so on).
Stop the operations on the source container by pausing the application instances or any clients that connect to it, and then run the container copy job (a sketch of creating and polling a job follows the note below).
Resume the operations by appropriately pointing the application or client to the source or target container as intended.
Note
We strongly recommend that you stop performing any operations on the source container before you begin the offline container copy job. Item deletions and updates that are made on the source container after you start the copy job might not be captured. If you continue to perform operations on the source container while the copy job is in progress, you might have duplicate or missing data in the target container.
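The job itself can be created and monitored from code. The sketch below assumes a preview build of the azure-mgmt-cosmosdb package that exposes data transfer jobs; the model names, operation names, and status strings are assumptions drawn from preview API versions and may differ in your SDK build.

```python
# Hedged sketch: create and poll an offline container copy job through the
# management plane. Assumes a preview azure-mgmt-cosmosdb build; class names,
# operation names, and status strings are assumptions and may differ.
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.cosmosdb import CosmosDBManagementClient
from azure.mgmt.cosmosdb.models import (  # assumed preview model names
    CosmosSqlDataTransferDataSourceSink,
    CreateJobRequest,
    DataTransferJobProperties,
)

client = CosmosDBManagementClient(DefaultAzureCredential(), "<subscription-id>")

client.data_transfer_jobs.create(
    resource_group_name="<resource-group>",
    account_name="<account>",
    job_name="copy-job-1",
    job_create_parameters=CreateJobRequest(
        properties=DataTransferJobProperties(
            source=CosmosSqlDataTransferDataSourceSink(
                database_name="<database>", container_name="<source-container>"
            ),
            destination=CosmosSqlDataTransferDataSourceSink(
                database_name="<database>", container_name="<target-container>"
            ),
        )
    ),
)

# Poll until the job reaches a terminal state before resuming client traffic.
while True:
    job = client.data_transfer_jobs.get("<resource-group>", "<account>", "copy-job-1")
    if job.status in ("Completed", "Failed", "Cancelled"):  # assumed status values
        break
    time.sleep(30)
```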
How does container copy work?
The platform allocates server-side compute instances for the destination Azure Cosmos DB account.
These instances are allocated when one or more container copy jobs are created within the account.
The container copy jobs run on these instances.
A single job is executed across all instances at any time.
The instances are shared by all the container copy jobs that are running within the same account.
The offline copy jobs use the latest version change feed mode to copy the data and to replicate incremental changes from the source container to the destination container (a minimal client-side read sketch follows).
The platform might de-allocate the instances if they're idle for longer than 15 minutes.
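The latest version change feed the job consumes is the same feed that's exposed to clients. A minimal azure-cosmos sketch of reading it, with placeholder endpoint, key, and names:

```python
# Minimal sketch: read a container's latest version change feed from the
# beginning with the azure-cosmos SDK. Placeholder endpoint, key, and names.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<primary-key>")
container = client.get_database_client("<database>").get_container_client("<source-container>")

# Latest version mode surfaces only the most recent version of each item and
# no deletes -- one reason writes must stop during an offline copy.
for item in container.query_items_change_feed(is_start_from_beginning=True):
    print(item["id"])
```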
You can perform offline collection copy jobs to copy data within the same Azure Cosmos DB for MongoDB account.
Copy a collection's data
Create the target Azure Cosmos DB collection by using the settings that you want (partition key, throughput granularity, request units, unique key, and so on). A sketch of this step appears after the note below.
Stop the operations on the source collection by pausing the application instances or any clients that connect to it, and then run the collection copy job.
Resume the operations by appropriately pointing the application or client to the source or target collection as intended.
Note
We strongly recommend that you stop performing any operations on the source collection before you begin the offline collection copy job. Item deletions and updates that are done on the source collection after you start the copy job might not be captured. If you continue to perform operations on the source collection while the copy job is in progress, you might have duplicate or missing data on the target collection.
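Creating the target collection with its settings fixed up front can be scripted with pymongo and the Cosmos DB for MongoDB extension command. A minimal sketch with placeholder connection string and names:

```python
# Minimal sketch: create the target collection with a shard key and dedicated
# throughput via Azure Cosmos DB's CreateCollection extension command.
from pymongo import MongoClient

client = MongoClient("<cosmos-mongodb-connection-string>")
db = client["<database>"]

# Extension command: sets the shard key and RU/s at creation time.
db.command({
    "customAction": "CreateCollection",
    "collection": "<target-collection>",
    "shardKey": "<shardKeyField>",
    "offerThroughput": 8000,  # see the tip later in this article: >= 2x the source
})
```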
How does collection copy work?
The platform allocates server-side compute instances for the destination Azure Cosmos DB account.
These instances are allocated when one or more collection copy jobs are created within the account.
The copy jobs run on these instances.
A single job is executed across all instances at any time.
The instances are shared by all the copy jobs that are running within the same account.
The offline copy jobs use change streams to copy the data and replicate incremental changes from the source collection to the destination collection (see the sketch below).
The platform might de-allocate the instances if they're idle for longer than 15 minutes.
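Change streams are also available to clients, which makes the mechanism easy to see. A minimal pymongo sketch with placeholder names; note that Cosmos DB's change stream implementation supports only a restricted pipeline and doesn't surface deletes:

```python
# Minimal sketch: watch a collection's change stream with pymongo -- the same
# mechanism the copy job uses for incremental changes. Placeholder names.
from pymongo import MongoClient

client = MongoClient("<cosmos-mongodb-connection-string>")
collection = client["<database>"]["<source-collection>"]

# Cosmos DB supports a restricted pipeline; deletes aren't surfaced, which is
# one reason writes must stop during an offline copy.
pipeline = [
    {"$match": {"operationType": {"$in": ["insert", "update", "replace"]}}},
    {"$project": {"_id": 1, "fullDocument": 1, "ns": 1, "documentKey": 1}},
]
with collection.watch(pipeline, full_document="updateLookup") as stream:
    for change in stream:
        print(change["operationType"], change["documentKey"])
```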
You can perform offline table copy jobs to copy the data of one table to another table within the same Azure Cosmos DB for Apache Cassandra account.
Copy a table's data
Create the target Azure Cosmos DB table by using the settings that you want (partition key, throughput granularity, request units, and so on). A sketch of this step appears after the note below.
Stop the operations on the source table by pausing the application instances or any clients that connect to it, and then run the table copy job.
Resume the operations by appropriately pointing the application or client to the source or target table as intended.
Note
We strongly recommend that you stop performing any operations on the source table before you begin the offline table copy job. Item deletions and updates that are done on the source table after you start the copy job might not be captured. If you continue to perform operations on the source table while the copy job is in progress, you might have duplicate or missing data on the target table.
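The table-creation step can be scripted with the DataStax Python driver against the account's Cassandra endpoint. A minimal sketch; the contact point, credentials, keyspace, and table names are placeholders, and cosmosdb_provisioned_throughput is the Cosmos DB Cassandra API extension for setting RU/s.

```python
# Minimal sketch: create the target table with the desired schema and RU/s.
# Contact point, credentials, and names are placeholders.
import ssl

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

auth = PlainTextAuthProvider(username="<account>", password="<primary-key>")
cluster = Cluster(
    ["<account>.cassandra.cosmos.azure.com"],
    port=10350,
    auth_provider=auth,
    ssl_context=ssl.create_default_context(),  # the Cassandra API requires TLS
)
session = cluster.connect()

# cosmosdb_provisioned_throughput is the Cosmos DB extension for setting RU/s.
session.execute("""
    CREATE TABLE IF NOT EXISTS <keyspace>.<target_table> (
        pk text, ck text, payload text,
        PRIMARY KEY (pk, ck)
    ) WITH cosmosdb_provisioned_throughput = 8000
""")
```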
How does table copy work?
The platform allocates server-side compute instances for the destination Azure Cosmos DB account.
These instances are allocated when one or more copy jobs are created within the account.
The copy jobs run on these instances.
A single job is executed across all instances at any time.
The instances are shared by all the copy jobs that are running within the same account.
The offline copy jobs use the change feed to copy the data and replicate incremental changes from the source table to the destination table (a hedged query sketch follows).
The platform might de-allocate the instances if they're idle for longer than 15 minutes.
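The Cassandra API change feed is also queryable from clients through a Cosmos DB-specific predicate, COSMOS_CHANGEFEED_START_TIME(), paged via the driver's paging state. A hedged sketch with placeholder connection details:

```python
# Hedged sketch: read the Cassandra API change feed with the Cosmos DB-specific
# COSMOS_CHANGEFEED_START_TIME() predicate. Placeholder connection details.
import ssl

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

auth = PlainTextAuthProvider(username="<account>", password="<primary-key>")
cluster = Cluster(["<account>.cassandra.cosmos.azure.com"], port=10350,
                  auth_provider=auth, ssl_context=ssl.create_default_context())
session = cluster.connect()

query = SimpleStatement(
    "SELECT * FROM <keyspace>.<source_table> "
    "WHERE COSMOS_CHANGEFEED_START_TIME() = '2024-01-01 00:00:00.000+0000'",
    fetch_size=100,
)

paging_state = None
while True:
    result = session.execute(query, paging_state=paging_state)
    for row in result.current_rows:
        print(row)
    paging_state = result.paging_state  # resume token for the next page
    if not result.current_rows:
        break  # a real tail would keep polling with the saved paging state
```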
Factors that affect the rate of a copy job
The rate of container copy job progress is determined by these factors:
The source container or database throughput setting.
The target container or database throughput setting.
Tip
Set the target container throughput to at least two times the source container's throughput. A sketch that applies this tip appears after this section.
Server-side compute instances that are allocated to the Azure Cosmos DB account for performing the data transfer.
Important
The default SKU offers two 4-vCPU 16-GB server-side instances per account.
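The tip above can be applied programmatically for containers with dedicated manual throughput. A minimal azure-cosmos sketch (placeholder endpoint, key, and names) that reads the source RU/s and scales the target to at least double:

```python
# Minimal sketch: scale the target container to at least 2x the source RU/s,
# per the tip above. Works for dedicated manual throughput; placeholders used.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<primary-key>")
db = client.get_database_client("<database>")

source_rus = db.get_container_client("<source-container>").get_throughput().offer_throughput
target = db.get_container_client("<target-container>")
target.replace_throughput(max(2 * source_rus, target.get_throughput().offer_throughput))
```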
Limitations
Preview eligibility criteria
Container copy jobs don't work with accounts that have certain capabilities enabled. Disable these capabilities before you run container copy jobs.
The Time to Live (TTL) setting isn't adjusted in the destination container. As a result, if a document hasn't expired in the source container, it starts its countdown anew in the destination container.
FAQs
Is there a service-level agreement for container copy jobs?
Container copy jobs are currently supported on a best-effort basis. We don't provide any service-level agreement (SLA) guarantees for the time it takes for the jobs to finish.
Can I create multiple container copy jobs within an account?
Yes, you can create multiple jobs within the same account. The jobs run consecutively. You can list all the jobs that are created within an account, and monitor their progress.
Can I copy an entire database within the Azure Cosmos DB account?
No. You must create a job for each container in the database.
I have an Azure Cosmos DB account with multiple regions. In which region will the container copy job run?
The container copy job runs in the write region. In an account that's configured with multi-region writes, the job runs in one of the regions in the list of write regions.
What happens to the container copy jobs when the account's write region changes?
The account's write region might change in the rare scenario of a region outage or due to manual failover. In this scenario, incomplete container copy jobs that were created within the account fail. You would need to re-create these failed jobs. Re-created jobs then run in the new (current) write region.
Supported regions
Currently, container copy is supported in the following regions:
| Americas | Europe and Africa | Asia Pacific |
| --- | --- | --- |
| Brazil South | France Central | Australia Central |
| Canada Central | France South | Australia Central 2 |
| Canada East | Germany North | Australia East |
| Central US | Germany West Central | Central India |
| Central US EUAP | North Europe | Japan East |
| East US | Norway East | Korea Central |
| East US 2 | Norway West | Southeast Asia |
| East US 2 EUAP | Switzerland North | UAE Central |
| North Central US | Switzerland West | West India |
| South Central US | UK South | East Asia |
| West Central US | UK West | Malaysia South |
| West US | West Europe | Japan West |
| West US 2 | Israel Central | Australia Southeast |
| Not supported | South Africa North | Not supported |
Known and common issues
Error - Owner resource doesn't exist.
If the job creation fails and displays the error Owner resource doesn't exist (error code 404), either the target container hasn't been created yet or the container name that's used to create the job doesn't match an actual container name.
Make sure that the target container is created before you run the job and ensure that the container name in the job matches an actual container name.
"code": "404",
"message": "Response status code does not indicate success: NotFound (404); Substatus: 1003; ActivityId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx; Reason: (Message: {\"Errors\":[\"Owner resource does not exist\"]
Error - Request is unauthorized.
If the request fails and displays the error Unauthorized (error code 401), local authorization might be disabled.
Container copy jobs use primary keys to authenticate. If local authorization is disabled, the job creation fails. Local authorization must be enabled for container copy jobs to work.
"code": "401",
"message": " Response status code does not indicate success: Unauthorized (401); Substatus: 5202; ActivityId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx; Reason: Local Authorization is disabled. Use an AAD token to authorize all requests."
Error - Error while getting resources for job.
This error might occur due to internal server issues. To resolve this issue, contact Microsoft Support by opening a New Support Request in the Azure portal. For Problem Type, select Data Migration. For Problem subtype, select Intra-account container copy.
"code": "500"
"message": "Error while getting resources for job, StatusCode: 500, SubStatusCode: 0, OperationId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, ActivityId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx