Error while unzipping .tar.gz files using Copy Activity - failing for a CSV file with 8 GB of data

Nagarajan Arumugam 0 Reputation points
2025-03-11T06:51:23.57+00:00

Hi,

I am trying to uncompress .tar.gz files using the Copy Activity, and it was working fine until yesterday. Today we received a file with more data, and the copy failed on one file containing 8 GB of data with the following error: "ErrorCode=UserErrorWriteFailedFileOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The file operation is failed, upload file failed at path: 'xxx-xx/yy/zzz/Test/251963-20250310-0000-F '.,Source=Microsoft.DataTransfer.Common,''Type=ICSharpCode.SharpZipLib.Tar.TarException,Message=Header checksum is invalid,Source=ICSharpCode.SharpZipLib,'".

Note: the .tar.gz file contains multiple CSV files. We are unzipping them and copying the contents to a further layer in our data lake.


1 answer

  1. Venkat Reddy Navari 80 Reputation points Microsoft External Staff
    2025-03-11T17:50:22.41+00:00

    Hi Nagarajan Arumugam,

    It looks like you're encountering an issue while trying to uncompress an 8 GB .tar.gz file using the Copy Activity in Azure Data Factory. The error indicates that a tar header checksum failed validation, which can be caused by several factors.

    1. Verify the file's integrity to make sure the .tar.gz archive isn't corrupted; compare its checksum or size against the source file if possible, and re-download or re-transfer it from the original source if needed (a quick local check is sketched after this list).
    2. Ensure there is enough disk space at the destination where you're decompressing and writing the files; running out of space can cause the operation to fail.
    3. Test with smaller files to determine whether the issue is specific to the 8 GB file. If smaller files copy without issues, the file size may be a contributing factor.
    4. If the issue persists, try extracting the files manually with tools like tar or gzip before uploading them to your data lake (see the second sketch below). This helps isolate whether the problem lies with Azure Data Factory or with the file itself.
    5. Verify that the .tar.gz archive is correctly structured and contains valid CSV files, since a malformed archive produces exactly this header-checksum error. Also review the activity logs for any additional messages that might shed more light on the failure.
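
    As a quick way to run checks 1, 2, and 5 outside of Data Factory, here is a minimal Python sketch, assuming you can download a local copy of the archive (the `sample.tar.gz` path below is illustrative). It streams through the archive with the standard-library tarfile module, which validates the same 512-byte tar header checksums that SharpZipLib is reporting as invalid:

    ```python
    import shutil
    import tarfile

    # Illustrative path; point this at a local copy of the failing archive.
    ARCHIVE = "sample.tar.gz"

    # Check 2: confirm the working volume has headroom for ~8 GB of CSV data.
    free_gib = shutil.disk_usage(".").free / 1024**3
    print(f"Free space: {free_gib:.1f} GiB")

    # Checks 1 and 5: walk every member. tarfile validates each 512-byte
    # header checksum as it reads, so a corrupted archive fails here with
    # the same kind of header-checksum error SharpZipLib reports.
    try:
        with tarfile.open(ARCHIVE, mode="r:gz") as tar:
            for member in tar:
                print(f"{member.name}\t{member.size} bytes")
                if member.isfile():
                    stream = tar.extractfile(member)
                    # Read in 1 MiB chunks so an 8 GB file never sits in memory.
                    while stream.read(1 << 20):
                        pass
        print("All headers and member data read cleanly.")
    except (tarfile.TarError, EOFError, OSError) as exc:
        print(f"Archive appears corrupted or truncated: {exc}")
    ```

    If the archive reads cleanly, a local extraction for check 4 (equivalent to running tar with gzip decompression) could look like this:

    ```python
    import tarfile

    ARCHIVE = "sample.tar.gz"  # illustrative path, as above

    with tarfile.open(ARCHIVE, mode="r:gz") as tar:
        # filter="data" (Python 3.12+, backported to recent 3.8+ releases)
        # rejects absolute paths and path traversal during extraction.
        tar.extractall(path="extracted", filter="data")
    ```

    If the local run fails with a header-checksum error as well, the file is damaged at the source; if it succeeds, the problem is more likely on the Data Factory side (for example, a transfer that truncated the file before the Copy Activity read it).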

    I hope this helps. Please let us know if you have any further questions.

    Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.

