Download large zip file from Azure blob and unzip

Question

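I currently have the following code, which uses a SAS URI to download a zip file from a blob, unzips it, and uploads the contents to a new container: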

        var response = await new BlobClient(new Uri(sasUri)).DownloadAsync();
        using (ZipArchive archive = new ZipArchive(response.Value.Content))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                BlobClient blobClient = _blobServiceClient.GetBlobContainerClient(containerName).GetBlobClient(entry.FullName);
                using (var fileStream = entry.Open())
                {
                    await blobClient.UploadAsync(fileStream, true);
                }
            }
        }

My code fails with a "Stream was too long" exception:

        System.IO.IOException: Stream was too long.
           at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
           at System.IO.Stream.CopyTo(Stream destination, Int32 bufferSize)
           at System.IO.Compression.ZipArchive.Init(Stream stream, ZipArchiveMode mode, Boolean leaveOpen)

My zip file is 9 GB. What would be a better way to get around this exception? I'd like to avoid writing any files to disk.

Answer 1

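So the problem here is:

.NET limits the size of a single array (the exact limit depends on the platform and configuration).
Arrays back streams such as MemoryStream, serving as their buffer or in-memory data store.
By default, even on a 64-bit platform, an array cannot exceed 2 GB.
You are trying to put a 9 GB stream (backed by an array) on the large object heap.

So you would need to allow larger objects (somehow).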

Allow large objects

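In .NET Framework 4.5+ you can set the <gcAllowVeryLargeObjects> element in the runtime section of the application configuration file.
In .NET Core you need to set the environment variable COMPlus_gcAllowVeryLargeObjects.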
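For the .NET Framework case, the element sits under <runtime> in the application configuration file (app.config or web.config). A minimal fragment, shown only to illustrate where the setting lives:

```xml
<configuration>
  <runtime>
    <!-- Lets arrays exceed 2 GB in total size on 64-bit platforms -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```

For the environment-variable route, the value is 1 to enable and 0 to disable, and it has to be set before the process starts.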

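However, putting 9 GB of anything on the large object heap (LOH) is problematic: among other issues it is inefficient for the GC, and you should avoid the LOH as much as you can. Note also that this setting raises the limit on an array's total size, not on its element count. A byte array is still capped at just under 2^31 elements, and MemoryStream is indexed with an int, so the setting alone cannot make a 9 GB MemoryStream work.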

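Note that, depending on the library and what you have access to, there might be less LOH-heavy ways to do this. If you can supply your own streams or data structures, there are libraries that break buffers up so that they are not allocated aggressively on the LOH, by way of things like ReadOnlySequence and Microsoft's little-known RecyclableMemoryStream.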
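To make the "break buffers up" idea concrete, here is a minimal sketch that stitches small arrays into one logical ReadOnlySequence<byte>. ChunkSegment and ChunkedBufferDemo are hypothetical names invented for this illustration, and the 64 KB chunk size is an assumption chosen to stay below the 85,000-byte threshold at which allocations go to the LOH. It is not how any particular library does it.

```csharp
using System;
using System.Buffers;

// Hypothetical helper: one link in a chain of small buffers.
sealed class ChunkSegment : ReadOnlySequenceSegment<byte>
{
    public ChunkSegment(ReadOnlyMemory<byte> memory) => Memory = memory;

    // Links a new chunk after this one and returns it.
    public ChunkSegment Append(ReadOnlyMemory<byte> memory)
    {
        var next = new ChunkSegment(memory)
        {
            // RunningIndex is the offset of the segment within the whole sequence.
            RunningIndex = RunningIndex + Memory.Length
        };
        Next = next;
        return next;
    }
}

static class ChunkedBufferDemo
{
    // Assumed chunk size: 64 KB arrays are small enough to stay off the LOH.
    public const int ChunkSize = 64 * 1024;

    // Allocates chunkCount small arrays and exposes them as one sequence.
    public static ReadOnlySequence<byte> Allocate(int chunkCount)
    {
        if (chunkCount < 1) throw new ArgumentOutOfRangeException(nameof(chunkCount));

        var first = new ChunkSegment(new byte[ChunkSize]);
        var last = first;
        for (int i = 1; i < chunkCount; i++)
        {
            last = last.Append(new byte[ChunkSize]);
        }
        return new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length);
    }
}
```

Calling ChunkedBufferDemo.Allocate(4) yields a sequence whose Length is 262144 bytes, held in four separate 64 KB arrays rather than one large one.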

Answer 2

The following solution worked for me. Instead of DownloadAsync, use OpenReadAsync:

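    // buildVersion and cancellationToken come from the surrounding code (not shown).
    // OpenReadAsync returns a Stream over the blob instead of a download response.
    var response = await new BlobClient(new Uri(sasUri)).OpenReadAsync(new BlobOpenReadOptions(false), cancellationToken);
    using (ZipArchive archive = new ZipArchive(response))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            // Each entry is uploaded under a "<buildVersion>/" prefix in the target container.
            BlobClient blobClient = _blobServiceClient.GetBlobContainerClient(containerName).GetBlobClient($"{buildVersion}/{entry.FullName}");
            using (var fileStream = entry.Open())
            {
                // Stream the decompressed entry straight into the destination blob, overwriting if it exists.
                await blobClient.UploadAsync(fileStream, true, cancellationToken).ConfigureAwait(false);
            }
        }
    }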
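A likely reason this helps is visible in the question's stack trace: ZipArchive.Init copies a stream it cannot seek into a MemoryStream, whereas a seekable stream can be read in place. The sketch below tries to show that difference without Azure, using only the base class library. CountingStream and SeekabilityDemo are hypothetical names made up for this illustration, and the 1 MB payload is an arbitrary assumption.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: counts bytes read from an inner stream and can hide its seekability.
sealed class CountingStream : Stream
{
    private readonly Stream _inner;
    private readonly bool _allowSeek;
    public long BytesRead { get; private set; }

    public CountingStream(Stream inner, bool allowSeek)
    {
        _inner = inner;
        _allowSeek = allowSeek;
    }

    public override bool CanRead => true;
    public override bool CanSeek => _allowSeek;
    public override bool CanWrite => false;
    public override long Length => _allowSeek ? _inner.Length : throw new NotSupportedException();
    public override long Position
    {
        get => _inner.Position;
        set { if (!_allowSeek) throw new NotSupportedException(); _inner.Position = value; }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = _inner.Read(buffer, offset, count);
        BytesRead += n;
        return n;
    }
    public override long Seek(long offset, SeekOrigin origin) =>
        _allowSeek ? _inner.Seek(offset, origin) : throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override void Flush() { }
}

static class SeekabilityDemo
{
    // Builds an in-memory zip with a single 1 MB entry of random (incompressible) data.
    public static byte[] BuildZip()
    {
        var payload = new byte[1024 * 1024];
        new Random(42).NextBytes(payload);

        using var ms = new MemoryStream();
        using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.CreateEntry("data.bin", CompressionLevel.NoCompression);
            using Stream s = entry.Open();
            s.Write(payload, 0, payload.Length);
        }
        return ms.ToArray();
    }

    // Returns how many bytes ZipArchive pulled from the source just to list its entries.
    public static long BytesReadToListEntries(bool seekable)
    {
        var source = new CountingStream(new MemoryStream(BuildZip()), seekable);
        using var archive = new ZipArchive(source, ZipArchiveMode.Read);
        _ = archive.Entries.Count; // forces the central directory to be read
        return source.BytesRead;
    }
}
```

With seekable set to false, BytesReadToListEntries should report the whole archive (over 1 MB) being read up front. With seekable set to true it should report only a small portion near the end of the file, where the zip directory lives. Scaled up to a 9 GB blob, the first pattern is what overflows the MemoryStream.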

Tags: .net, c#, ziparchive, .net-core, azure-blob-storage