How do I pass a stream from Web API to Azure Blob storage without temp files?

Question

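I am working on an application where file uploads happen often, and the files can be pretty large.

Those files are uploaded to Web API, which takes the stream from the request and passes it on to my storage service, which in turn uploads it to Azure Blob storage.

I need to make sure that:

No temp files are written on the Web API instance.
The request stream is not fully read into memory before being passed on to the storage service (to prevent OutOfMemoryExceptions).

I've looked at this article, which describes how to disable input stream buffering. But because many file uploads from many different users happen simultaneously, it's important that it actually does what it says on the tin.

This is what I have in my controller at the moment: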

if (this.Request.Content.IsMimeMultipartContent())
{
    var provider = new MultipartMemoryStreamProvider();
    await this.Request.Content.ReadAsMultipartAsync(provider);
    var fileContent = provider.Contents.SingleOrDefault();

    if (fileContent == null)
    {
        throw new ArgumentException("No filename.");
    }

    var fileName = fileContent.Headers.ContentDisposition.FileName.Replace("\"", string.Empty);

    // I need to make sure this stream is ready to be processed by 
    // the Azure client lib, but not buffered fully, to prevent OoM.
    var stream = await fileContent.ReadAsStreamAsync();
}

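I don't know how to test this reliably.

EDIT: I forgot to mention that uploading directly to Blob storage (bypassing my API) won't work, because I am doing some size checking (e.g. can this user upload 500 MB? Has this user used up their quota?).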

Answer 1

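I think a better approach is to go directly to Azure Blob storage from your client. By leveraging the CORS support in Azure Storage, you take the load off your Web API server, which improves the overall scalability of your application.

Basically, you create a Shared Access Signature (SAS) URL that your client can use to upload the file directly to Azure Storage. For security reasons, it is recommended that you limit the time period for which the SAS is valid. Best-practice guidance for generating SAS URLs is available here.

For your specific scenario, check out this blog post from the Azure Storage team, where they discuss using CORS and SAS for this exact scenario. There is also a sample application, so this should give you everything you need.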
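To make the SAS step concrete, here is a minimal sketch of issuing a short-lived, write-only SAS URL for a single blob. It assumes the legacy `WindowsAzure.Storage` NuGet package; the `UploadSasFactory` class, the `uploads` container name, and the 15-minute lifetime are assumptions for illustration, not something taken from the linked guidance.

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical helper: hands the client a URL it can PUT the file to directly.
public static class UploadSasFactory
{
    public static string CreateUploadUrl(string connectionString, string blobName)
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);

        // "uploads" is an assumed container name.
        CloudBlobContainer container =
            account.CreateCloudBlobClient().GetContainerReference("uploads");
        CloudBlockBlob blob = container.GetBlockBlobReference(blobName);

        var policy = new SharedAccessBlobPolicy
        {
            // Write-only, and valid for a short window (assumed: 15 minutes).
            Permissions = SharedAccessBlobPermissions.Write,
            SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(15)
        };

        // The SAS token begins with '?', so it can be appended to the blob URI.
        return blob.Uri.AbsoluteUri + blob.GetSharedAccessSignature(policy);
    }
}
```

For browser clients, a CORS rule on the storage account is needed as well. Note that with this approach any quota check has to happen when the SAS is issued or after the upload completes, since the API no longer sees the bytes.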

Answer 2

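Solved it, with the help of this Gist.

Here's how I am using it, along with a clever "hack" to get the actual file size without copying the file into memory first. Oh, and it's twice as fast (obviously).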

// Create an instance of our provider.
// See https://gist.github.com/JamesRandall/11088079#file-blobstoragemultipartstreamprovider-cs for implementation.
var provider = new BlobStorageMultipartStreamProvider();

// This is where the uploading is happening, by writing to the Azure stream
// as the file stream from the request is being read, leaving almost no memory footprint.
await this.Request.Content.ReadAsMultipartAsync(provider);

// We want to know the exact size of the file, but this info is not available to us before
// we've uploaded everything - which has just happened.
// We get the stream from the content (and that stream is the same instance we wrote to).
var stream = await provider.Contents.First().ReadAsStreamAsync();

// Problem: If you try to use stream.Length, you'll get an exception, because BlobWriteStream
// does not support it.

// But this is where we get fancy.

// Position == size, because the file has just been written to it, leaving the
// position at the end of the file.
var sizeInBytes = stream.Position;

Voilà, you have the uploaded file's size without having to copy the file into your web instance's memory.
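The same "position equals bytes written" idea can be turned into a size limit that is enforced while the upload is in flight. The following is only a sketch under assumptions: `QuotaCountingStream` is a hypothetical wrapper (it is not part of the Gist or the Azure client library) that counts bytes as they pass through and throws once a maximum is exceeded.

```csharp
using System;
using System.IO;

// Hypothetical helper: wraps a write-only destination stream, counts the bytes
// written to it, and aborts as soon as the allowed size would be exceeded.
public sealed class QuotaCountingStream : Stream
{
    private readonly Stream inner;
    private readonly long maxBytes;
    private long bytesWritten;

    public QuotaCountingStream(Stream inner, long maxBytes)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
        this.maxBytes = maxBytes;
    }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();

    // Position doubles as the running byte count, as in the trick above.
    public override long Position
    {
        get => bytesWritten;
        set => throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (bytesWritten + count > maxBytes)
        {
            throw new InvalidOperationException("Upload exceeds the allowed size.");
        }

        inner.Write(buffer, offset, count);
        bytesWritten += count;
    }

    public override void Flush() => inner.Flush();

    public override int Read(byte[] buffer, int offset, int count) =>
        throw new NotSupportedException();

    public override long Seek(long offset, SeekOrigin origin) =>
        throw new NotSupportedException();

    public override void SetLength(long value) =>
        throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            inner.Dispose();
        }

        base.Dispose(disposing);
    }
}
```

One place this could be used is the provider's GetStream override, wrapping the blob write stream before returning it. Only the synchronous Write is overridden here; a production version would probably override WriteAsync too, so that writes stay truly asynchronous.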

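As for getting the file length before the file is uploaded, that's not as easy, and I had to resort to some rather unpleasant methods to get just an approximation.

In the BlobStorageMultipartStreamProvider: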

var approxSize = parent.Headers.ContentLength.Value - parent.Headers.ToString().Length;

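This gives me a pretty close file size, off by a few hundred bytes (depending on the HTTP headers, I guess). That is good enough for me, as my quota enforcement can accept a few bytes being shaved off.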
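To see where those extra bytes come from, here is a small sketch that builds a multipart body in memory and compares its Content-Length with the size of the file part alone. `MultipartOverheadDemo`, the field name `file`, and the file name `report.bin` are made up for illustration; the difference is the boundary lines plus the part's own headers, and it will vary with the file name and headers the client sends.

```csharp
using System;
using System.Net.Http;

// Hypothetical demo: measures how much larger a multipart body is than the
// single file it carries.
public static class MultipartOverheadDemo
{
    public static long MeasureOverhead(int payloadBytes)
    {
        var payload = new byte[payloadBytes];

        using (var multipart = new MultipartFormDataContent())
        {
            // One file part, as in the upload scenario above.
            multipart.Add(new ByteArrayContent(payload), "file", "report.bin");

            // Content-Length covers boundaries, part headers, and the payload.
            long total = multipart.Headers.ContentLength
                ?? throw new InvalidOperationException("Content length is unknown.");

            return total - payloadBytes;
        }
    }
}
```

Calling `MultipartOverheadDemo.MeasureOverhead(1000)` returns the number of bytes by which Content-Length overstates the file size for this particular body.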

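Just to show off, here is the memory footprint, as reported by the insanely accurate and advanced Performance tab in Task Manager.

Before - using MemoryStream, reading it into memory before uploading

After - writing directly to Blob storage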


