Extract embedded files from azure blob in c#

Question

I have a PDF file stored in a blob, and the PDF contains embedded files. I want to extract those embedded files from my blob.

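Here is what I have done so far:

I created an HTTP trigger function app
Established a connection to the storage container
Able to fetch the blob.

To get the embedded files I am using the following code: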

// Namespaces for the v11 storage SDK (Microsoft.Azure.Storage.Blob)
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

namespace PDFDownloader {
  public static class Function1 {
    [FunctionName("Function1")]
    public static async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequest req, ILogger log) {
      log.LogInformation($"GetVolumeData function executed at: {DateTime.Now}");
      try {
        // Parameter is my own settings class (not shown here)
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(Parameter.ConnectionString);
        CloudBlobClient cloudBlobClient = storageAccount.CreateCloudBlobClient();
        CloudBlobContainer cloudcontainer = cloudBlobClient.GetContainerReference(Parameter.SuccessContainer);

        // List the first segment of blobs in the container
        BlobResultSegment resultSegment = await cloudcontainer.ListBlobsSegmentedAsync(currentToken: null);
        IEnumerable<IListBlobItem> blobItems = resultSegment.Results;

        string response = "";
        int count = 0;
        //string blobName = "";

        foreach (IListBlobItem item in blobItems) {
          var type = item.GetType();
          if (type == typeof(CloudBlockBlob)) {
            CloudBlockBlob blob = (CloudBlockBlob) item;
            count++;
            var blobname = blob.Name;
            // response = blobname;
            // Reads the blob content as text (blocks on the task)
            response = blob.DownloadTextAsync().Result;
            //response = blob.DownloadToStream().Result;
          }
        }

        if (count == 0) {
          return new OkObjectResult("Error : File Not Found !!");
        } else {
          return new OkObjectResult(Convert.ToString(response));
        }
      } catch (Exception ex) {
        log.LogError($" Function Exception Message: {ex.Message}");
        return new OkObjectResult(ex.Message.ToString());
      } finally {
        log.LogInformation($"Function- ENDED ON : {DateTime.Now}");
      }
    }
  }
}

How can I read the embedded file from my blob file response and send it over HTTP?

Answer 1

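Apart from the fact that your code needs a lot of cleanup and that you should read up on the proper use of async, I believe your actual problem lies here: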

FileStream inputStream = new FileStream(response, FileMode.Open);

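The response object contains the text content of the blob you downloaded previously. The FileStream constructor, however, expects a file path. Since you don't have a file here, FileStream is not the right thing to use. Either download the blob to a temp file or download it directly as a string.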
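One caveat worth checking when the blob is a PDF: a PDF is binary, and decoding binary bytes as text is usually lossy. The sketch below uses only the .NET base library and made-up sample bytes (a PDF-style header followed by a few high bytes) to show what a decode-and-re-encode round trip does. The class and method names are illustrative only.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryAsTextDemo
{
    // Decodes bytes as UTF-8 text and encodes the text back to bytes,
    // which is roughly what happens when binary content is handled as a string.
    public static byte[] RoundTripThroughText(byte[] original)
    {
        string asText = Encoding.UTF8.GetString(original);
        return Encoding.UTF8.GetBytes(asText);
    }

    public static void Main()
    {
        // "%PDF-1.7\n%" followed by four bytes that are not valid UTF-8 (sample data)
        byte[] pdfStart =
        {
            0x25, 0x50, 0x44, 0x46, 0x2D, 0x31, 0x2E, 0x37, 0x0A, 0x25,
            0xE2, 0xE3, 0xCF, 0xD3
        };

        byte[] roundTripped = RoundTripThroughText(pdfStart);

        // False: the invalid bytes were replaced during decoding
        Console.WriteLine(pdfStart.SequenceEqual(roundTripped));
    }
}
```

If the goal is to hand the PDF to a parser or return it over HTTP unchanged, keeping the content as a stream or a byte array avoids this problem.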

Also, do yourself a favor and switch to the latest version of the Storage Blob SDK (https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/storage/Azure.Storage.Blobs#downloading-a-blob).
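As a rough illustration of what the newer SDK looks like, here is a minimal sketch that downloads one blob into memory with the Azure.Storage.Blobs (v12) package. The class name, method name, and the three string parameters are placeholders, not part of the question's code, and the sketch only fetches the PDF bytes; pulling the embedded attachments out of the PDF still needs a PDF library.

```csharp
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;

public static class BlobDownloadSketch
{
    // Downloads a single blob and returns its raw bytes.
    // connectionString, containerName and blobName are supplied by the caller (placeholders).
    public static async Task<byte[]> DownloadBlobBytesAsync(
        string connectionString, string containerName, string blobName)
    {
        var container = new BlobContainerClient(connectionString, containerName);
        BlobClient blob = container.GetBlobClient(blobName);

        using (var buffer = new MemoryStream())
        {
            // Writes the blob's bytes into the stream without decoding them as text
            await blob.DownloadToAsync(buffer);
            return buffer.ToArray();
        }
    }
}
```

Inside an HTTP-triggered function, the returned bytes could then be sent back with something like `return new FileContentResult(bytes, "application/pdf");`, awaiting the call rather than blocking on `.Result`.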

Answer 2

Consider a blob to be the equivalent of a file (check it).

Check this line:

response =  Convert.ToString(blob.DownloadTextAsync().Result);

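Is your blob content a valid file path?

Maybe you are not using the constructor of the FileStream class correctly: public FileStream (string path, System.IO.FileMode mode). This constructor can throw many different exceptions; try to find yours.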
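To make the point about the constructor concrete, the following sketch (base library only, with hypothetical names and sample text) passes blob-like content to `new FileStream(path, mode)` and reports which exception comes back, then shows `MemoryStream` as one way to get a `Stream` over content that is already in memory. The exact exception type depends on the platform and on the string, so none is assumed here.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamChoiceDemo
{
    // Treats the string as a file path, the way new FileStream(path, mode) does.
    // Returns the exception type name on failure, or "opened" if such a file exists.
    public static string TryOpenAsPath(string candidate)
    {
        try
        {
            using (var stream = new FileStream(candidate, FileMode.Open))
            {
                return "opened";
            }
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    // Wraps bytes that are already in memory; no file path is involved.
    public static Stream WrapInMemory(byte[] content)
    {
        return new MemoryStream(content, writable: false);
    }

    public static void Main()
    {
        // Sample text standing in for downloaded blob content
        string blobText = "%PDF-1.7 example content, not a path";

        // Prints the exception type raised when the content is used as a path
        Console.WriteLine(TryOpenAsPath(blobText));

        using (Stream inMemory = WrapInMemory(Encoding.ASCII.GetBytes(blobText)))
        {
            // The same content is readable as a stream without touching the file system
            Console.WriteLine(inMemory.Length);
        }
    }
}
```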

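Also, as recommended in the previous answer, it is worth using the Azure.Storage.Blobs package, which is based on SDK version 12; you are currently using SDK version 11 (Microsoft.Azure.Storage.Blob).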

Answer 3

using Bytescout.PDFExtractor;

// Read your blob like this
var stream1 = await blob.OpenReadAsync();

AttachmentExtractor extractor = new AttachmentExtractor();
extractor.RegistrationName = "demo";
extractor.RegistrationKey = "demo";

// Load the PDF document from the blob stream.
// LoadDocumentFromFile expects a file path, so the stream-based loader is used here.
extractor.LoadDocumentFromStream(stream1);

for (int i = 0; i < extractor.Count; i++)
{
    Console.WriteLine("Saving attachment: " + extractor.GetFileName(i));
    // Save attachment to file
    extractor.Save(i, extractor.GetFileName(i));
    Console.WriteLine("File size: " + extractor.GetSize(i));
}

extractor.Dispose();

c#

pdf

azure

azure-function-app