Creating a file as a stream and uploading to Azure

I am using the ChoETL and ChoETL.Parquet libraries to create a Parquet file from some other data. Creating the file locally works:

    // Verbatim interpolated string ($@) so the backslashes in the path are not treated as escapes.
    using (ChoParquetWriter parser = new ChoParquetWriter($@"..\..\..\parquet_files\{club}_events.parquet"))
    {
        parser.Write(events);
    }

In this snippet, events is a list of objects containing strings. They are converted to Parquet data.

So far I have written code that uploads to Azure, but it needs a local file as input.

    BlobServiceClient blobServiceClient = new BlobServiceClient("REDACTED");
    var containerClient = blobServiceClient.GetBlobContainerClient("base-test");
    BlobClient blobClient = containerClient.GetBlobClient($"Base/{RequestTime.Year}/{RequestTime.Month}/{RequestTime.Day}/{RequestTime.Hour}/{RequestTime.Minute}/events.parquet");
    // The using declaration disposes the stream at the end of the scope, so no explicit Close() is needed.
    using FileStream uploadFileStream = File.OpenRead(@"..\..\..\events.parquet");
    await blobClient.UploadAsync(uploadFileStream, true);

I need to create the file in memory and then upload it to Azure Blob Storage. How can I do that? To clarify: it is the Parquet file that I need to upload.

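For this, you can call BlockBlobClient.OpenWriteAsync to get a writable stream and pass that stream to ChoParquetWriter. The writer then writes directly to the Azure blob, with no local file involved.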

For example:

    // Requires: using Azure.Storage.Blobs; using Azure.Storage.Blobs.Models;
    //           using Azure.Storage.Blobs.Specialized; using ChoETL;
    List<EmployeeRecSimple> objs = new List<EmployeeRecSimple>();

    EmployeeRecSimple rec1 = new EmployeeRecSimple();
    rec1.Id = 1;
    rec1.Name = "Mark";
    objs.Add(rec1);

    EmployeeRecSimple rec2 = new EmployeeRecSimple();
    rec2.Id = 2;
    rec2.Name = "Jason";
    objs.Add(rec2);

    BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
    var desContainer = blobServiceClient.GetBlobContainerClient("output");
    var desBlob = desContainer.GetBlockBlobClient("my.parquet");
    var options = new BlockBlobOpenWriteOptions {
        HttpHeaders = new BlobHttpHeaders {
            // MimeMapping comes from System.Web (.NET Framework); elsewhere, set a literal content type instead.
            ContentType = MimeMapping.GetMimeMapping("parquet"),
        },
        // Progress updates about data transfers
        ProgressHandler = new Progress<long>(
            progress => Console.WriteLine("Progress: {0} bytes written", progress))
    };

    // Open a write stream on the blob (overwrite: true) and let the writer emit Parquet straight into it.
    using (var outStream = await desBlob.OpenWriteAsync(true, options).ConfigureAwait(false))
    using (ChoParquetWriter parser = new ChoParquetWriter(outStream)) {
        parser.Write(objs);
    }

    public partial class EmployeeRecSimple
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }
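
If you literally want the whole Parquet file in memory before uploading, as the question asks, a MemoryStream can stand in for the local file. The following is only a sketch under some assumptions: ParquetUploader and UploadAsParquetAsync are hypothetical names, the ChoParquetWriter call mirrors the stream constructor used above, and the bytes are copied out with ToArray() so the code does not rely on whether the writer leaves the stream open when it is disposed. It reuses the BlobClient.UploadAsync(stream, overwrite) call from the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using ChoETL;

public static class ParquetUploader
{
    // Hypothetical helper: serializes the records to Parquet in memory,
    // then uploads the resulting bytes to the given blob.
    public static async Task UploadAsParquetAsync(
        IEnumerable<object> records, BlobClient blobClient)
    {
        byte[] parquetBytes;
        using (var buffer = new MemoryStream())
        {
            // Dispose the writer before reading the buffer so that
            // all Parquet data, including the footer, has been written.
            using (ChoParquetWriter writer = new ChoParquetWriter(buffer))
            {
                writer.Write(records);
            }

            // ToArray() still works if the writer closed the stream on dispose.
            parquetBytes = buffer.ToArray();
        }

        // Upload from a fresh stream positioned at the start of the data.
        using (var uploadStream = new MemoryStream(parquetBytes))
        {
            await blobClient.UploadAsync(uploadStream, overwrite: true);
        }
    }
}
```

With the names from the question, a call would look like await ParquetUploader.UploadAsParquetAsync(events, blobClient). The trade-off is that the entire file is held in memory (twice, briefly, because of the copy), whereas the OpenWriteAsync approach streams data to the blob as it is written, so that one is likely the better fit for large files.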