Failing to upload larger blobs to Azure: azure.core.exceptions.ServiceRequestError: The operation did not complete (write) (_ssl.c:2317)

I am trying to upload some larger blobs (>50MB) to my Azure storage container using the Python SDK:

import os

from azure.storage.blob import BlobServiceClient

connect_str = os.environ['AZURE_STORAGE_CONNECTION_STRING']
blob_service_client = BlobServiceClient.from_connection_string(connect_str)

def upload_blob(file_path):
    if os.path.exists(file_path):
        with open(file_path, 'rb') as data:
            blob_client = blob_service_client.get_blob_client(container='foo', blob=file_path)

            print(f"Uploading file {file_path} to blob storage...")
            print(os.path.getsize(file_path))
            return blob_client.upload_blob(data, length=os.path.getsize(file_path))
    else:
        print(f"File {file_path} not found. Please store the file first before uploading")
        return False

When I run this, however, I get an azure.core.exceptions.ServiceRequestError:

Traceback (most recent call last):
  File "C:/Users/.../storage_controller.py", line 96, in <module>
    upload_blob(config.VECTORIZER_PATH)
  File "C:/Users/.../storage_controller.py", line 34, in upload_blob
    return blob_client.upload_blob(data, length=os.path.getsize(file_path))
  File "C:\Users\...\venv\lib\site-packages\azure\core\tracing\decorator.py", line 83, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_blob_client.py", line 496, in upload_blob
    return upload_block_blob(**options)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_upload_helpers.py", line 104, in upload_block_blob
    **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_generated\operations\_block_blob_operations.py", line 207, in upload
    pipeline_response = self._client._pipeline.run(request, stream=False, **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 211, in run
    return first_node.send(pipeline_request)  # type: ignore
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  [Previous line repeated 4 more times]
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\policies\_redirect.py", line 157, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\policies.py", line 515, in send
    raise err
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\policies.py", line 489, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\policies.py", line 290, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 71, in send
    response = self.next.send(request)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\_base.py", line 103, in send
    self._sender.send(request.http_request, **request.context.options),
  File "C:\Users\...\venv\lib\site-packages\azure\storage\blob\_shared\base_client.py", line 312, in send
    return self._transport.send(request, **kwargs)
  File "C:\Users\...\venv\lib\site-packages\azure\core\pipeline\transport\_requests_basic.py", line 284, in send
    raise error
azure.core.exceptions.ServiceRequestError: The operation did not complete (write) (_ssl.c:2317)

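I tried a few things and found some suggestions about chunking and using the put_blob method for larger files, but those solutions don't seem to be available in the current version of the SDK, which is supposed to handle larger files on its own. Smaller files (for example, a one-line .txt file) upload without any problem. Is this an issue with the Azure SDK or a misconfiguration of my networking/SSL setup, and how can I fix it?

Thanks in advance!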

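I summarize the solution below.

If you want to upload a file to an Azure blob in chunks with the azure.storage.blob package, you can use the BlobClient.stage_block method to upload each chunk. Once every chunk is uploaded, call BlobClient.commit_block_list to assemble the staged blocks into a single blob.

For example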

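import os
import uuid

from azure.storage.blob import BlobBlock, BlobServiceClient

# Connection string read from the environment, as in the question
connection_string = os.environ['AZURE_STORAGE_CONNECTION_STRING']

# Instantiate a new BlobServiceClient using a connection string
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
# Instantiate a new ContainerClient (fill in your container name)
container_client = blob_service_client.get_container_client('')
blob_client = container_client.get_blob_client("csvfile.csv")

# Upload the file block by block
block_list = []
# 4 MiB per block. A block blob holds at most 50,000 blocks, so 1 KiB blocks
# would cap the blob at roughly 48.8 MiB, below the >50MB files in the question.
chunk_size = 4 * 1024 * 1024
with open('csvfile.csv', 'rb') as f:
    while True:
        read_data = f.read(chunk_size)
        if not read_data:
            break  # done
        # Block IDs must be unique and of equal length within one blob
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

# Commit the staged blocks, in order, as a single blob
blob_client.commit_block_list(block_list)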

For more details, please refer to here
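
If you need this in more than one place, the stage-then-commit loop can be wrapped in a small helper. The following is only a sketch under some assumptions: the names iter_blocks, upload_in_blocks, make_block and MAX_BLOCKS_PER_BLOB are hypothetical and not part of the SDK, and the helper only relies on the client object exposing stage_block and commit_block_list as used above. It also guards against exceeding the 50,000-block limit of a block blob.

```python
import uuid
from typing import BinaryIO, Callable, Iterator, Tuple

# Service limit on the number of blocks in a single block blob
MAX_BLOCKS_PER_BLOB = 50_000


def iter_blocks(stream: BinaryIO,
                chunk_size: int = 4 * 1024 * 1024) -> Iterator[Tuple[str, bytes]]:
    """Yield (block_id, data) pairs read sequentially from a binary stream."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    count = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break  # end of stream
        count += 1
        if count > MAX_BLOCKS_PER_BLOB:
            raise ValueError("too many blocks for one blob; increase chunk_size")
        # uuid4 strings always have the same length, as block IDs require
        yield str(uuid.uuid4()), data


def upload_in_blocks(blob_client,
                     stream: BinaryIO,
                     make_block: Callable[[str], object],
                     chunk_size: int = 4 * 1024 * 1024) -> int:
    """Stage every chunk, then commit them in order. Returns the block count.

    blob_client is assumed to offer stage_block(block_id=..., data=...) and
    commit_block_list(block_list); make_block turns a block ID into the
    object that commit_block_list expects.
    """
    block_list = []
    for block_id, data in iter_blocks(stream, chunk_size):
        blob_client.stage_block(block_id=block_id, data=data)
        block_list.append(make_block(block_id))
    blob_client.commit_block_list(block_list)
    return len(block_list)
```

With the real SDK you would open the file in 'rb' mode and pass make_block=lambda block_id: BlobBlock(block_id=block_id), using the blob_client created in the example above. Keeping the block factory as a parameter means the helper itself has no import-time dependency on azure.storage.blob, which makes it easy to exercise against a stand-in client.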