Reading file from Azure blob storage using pickle or dill without saving to disk

Question

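I am trying to read the weights of a machine learning model from an Azure Storage blob in Python. This is supposed to run in Azure Functions, so I don't think I can use approaches that save the blob to disk.

I am using azure-storage-blob 12.5.0, not the legacy version.

I have tried loading the .pkl file with dill.loads, like this: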

connection_string = 'my_connection_string'
blob_client = BlobClient.from_connection_string(connection_string, container_name, blob_name)
downloader = blob_client.download_blob(0)

with BytesIO() as f:
    downloader.readinto(f)
    weights = dill.loads(f)

Which returns:

>>> TypeError: a bytes-like object is required, not '_io.BytesIO'

I am not sure how an approach using pickle would work either. How can I solve this?

Answer 1

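The fix is to read the blob into a bytes object with readall() and pass that to pickle.loads, which expects bytes rather than a BytesIO object: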

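import pickle

from azure.storage.blob import BlobClient


def get_weights_blob(blob_name):
    # container_name is assumed to be defined elsewhere, as in the question
    connection_string = 'my_connection_string'
    blob_client = BlobClient.from_connection_string(connection_string, container_name, blob_name)
    downloader = blob_client.download_blob(0)

    # readall() returns the blob's content as bytes, which pickle.loads accepts
    b = downloader.readall()
    weights = pickle.loads(b)

    return weights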

Then use the function to retrieve the weights:

weights = get_weights_blob(blob_name='myPickleFile')
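
For anyone wondering why the original attempt failed: loads() expects a bytes object, while the question passed it the BytesIO buffer itself. Below is a minimal sketch of the options, runnable without an Azure account. FakeDownloader is a hypothetical stand-in I made up for the object returned by download_blob(); it only mimics readall() and readinto(). The sketch uses pickle, but dill.loads and dill.load take the same kinds of arguments.

```python
import pickle
from io import BytesIO


class FakeDownloader:
    """Hypothetical stand-in for the object returned by download_blob()."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def readall(self) -> bytes:
        # Return the whole payload as a bytes object
        return self._payload

    def readinto(self, stream) -> int:
        # Write the payload into a writable stream and return the byte count
        return stream.write(self._payload)


# Made-up weights, pickled in memory to play the role of the blob content
weights = {"layer1": [0.1, 0.2], "bias": 0.5}
downloader = FakeDownloader(pickle.dumps(weights))

# Option 1: loads() takes bytes, so pass it the result of readall()
from_bytes = pickle.loads(downloader.readall())

# Option 2: load() takes a file-like object; rewind the buffer before reading
with BytesIO() as f:
    downloader.readinto(f)
    f.seek(0)
    from_stream = pickle.load(f)

# Option 3: keep loads(), but hand it the buffer's contents via getvalue()
with BytesIO() as f:
    downloader.readinto(f)
    from_getvalue = pickle.loads(f.getvalue())
```

All three never touch the disk. Option 1 is the shortest if the blob fits comfortably in memory, which is what the answer above does.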

Answer 2

Here is my working sample:

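import pickle

import azure.functions as func
from azure.storage.blob import BlobClient

def main(req: func.HttpRequest) -> func.HttpResponse:
    connection_string = ''
    blob_client = BlobClient.from_connection_string(connection_string, 'blog-storage-containe', 'blobfile')
    downloader = blob_client.download_blob(0)

    # Read the blob into memory as bytes and unpickle it without touching disk
    b = downloader.readall()
    loaded_model = pickle.loads(b)

    return func.HttpResponse('Model loaded')  # placeholder response, not in the original sample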

And the requirements.txt file:

azure-functions
numpy
joblib
azure-storage-blob
scikit-learn

pickle

dill

azure-blob-storage

azure-functions