How to use lxml iterparse with an Azure StorageStreamDownloader?

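I'm currently using lxml.etree.iterparse to iterate over an XML file tag by tag. This works fine locally, but I want to move the XML file to Azure Blob Storage and process it in an Azure Function. However, I'm a bit stuck trying to parse the XML file from the StorageStreamDownloader.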

Local code

from lxml import etree

context = etree.iterparse('c:\Users\', tag='InstanceElement')

for event, elem in context:
    # processing of the tag

Streaming from Blob

from lxml import etree
from azure.storage.filedatalake import DataLakeServiceClient

connect_str = ''
service = DataLakeServiceClient.from_connection_string(conn_str=connect_str)

System = service.get_file_system_client('')
FileClient = System.get_file_client('')
Stream = FileClient.download_file()

# Stuck on what the input must be for iterparse
context = etree.iterparse(, tag='InstanceElement')

for event, elem in context:
    # processing of the tag

I don't know what the input to iterparse must be, so does anyone have an idea how to parse the XML file while streaming it?

Try this:

from lxml import etree
from azure.storage.filedatalake import DataLakeServiceClient
from io import BytesIO

connect_str = ''
service = DataLakeServiceClient.from_connection_string(conn_str=connect_str)

System = service.get_file_system_client('')
FileClient = System.get_file_client('test.xml')
content = FileClient.download_file().readall()

context = etree.iterparse(BytesIO(content), tag='InstanceElement')
for event, elem in context:
    print(elem.text)
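
Note that readall() loads the whole blob into memory before parsing starts. If the file is large, one possible alternative is to feed iterparse from the downloader's chunk iterator instead. The following is only a sketch under assumptions: ChunkReader and iter_elements are hypothetical helper names (not part of lxml or the Azure SDK), and it assumes the StorageStreamDownloader in your SDK version exposes chunks() as an iterator of bytes. I have not run it against a real storage account.

```python
import io


class ChunkReader(io.RawIOBase):
    """Hypothetical adapter: exposes an iterator of bytes chunks as a readable file-like object."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = b""

    def readable(self):
        return True

    def readinto(self, b):
        # Refill the buffer from the next non-empty chunk; return 0 at end of stream.
        while not self._buffer:
            try:
                self._buffer = next(self._chunks)
            except StopIteration:
                return 0
        n = min(len(b), len(self._buffer))
        b[:n] = self._buffer[:n]
        self._buffer = self._buffer[n:]
        return n


def iter_elements(chunks, tag):
    """Yield each element with the given tag, parsing the chunks incrementally."""
    # lxml is imported here so that the adapter above depends only on the standard library.
    from lxml import etree

    for _event, elem in etree.iterparse(ChunkReader(chunks), tag=tag):
        yield elem
        elem.clear()  # drop the element's content once the caller has processed it
```

With the FileClient from the snippet above, this might be used as: for elem in iter_elements(FileClient.download_file().chunks(), 'InstanceElement'): print(elem.text). Each element is cleared after the loop body runs, so read what you need from it inside the loop.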

Contents of my test.xml:

Result: