How to parse a nested EventGrid message?

I'm just getting to know Azure Functions triggered by a Queue Storage event handler. In this case, Queue Storage is handling Event Grid messages.

Question: how can I use Python to access the various values nested inside "body" below?

The Queue Storage message looks like this (indented for readability):

{
    "id": "<big-long-guid>", 
    "body": "{
        \"topic\":\"/subscriptions/<big-long-guid>/resourceGroups/azureStorage/providers/Microsoft.Storage/storageAccounts/stgcool\",
        \"subject\":\"/blobServices/default/containers/cont-pics/blobs/profile_pic.jpg\",
        \"eventType\":\"Microsoft.Storage.BlobCreated\",
        \"id\":\"<big-long-guid>\",
        \"data\":{
            \"api\":\"PutBlob\",
            \"clientRequestId\":\"<big-long-guid>\",
            \"requestId\":\"<big-long-guid>\",
            \"eTag\":\"0x8D94CE0B2F5CD71\",
            \"contentType\":\"image/jpeg\",
            \"contentLength\":35799,
            \"blobType\":\"BlockBlob\",
            \"blobUrl\":\"https://stgcool.blob.core.windows.net/cont-pics/profile_pic.jpg\",
            \"url\":\"https://stgcool.blob.core.windows.net/cont-pics/profile_pic.jpg\",
            \"sequencer\":\"00000000000000000000000000003730000000000000312a\",
            \"storageDiagnostics\":{
                \"batchId\":\"<big-long-guid>\"
            }
        },
        \"dataVersion\":\"\",
        \"metadataVersion\":\"1\",
        \"eventTime\":\"2021-07-22T07:17:00.8479184Z\"
    }", 
    "expiration_time": "2021-07-30T05:10:37+00:00", 
    "insertion_time": "2021-07-23T05:10:37+00:00", 
    "time_next_visible": "2021-07-23T05:20:37+00:00", 
    "pop_receipt": "cOQ8m5lN2QgBAAAA", 
    "dequeue_count": 1
}

Here is the sample function code that produced the log above:

import logging
import json

import azure.functions as func

def main(msg: func.QueueMessage):
    logging.info('Python queue trigger function processed a queue item.')

    result = json.dumps({
        'id': msg.id,
        'body': msg.get_body().decode('utf-8'),
        'expiration_time': (msg.expiration_time.isoformat()
                            if msg.expiration_time else None),
        'insertion_time': (msg.insertion_time.isoformat()
                           if msg.insertion_time else None),
        'time_next_visible': (msg.time_next_visible.isoformat()
                              if msg.time_next_visible else None),
        'pop_receipt': msg.pop_receipt,
        'dequeue_count': msg.dequeue_count
    })

    logging.info(result)

What I tried:

Edit 1:

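I'm writing this on the assumption that you want to change how result is passed to logging. If what you actually want is to parse it out of a poorly constructed Queue Storage message, let me know.

Given the example you provided, treated as a dictionary: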

d = {
    "id": "<big-long-guid>", 
    "body": "{
        \"topic\":\"/subscriptions/<big-long-guid>/resourceGroups/azureStorage/providers/Microsoft.Storage/storageAccounts/stgcool\",
        \"subject\":\"/blobServices/default/containers/cont-pics/blobs/profile_pic.jpg\",
        \"eventType\":\"Microsoft.Storage.BlobCreated\",
        \"id\":\"<big-long-guid>\",
        \"data\":{
            \"api\":\"PutBlob\",
            \"clientRequestId\":\"<big-long-guid>\",
            \"requestId\":\"<big-long-guid>\",
            \"eTag\":\"0x8D94CE0B2F5CD71\",
            \"contentType\":\"image/jpeg\",
            \"contentLength\":35799,
            \"blobType\":\"BlockBlob\",
            \"blobUrl\":\"https://stgcool.blob.core.windows.net/cont-pics/profile_pic.jpg\",
            \"url\":\"https://stgcool.blob.core.windows.net/cont-pics/profile_pic.jpg\",
            \"sequencer\":\"00000000000000000000000000003730000000000000312a\",
            \"storageDiagnostics\":{
                \"batchId\":\"<big-long-guid>\"
            }
        },
        \"dataVersion\":\"\",
        \"metadataVersion\":\"1\",
        \"eventTime\":\"2021-07-22T07:17:00.8479184Z\"
    }", 
    "expiration_time": "2021-07-30T05:10:37+00:00", 
    "insertion_time": "2021-07-23T05:10:37+00:00", 
    "time_next_visible": "2021-07-23T05:20:37+00:00", 
    "pop_receipt": "cOQ8m5lN2QgBAAAA", 
    "dequeue_count": 1
}

We would expect to look up a value inside "body" like this:

d['body']['data']['api']
PutBlob

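Unfortunately, if you try this you will get a TypeError, because you are trying to index a string with another string. str objects work as keys of a dict, not as indices into a str. The error is raised because the "body" value, which looks like a dict, is actually a str (note the " on either side of the curly braces).

Fix this by updating the 'body' entry in your json.dumps call so that the body is parsed with json.loads first: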
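To make the failure concrete, here is a minimal sketch using a hypothetical, trimmed-down version of the message (only "id" and a shortened "body" are kept; the variable names are illustrative). It only shows that "body" is a str and that indexing it with a key fails:

```python
# Hypothetical, trimmed-down message: "body" holds JSON *text*, not a dict.
d = {
    "id": "<big-long-guid>",
    "body": '{"data": {"api": "PutBlob"}}',
}

body_type = type(d["body"]).__name__  # the value is a plain string

error = None
try:
    d["body"]["data"]  # indexing a str with a str key
except TypeError as exc:
    # The exact wording of the message varies between Python versions.
    error = exc
```

Checking the type of the value first is a quick way to tell whether another decoding step is still needed.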

    result = json.dumps({
        'id': msg.id,
        'body': json.loads(msg.get_body().decode('utf-8')),
...
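Once the body is decoded, the nested values are ordinary dict lookups. The sketch below assumes the queue message body is the Event Grid event as UTF-8 JSON text, as the log in the question suggests. parse_event is a hypothetical helper name, and sample_body stands in for what msg.get_body() would return, so the snippet runs without the Functions runtime:

```python
import json


def parse_event(raw_body: bytes) -> dict:
    """Decode the raw queue message body into a dict (hypothetical helper)."""
    return json.loads(raw_body.decode("utf-8"))


# Stand-in for msg.get_body(): a trimmed copy of the event from the question.
sample_body = json.dumps({
    "subject": "/blobServices/default/containers/cont-pics/blobs/profile_pic.jpg",
    "eventType": "Microsoft.Storage.BlobCreated",
    "data": {
        "api": "PutBlob",
        "contentLength": 35799,
        "storageDiagnostics": {"batchId": "<big-long-guid>"},
    },
}).encode("utf-8")

event = parse_event(sample_body)

event_type = event["eventType"]
api = event["data"]["api"]
batch_id = event["data"]["storageDiagnostics"]["batchId"]
# .get() returns None instead of raising KeyError when a field is absent.
content_type = event["data"].get("contentType")
```

Inside main, the same helper would be called as event = parse_event(msg.get_body()).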

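Edit:

Re-reading your question, what I called a dict in your example appears to still be a string at the point where you get it. In that case you may run into more problems, because "body" is not well-formed.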

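If that is the case, you can strip those pesky " marks around the "body" value, and unescape the \" sequences inside it, by running:

message = message.replace('"{', '{').replace('}"', '}').replace('\\"', '"')

before reading it: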

d = json.loads(message)
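String replacement can misfire if any value happens to contain "{ or }" itself. An alternative not taken from the answer above is to decode twice: once for the outer message, and once for the "body" string inside it. This is a sketch under the assumption that message is the JSON text produced by json.dumps in the question; the sample below is built by hand purely for illustration:

```python
import json

# Hand-built stand-in for the logged string: "body" is JSON text nested
# inside the outer JSON document, so its quotes are escaped.
inner = {"eventType": "Microsoft.Storage.BlobCreated", "data": {"api": "PutBlob"}}
message = json.dumps({
    "id": "<big-long-guid>",
    "body": json.dumps(inner),
    "dequeue_count": 1,
})

d = json.loads(message)            # first pass: outer message, body is still a str
d["body"] = json.loads(d["body"])  # second pass: body becomes a dict

api = d["body"]["data"]["api"]
```

If the text was indented by hand, as in the question, the string would contain raw newlines; json.loads(message, strict=False) accepts control characters inside strings in that case.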