mv-expand operator in Azure Data Explorer not working as expected for JSON Array

I am trying to take a JSON array and create a record for each item in the array in Azure Data Explorer, following the documentation, but things are not working as expected.

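My intermediate table has some top-level fields that I want to keep, and those come through fine, but all of the fields from inside the array are blank.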

.create function RecordsExpandTest() {
    records_intermediate_test
    | mv-expand records_test = answers
    | project
        fullraw = tostring(fullraw),
        question = tostring(question),
        question_class = tostring(question_class),
        question_raw = tostring(question_raw),
        answer_class = tostring(answers["class"]),
        answer_type = tostring(answers["type"]),
        answer_raw = tostring(answers["raw"]),
        request_time = todatetime(request_time)
}

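When I ingest a row containing an array of 3 answers into the intermediate table (records_intermediate_test), 3 rows are created in the final table (records_test), but all of the answer-related fields are empty, even though they are not empty in the source data.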

Create the intermediate table:

.create table records_intermediate_test(fullraw: string, question: string, question_class: string, question_raw: string, answers : dynamic, request_time: datetime)

Create the final table:

.create table records_test(fullraw: string, question: string, question_class: string, question_raw: string, answer_class : string, answer_type: string, answer_raw : string, request_time: datetime)

Alter the final table so that its update policy applies the mv-expand function:

.alter table records_test policy update @'[{"Source": "records_intermediate_test", "Query": "RecordsExpandTest()", "IsEnabled": "True"}]'
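
A quick way to check this step in isolation, sketched here on the assumption that the table and function names above are used unchanged: show the policy that is attached to records_test, then run the function as an ordinary query to see the rows it would produce from whatever is currently in records_intermediate_test. The `take 10` limit is an arbitrary choice for illustration.

```kusto
// Run each statement separately.

// 1. Show the update policy currently attached to the final table.
.show table records_test policy update

// 2. Run the function as an ordinary query to inspect the rows it produces.
RecordsExpandTest()
| take 10
```

If the second statement already returns empty answer fields, the problem lies in the function itself rather than in the policy or the ingestion.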

Example row from records_intermediate_test:

"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answers": [
    {
        "class": "C",
        "type": "C",
        "raw": "TEST"
    },
    {
        "class": "B",
        "type": "B",
        "raw": "TEST"
    },
    {
        "class": "A",
        "type": "A",
        "raw": "TEST"
    }
],
"request_time": 2019-01-01T10:07:49.0105654Z

Example row from records_test:

"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": ,
"answer_type": ,
"answer_raw": ,
"request_time": 2019-01-01T10:07:49.0105654Z

^ This row is repeated 3 times in the table.

Expected rows in records_test:

"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": A,
"answer_type": A,
"answer_raw": TEST,
"request_time": 2019-01-01T10:07:49.0105654Z

"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": B,
"answer_type": B,
"answer_raw": TEST,
"request_time": 2019-01-01T10:07:49.0105654Z

"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": C,
"answer_type": C,
"answer_raw": TEST,
"request_time": 2019-01-01T10:07:49.0105654Z

The input to the intermediate table always contains an array with exactly 1 question. Here is the mapping:

.create table records_intermediate_test ingestion json mapping 'mappingtest' 
'['
'   { "column" : "fullraw", "Properties":{"Path":"$.fullraw"}},'
'   { "column" : "question", "Properties":{"Path":"$.question[0].question"}},'
'   { "column" : "question_class", "Properties":{"Path":"$.question[0].class"}},'
'   { "column" : "question_raw", "Properties":{"Path":"$.question[0].raw"}},'
'   { "column" : "answers", "Properties":{"Path":"$.answers"}},'
'   { "column" : "request_time", "Properties":{"Path":"$.request_time"}}'
']'

Example of the raw JSON input for the records_intermediate_test table:

{
"fullraw": "TEST",
"question": "TEST",
"question_class": "TEST",
"answers": [
    {
        "class": "C",
        "type": "C",
        "raw": "TEST"
    },
    {
        "class": "B",
        "type": "B",
        "raw": "TEST"
    },
    {
        "class": "A",
        "type": "A",
        "raw": "TEST"
    }
],
"request_time": "2019-01-01T10:07:49.0105654Z"
}

Change the function used by the update policy so that it includes these fields:

.create function RecordsExpandTest() {
records_intermediate_test
| mv-expand records_test = answers
| project
    fullraw = tostring(fullraw),
    question = tostring(question),
    question_class = tostring(question_class),
    question_raw = tostring(question_raw),
    answer_class = tostring(answers["class"]),
    answer_type = tostring(answers["type"]),
    answer_raw = tostring(answers["raw"]),
    request_time = todatetime(request_time),
    MyTopLevelField
}

Make sure the intermediate table and the target table match your schema:

.create table RawEvents ingestion json mapping 'RawEventMapping' '[{"column":"Answers","path":"$.answers"}, {"column":"MyTopLevelField","path":"$.myTopLevelField"}]'

@Keren's answer is correct. I should only map the fields inside the array and leave the other fields as they are. Here is my working expand function:

.create function RecordsExpandTest() {
    dns_records_intermediate
    | mv-expand dns_records = answers
    | project
        request_time,
        fullraw,
        question_class,
        question,
        question_raw,
        answer_class = tostring(answers.class),
        answer_raw = tostring(answers.raw),
        answer_type = tostring(answers.type)
}

With this new mapping, when I insert an entry with 3 answers into records_intermediate_test, 1 record is created in records_intermediate_test and 3 records are created in records_test, with all fields mapped as expected.

The answers["raw"] format also works.
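
For anyone who wants to check the expansion logic without ingesting anything, here is a minimal sketch that runs as an ad hoc query. The inline datatable is a hypothetical stand-in for records_intermediate_test, cut down to two columns. Note that it uses the form of mv-expand without an alias, which is not what the functions above do, so that the answers column itself holds one array element per output row.

```kusto
// Hypothetical inline stand-in for records_intermediate_test (two columns only).
datatable(question: string, answers: dynamic)
[
    "TEST", dynamic([
        {"class": "C", "type": "C", "raw": "TEST"},
        {"class": "B", "type": "B", "raw": "TEST"},
        {"class": "A", "type": "A", "raw": "TEST"}
    ])
]
// Expand the array in place: one output row per element of answers.
| mv-expand answers
// Each row's answers is now a single property bag, so its keys can be read directly.
| project
    question,
    answer_class = tostring(answers["class"]),
    answer_type = tostring(answers["type"]),
    answer_raw = tostring(answers["raw"])
```

Three rows should come back, one per element of the array. Once the query behaves as intended, the same mv-expand and project steps can be moved into the function that the update policy calls.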