mv-expand operator in Azure Data Explorer not working as expected for JSON Array
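I am trying to take a JSON array and create a record in Azure Data Explorer for each item in the array, as described in the documentation, but it is not working as expected.
My intermediate table has some top-level fields that I want to keep, and those come through correctly, but all of the fields taken from the array are blank.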
.create function RecordsExpandTest() {
records_intermediate_test
| mv-expand records_test = answers
| project
fullraw = tostring(fullraw),
question = tostring(question),
question_class = tostring(question_class),
question_raw = tostring(question_raw),
answer_class = tostring(answers["class"]),
answer_type = tostring(answers["type"]),
answer_raw = tostring(answers["raw"]),
request_time = todatetime(request_time)
}
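When I ingest a row containing an array of 3 answers into the intermediate table (records_intermediate_test), 3 rows appear in the final table (records_test), but all of the answer-related fields are empty even though they are populated in the source data.
Create the intermediate table: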
.create table records_intermediate_test(fullraw: string, question: string, question_class: string, question_raw: string, answers : dynamic, request_time: datetime)
Create the final table:
.create table records_test(fullraw: string, question: string, question_class: string, question_raw: string, answer_class : string, answer_type: string, answer_raw : string, request_time: datetime)
Alter the final table so that the update policy applies the mv-expand:
.alter table records_test policy update @'[{"Source": "records_intermediate_test", "Query": "RecordsExpandTest()", "IsEnabled": "True"}]'
Sample row from records_intermediate_test:
"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answers": [
{
"class": "C",
"type": "C",
"raw": "TEST"
},
{
"class": "B",
"type": "B",
"raw": "TEST"
},
{
"class": "A",
"type": "A",
"raw": "TEST"
}
],
"request_time": 2019-01-01T10:07:49.0105654Z
Sample row from records_test:
"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": ,
"answer_type": ,
"answer_raw": ,
"request_time": 2019-01-01T10:07:49.0105654Z
^ repeated 3 times in the table

Expected rows from records_test:
"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": A,
"answer_type": A,
"answer_raw": TEST,
"request_time": 2019-01-01T10:07:49.0105654Z

"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": B,
"answer_type": B,
"answer_raw": TEST,
"request_time": 2019-01-01T10:07:49.0105654Z

"fullraw": TEST,
"question": TEST,
"question_class": TEST,
"answer_class": C,
"answer_type": C,
"answer_raw": TEST,
"request_time": 2019-01-01T10:07:49.0105654Z
The input to the intermediate table always contains an array with exactly 1 question. Here is the mapping:
.create table records_intermediate_test ingestion json mapping 'mappingtest'
'['
' { "column" : "fullraw", "Properties":{"Path":"$.fullraw"}},'
' { "column" : "question", "Properties":{"Path":"$.question[0].question"}},'
' { "column" : "question_class", "Properties":{"Path":"$.question[0].class"}},'
' { "column" : "question_raw", "Properties":{"Path":"$.question[0].raw"}},'
' { "column" : "answers", "Properties":{"Path":"$.answers"}},'
' { "column" : "request_time", "Properties":{"Path":"$.request_time"}}'
']'
Sample raw JSON input for the records_intermediate_test table:
{
"fullraw": "TEST",
"question": "TEST",
"question_class": "TEST",
"answers": [
{
"class": "C",
"type": "C",
"raw": "TEST"
},
{
"class": "B",
"type": "B",
"raw": "TEST"
},
{
"class": "A",
"type": "A",
"raw": "TEST"
}
],
"request_time": "2019-01-01T10:07:49.0105654Z"
}
Change the function used by the update policy so that it includes these fields:
.create function RecordsExpandTest() {
records_intermediate_test
| mv-expand records_test = answers
| project
fullraw = tostring(fullraw),
question = tostring(question),
question_class = tostring(question_class),
question_raw = tostring(question_raw),
answer_class = tostring(answers["class"]),
answer_type = tostring(answers["type"]),
answer_raw = tostring(answers["raw"]),
request_time = todatetime(request_time),
MyTopLevelField
}
Make sure the intermediate table and the target table both match your schema:
.create table RawEvents ingestion json mapping 'RawEventMapping' '[{"column":"Answers","path":"$.answers"}, {"column":"MyTopLevelField","path":"$.myTopLevelField"}]'
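One way to check that the schemas line up, sketched under the assumption that the table and function names from the question (records_test, RecordsExpandTest) are the ones in use: compare the output schema of the function with the schema of the target table, and confirm that the update policy is attached. Each statement below is meant to be run separately.

```kusto
// Show the target table's columns and types as a schema string.
.show table records_test cslschema

// Show the update policy currently attached to the target table.
.show table records_test policy update

// Run the update-policy function ad hoc and list its output columns and types.
// These should match the columns of records_test.
RecordsExpandTest()
| getschema
```

If the function projects a column that the target table lacks, or the types differ, that mismatch is a likely place to look first.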
@Keren's answer is correct. I should only map the fields inside the array and leave the other fields as they are. Here is my working expand function:
.create function RecordsExpandTest() {
dns_records_intermediate
| mv-expand dns_records = answers
| project
request_time,
fullraw,
question_class,
question,
question_raw,
answer_class = tostring(answers.class),
answer_raw = tostring(answers.raw),
answer_type = tostring(answers.type)
}
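With this new mapping, when I insert an entry with 3 answers into records_intermediate_test, 1 record is created in records_intermediate_test and 3 records are created in records_test, with all fields mapped as expected.
The answers["raw"] format works as well.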
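If the answer columns still come back empty, one thing worth checking is which column the projection indexes into. According to the mv-expand documentation, when the operator is given a new name (as in mv-expand records_test = answers), the expanded element goes into that new column and answers keeps the whole array, so answers["class"] would look up a key on an array rather than on a single element. The sketch below sidesteps this by expanding answers in place. The inline datatable and its values are stand-ins for records_intermediate_test, not real data.

```kusto
// Hypothetical inline sample standing in for records_intermediate_test.
let sample = datatable(question: string, answers: dynamic)
[
    "TEST", dynamic([
        {"class": "C", "type": "C", "raw": "TEST"},
        {"class": "B", "type": "B", "raw": "TEST"},
        {"class": "A", "type": "A", "raw": "TEST"}
    ])
];
sample
// Expand in place: after this step, answers holds one array element per row.
| mv-expand answers
| project
    question,
    answer_class = tostring(answers["class"]),
    answer_type = tostring(answers["type"]),
    answer_raw = tostring(answers["raw"])
```

This should return one row per element of the array. An equivalent form is mv-expand answer = answers followed by tostring(answer["class"]), where answer is a hypothetical column name chosen for illustration.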