U-SQL - Extract data from complex json object
I have a lot of json files structured like this:
{
"Id": "2551faee-20e5-41e4-a7e6-57bd20b02a22",
"Timestamp": "2016-12-06T08:09:57.5541438+01:00",
"EventEntry": {
"EventId": 1,
"Payload": [
"1a3e0c9e-ef69-4c6a-ac8c-9b2de2fbc701",
"DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork.FetchVisionModelsForClientOnReferenceDateAsync(System.Int64 clientId, System.DateTime referenceDate, System.Threading.CancellationToken cancellationToken)",
25,
"DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork+<FetchVisionModelsForClientOnReferenceDateAsync>d__11.MoveNext\r\nDHS.PlanCare.Core.Extensions.IQueryableExtensions+<ExecuteAndThrowTaskCancelledWhenRequestedAsync>d__16`1.MoveNext\r\n",
false,
"2197, 6-12-2016 0:00:00, System.Threading.CancellationToken"
],
"EventName": "Duration",
"KeyWordsDescription": "Duration",
"PayloadSchema": [
"instanceSessionId",
"member",
"durationInMilliseconds",
"minimalStacktrace",
"hasFailed",
"parameters"
]
},
"Session": {
"SessionId": "0016e54b-6c4a-48bd-9813-39bb040f7736",
"EnvironmentId": "C15E535B8D0BD9EF63E39045F1859C98FEDD47F2",
"OrganisationId": "AC6752D4-883D-42EE-9FEA-F9AE26978E54"
}
}
How can I create a u-sql query that outputs
Id,
Timestamp,
EventEntry.EventId and
EventEntry.Payload[2] (value 25 in the example above)?
I can't figure out how to extend my query:
@extract =
EXTRACT
Timestamp DateTime
FROM @"wasb://xxx/2016/12/06/0016e54b-6c4a-48bd-9813-39bb040f7736/yyy/{*}/{*}.json"
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
@res =
SELECT Timestamp
FROM @extract;
OUTPUT @res TO "/output/result.csv" USING Outputters.Csv();
I have seen some examples, such as:
U-SQL - Extract data from json-array => this only queries one level of the document, and I need data from multiple levels.
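You might want to look at this Git example: https://github.com/Azure/usql/blob/master/Examples/JsonSample/JsonSample/NestedJsonParsing.usql
It takes 2 different data elements and combines them, much like your Payload and PayloadSchema. If you create key-value pairs the way the "Donut" or "Cake and Batter" examples do, you can match the schema to the payload and use CROSS APPLY EXPLODE.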
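To make the "match the schema to the payload" idea concrete, here is a minimal sketch of the pairing logic in Python rather than U-SQL. It is only an illustration of the idea, not the code from the linked sample. The helper name `payload_as_pairs` is hypothetical, and the long strings from the question's Payload are replaced with short placeholders.

```python
import json

# Trimmed copy of the EventEntry object from the question.
# The long member/stacktrace/parameter strings are shortened placeholders.
event_entry_json = """
{
  "EventId": 1,
  "Payload": ["1a3e0c9e-ef69-4c6a-ac8c-9b2de2fbc701", "member-placeholder", 25,
              "stacktrace-placeholder", false, "parameters-placeholder"],
  "PayloadSchema": ["instanceSessionId", "member", "durationInMilliseconds",
                    "minimalStacktrace", "hasFailed", "parameters"]
}
"""


def payload_as_pairs(event_entry):
    """Pair each PayloadSchema name with the Payload value at the same index."""
    return dict(zip(event_entry["PayloadSchema"], event_entry["Payload"]))


event_entry = json.loads(event_entry_json)
pairs = payload_as_pairs(event_entry)
print(pairs["durationInMilliseconds"])  # prints 25
```

With the pairs in hand, a value can be looked up by its schema name instead of its position, which is what exploding the key-value pairs into rows would give you on the U-SQL side.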
JsonTuple supports multiple JSONPaths in one go.
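// Assumes both assemblies are registered in the current database under these names.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
// Pull the top-level fields; EventEntry comes back as a JSON string.
@extract =
EXTRACT
Id string,
Timestamp DateTime,
EventEntry string
FROM @"..."
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();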
@res =
SELECT Id, Timestamp, EventEntry,
Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(EventEntry,
"EventId", "Payload[2]") AS Event
FROM @extract;
@res =
SELECT Id,
Timestamp,
Event["EventId"] AS EventId,
Event["Payload[2]"] AS Something
FROM @res;
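To show what the two lookups above amount to, here is a rough Python stand-in for the path lookup. It is a sketch under assumptions, not the sample library's implementation: the function `json_tuple` is hypothetical, it only handles simple paths such as `EventId` or `Payload[2]`, and it returns every value as a string because, as far as I understand it, JsonTuple yields a string-to-string map.

```python
import json
import re

# Trimmed copy of the EventEntry object from the question (strings shortened).
event_entry_json = """
{
  "EventId": 1,
  "Payload": ["1a3e0c9e-ef69-4c6a-ac8c-9b2de2fbc701", "member-placeholder", 25],
  "EventName": "Duration"
}
"""


def json_tuple(json_text, *paths):
    """Hypothetical stand-in: map each simple path to its value as a string."""
    root = json.loads(json_text)
    result = {}
    for path in paths:
        node = root
        # Split e.g. "Payload[2]" into a property name and an array index.
        for name, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", path):
            node = node[name] if name else node[int(index)]
        # Keep strings as they are; serialise everything else to text.
        result[path] = node if isinstance(node, str) else json.dumps(node)
    return result


event = json_tuple(event_entry_json, "EventId", "Payload[2]")
print(event["EventId"], event["Payload[2]"])  # prints: 1 25
```

If the values do come back as strings in U-SQL as assumed here, a numeric column would need an explicit conversion in the final SELECT.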