U-SQL - Extract data from complex json object

I have a lot of JSON files structured like this:

{
    "Id": "2551faee-20e5-41e4-a7e6-57bd20b02a22",
    "Timestamp": "2016-12-06T08:09:57.5541438+01:00",
    "EventEntry": {
        "EventId": 1,
        "Payload": [
            "1a3e0c9e-ef69-4c6a-ac8c-9b2de2fbc701",
            "DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork.FetchVisionModelsForClientOnReferenceDateAsync(System.Int64 clientId, System.DateTime referenceDate, System.Threading.CancellationToken cancellationToken)",
            25,
            "DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork+<FetchVisionModelsForClientOnReferenceDateAsync>d__11.MoveNext\r\nDHS.PlanCare.Core.Extensions.IQueryableExtensions+<ExecuteAndThrowTaskCancelledWhenRequestedAsync>d__16`1.MoveNext\r\n",
            false,
            "2197, 6-12-2016 0:00:00, System.Threading.CancellationToken"
        ],
        "EventName": "Duration",
        "KeyWordsDescription": "Duration",
        "PayloadSchema": [
            "instanceSessionId",
            "member",
            "durationInMilliseconds",
            "minimalStacktrace",
            "hasFailed",
            "parameters"
        ]
    },
    "Session": {
        "SessionId": "0016e54b-6c4a-48bd-9813-39bb040f7736",
        "EnvironmentId": "C15E535B8D0BD9EF63E39045F1859C98FEDD47F2",
        "OrganisationId": "AC6752D4-883D-42EE-9FEA-F9AE26978E54"
    }
}

How can I create a U-SQL query that outputs the following?

Id,
Timestamp,
EventEntry.EventId and
EventEntry.Payload[2] (value 25 in the example above)

I can't figure out how to extend my query:

@extract =
     EXTRACT 
         Timestamp DateTime
     FROM @"wasb://xxx/2016/12/06/0016e54b-6c4a-48bd-9813-39bb040f7736/yyy/{*}/{*}.json"
     USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@res =
    SELECT Timestamp
    FROM @extract;

OUTPUT @res TO "/output/result.csv" USING Outputters.Csv(); 

I've seen some examples, such as:

U-SQL - Extract data from json-array => this only queries one level of the document; I need data from multiple levels.

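You might want to look at this GitHub example: https://github.com/Azure/usql/blob/master/Examples/JsonSample/JsonSample/NestedJsonParsing.usql

It takes 2 disparate data elements and combines them, much like your Payload and PayloadSchema. If you create key-value pairs the way the "Donut" or "Cake and Batter" examples do, you can match the schema to the payload and use the CROSS APPLY EXPLODE function.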
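As a rough, untested sketch of that pairing idea (not taken from the linked sample): explode PayloadSchema and Payload into one row per element, then join them on their array position. It assumes both sample assemblies are already registered, and that JsonTuple keys each match by its token path (for example "Payload[2]"), which is consistent with the Event["Payload[2]"] lookup used elsewhere in this thread. The rowset names, column names, and output path are made up for illustration.

```usql
// Assumption: both assemblies are already registered in the ADLA database.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

// The nested EventEntry object is read as a JSON string.
@extract =
    EXTRACT Id string,
            EventEntry string
    FROM @"..."
    USING new JsonExtractor();

// One row per schema entry. Assumption: keys look like "PayloadSchema[2]",
// so stripping the prefix leaves the position marker "[2]".
@schema =
    SELECT Id,
           SchemaKey.Substring("PayloadSchema".Length) AS Pos,
           SchemaValue AS FieldName
    FROM @extract
         CROSS APPLY
         EXPLODE(JsonFunctions.JsonTuple(EventEntry, "PayloadSchema[*]")) AS s(SchemaKey, SchemaValue);

// One row per payload entry, keyed by the same position marker.
@payload =
    SELECT Id,
           PayloadKey.Substring("Payload".Length) AS Pos,
           PayloadValue AS FieldValue
    FROM @extract
         CROSS APPLY
         EXPLODE(JsonFunctions.JsonTuple(EventEntry, "Payload[*]")) AS p(PayloadKey, PayloadValue);

// Match each schema name to the payload value at the same position.
@pairs =
    SELECT s.Id,
           s.FieldName,
           p.FieldValue
    FROM @schema AS s
         INNER JOIN @payload AS p
         ON s.Id == p.Id AND s.Pos == p.Pos;

// Hypothetical output path.
OUTPUT @pairs TO "/output/payload_pairs.csv" USING Outputters.Csv();
```

With the rows in this shape, a filter such as WHERE FieldName == "durationInMilliseconds" picks the value by name instead of by index, which should keep working if the payload order ever changes.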

JsonTuple supports multiple JSONPaths in one go.

@extract =
     EXTRACT
         Id String,
         Timestamp DateTime,
         EventEntry String
     FROM @"..."
     USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@res =
    SELECT Id, Timestamp, EventEntry,
    Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(EventEntry,
        "EventId", "Payload[2]") AS Event
    FROM @extract;

@res =
    SELECT Id,
    Timestamp,
    Event["EventId"] AS EventId,
    Event["Payload[2]"] AS Something
    FROM @res;
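
For completeness, here is a sketch of how the pieces above might fit into one script, assuming the Newtonsoft.Json and Microsoft.Analytics.Samples.Formats assemblies are already registered in your database. JsonTuple returns its values as strings, so the two numeric fields are parsed with Int32.Parse. The @events rowset name and the DurationInMilliseconds column name (borrowed from PayloadSchema) are my own choices; the output path is the one from the question.

```usql
// Assumption: both assemblies are already registered in the ADLA database.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

// Top-level fields are read directly; the nested EventEntry object
// is read as a JSON string so it can be picked apart afterwards.
@extract =
    EXTRACT Id string,
            Timestamp DateTime,
            EventEntry string
    FROM @"..."
    USING new JsonExtractor();

// Pull both nested values out of EventEntry in a single JsonTuple call.
@events =
    SELECT Id,
           Timestamp,
           JsonFunctions.JsonTuple(EventEntry, "EventId", "Payload[2]") AS Event
    FROM @extract;

// The map values are strings, so convert the numeric ones explicitly.
@res =
    SELECT Id,
           Timestamp,
           Int32.Parse(Event["EventId"]) AS EventId,
           Int32.Parse(Event["Payload[2]"]) AS DurationInMilliseconds
    FROM @events;

OUTPUT @res TO "/output/result.csv" USING Outputters.Csv();
```

If some files might lack EventId or Payload[2], Int32.Parse will fail on the missing value, so you may want a null check or a nullable column type there.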