Azure Synapse Analytics JSON Flatten

I'm new to Azure Synapse and currently have the following problem:

I receive a JSON document that looks like this:

{
    "2022-02-01": [
        {
            "shiftId": ,
            "employeeId": ,
            "duration": ""
        },
        {
            "shiftId": ,
            "employeeId": ,
            "duration": ""
        }
    ],
    "2022-02-02": [
        {
            "shiftId": ,
            "employeeId": ,
            "duration": ""
        }
    ],
    "2022-02-03": [
        {
            "shiftId": ,
            "employeeId": ,
            "duration": ""
        },
        {
            "shiftId": ,
            "employeeId": ,
            "duration": ""
        }
    ],
    "2022-02-4": []
}

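Now I want to transform it so that I can see it in a view. I have already tried a data flow that reads it as an array of documents, but I get an error:

"Malformed records are detected in schema inference. Parse Mode: FAILFAST"

I would like something like this: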

date       | shiftId | employeeId | duration
___________|_________|____________|_________
2022-02-01 | 1234    | 345345     | 420
2022-02-01 | 2345    | 345345     | 124
2022-02-02 | 5345    | 123567     | 424
2022-02-03 | 5675    | 987542     | 123
2022-02-03 | 9456    | 234466     | 754

Azure Synapse Analytics dedicated SQL pools are actually quite good with JSON: they support OPENJSON and JSON_VALUE, so you could use a stored procedure that takes the JSON as a parameter. A simple example:

SELECT
    k.[key] AS [shiftDate],
    JSON_VALUE( d.[value], '$.shiftId' ) shiftId,
    JSON_VALUE( d.[value], '$.employeeId' ) employeeId,
    JSON_VALUE( d.[value], '$.duration' ) duration
FROM OPENJSON( @json, '$' ) k
    CROSS APPLY OPENJSON( k.value, '$' ) d;

Full code:

DECLARE @json NVARCHAR(MAX) = '{
    "2022-02-01": [
        {
            "shiftId": 1234,
            "employeeId": 345345,
            "duration": 420
        },
        {
            "shiftId": 2345,
            "employeeId": 345345,
            "duration": 124
        }
    ],
    "2022-02-02": [
        {
            "shiftId": 5345,
            "employeeId": 123567,
            "duration": 424
        }
    ],
    "2022-02-03": [
        {
            "shiftId": 5675,
            "employeeId": 987542,
            "duration": 123
        },
        {
            "shiftId": 9456,
            "employeeId": 234466,
            "duration": 754
        }
    ]
}';

SELECT
    k.[key] AS [shiftDate],
    JSON_VALUE( d.[value], '$.shiftId' ) shiftId,
    JSON_VALUE( d.[value], '$.employeeId' ) employeeId,
    JSON_VALUE( d.[value], '$.duration' ) duration
FROM OPENJSON( @json, '$' ) k
    CROSS APPLY OPENJSON( k.value, '$' ) d;

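My results:

shiftDate  | shiftId | employeeId | duration
___________|_________|____________|_________
2022-02-01 | 1234    | 345345     | 420
2022-02-01 | 2345    | 345345     | 124
2022-02-02 | 5345    | 123567     | 424
2022-02-03 | 5675    | 987542     | 123
2022-02-03 | 9456    | 234466     | 754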

If you want something more dynamic, you could use a Synapse Notebook or a Mapping Data Flow.
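
The stored procedure mentioned above is not shown in the answer. Below is a minimal sketch of what it might look like. The procedure name dbo.usp_FlattenShifts and the INT column types are assumptions for illustration, not part of the original answer. It uses the WITH clause of OPENJSON in place of the repeated JSON_VALUE calls, so the columns come back typed rather than as NVARCHAR.

```sql
-- Hypothetical procedure name; adjust the schema and name to your own conventions.
CREATE PROCEDURE dbo.usp_FlattenShifts
    @json NVARCHAR(MAX)
AS
BEGIN
    SELECT
        k.[key] AS shiftDate,   -- top-level property name, e.g. '2022-02-01'
        d.shiftId,
        d.employeeId,
        d.duration
    FROM OPENJSON( @json ) k
        -- One output row per array element under each date key.
        CROSS APPLY OPENJSON( k.[value] )
            WITH (
                shiftId     INT '$.shiftId',     -- INT is an assumption about the data
                employeeId  INT '$.employeeId',
                duration    INT '$.duration'
            ) d;
END;
```

With @json declared as in the full code, it could be called as EXEC dbo.usp_FlattenShifts @json = @json;. Because CROSS APPLY is used, a date whose array is empty (such as "2022-02-4" in the question) produces no rows; OUTER APPLY would keep that date with NULLs in the other columns.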