U-SQL: Extract data from JSON that contains an array

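From each JSON record I want to get one row per entry in the users array, where every row carries the record's PartnerId together with that user's name. I am currently trying this code: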

@jsonFile =
    EXTRACT partnerId int,
            users string
    FROM @INPUT_FILE
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@followingUsersArray =
    SELECT partnerId,
           Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(users) AS following_array
    FROM @jsonFile;

@followingUsers =
    SELECT partnerId AS PartnerId,
           following_array["name"] AS FriendName
    FROM @followingUsersArray;

But I don't get any results. Here is my sample JSON file:

{
    "partnerId": 2,
    "users": [{
            "name": "Anna ROGOWSKA",
            "profile_image_url": "http://pbs.twimg.com/profile_images/884844399338901504/0OYl8JA6_normal.jpg",
            "created_at": "2012-09-30T19:52:15+02:00",
            "location": "Sopot,Poland",
            "id_str": "855093368"
        },
        {
            "name": "Anna BARAŃSKA",
            "profile_image_url": "http://pbs.twimg.com/profile_images/884844399338901504/0OYl8JA6_normal.jpg",
            "created_at": "2012-09-30T19:52:15+02:00",
            "location": "Sopot,Poland",
            "id_str": "855093368"
        }
    ]
}

The result I want is:

2,"Anna ROGOWSKA"
2,"Anna BARAŃSKA"

You should use U-SQL's CROSS APPLY EXPLODE, which turns each element of an array into its own row.

I tested the following with your JSON file and it works:

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

DECLARE @path string = @"C:\Users\testUser\Documents\Visual Studio 2015\Projects\USQL_Json\";
DECLARE @input string = @path + @"sample.json";
DECLARE @to string = @path + @"output.csv";

@jsonFile =
    EXTRACT partnerId int,
            users string
    FROM @input
    USING new JsonExtractor();

// Split the users JSON array into a collection with one JSON string per user.
@followingUsers =
    SELECT partnerId AS PartnerId,
           JsonFunctions.JsonTuple(users).Values AS user_array
    FROM @jsonFile;

// Emit one row per user, then parse each user object to read its name.
@tabUsers =
    SELECT PartnerId,
           JsonFunctions.JsonTuple(t_user)["name"] AS FriendName
    FROM @followingUsers
         CROSS APPLY
             EXPLODE(user_array) AS A(t_user);


OUTPUT @tabUsers
TO @to
USING Outputters.Csv();

The output is:

2,"Anna ROGOWSKA"
2,"Anna BARANSKA"
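
If it helps to see what the EXPLODE step is doing, here is a rough sketch of the same flattening in Python (not U-SQL, and not part of the tested script above). It assumes the record has the shape of the sample file; the sample is trimmed to the fields used here, and the names explode_users and to_csv_lines are made up for this illustration. The point is that "name" lives on each array element, not on the array as a whole, which is presumably why indexing the unexploded array with "name" returned nothing in the question.

```python
import json

# Same shape as the sample file in the question, trimmed to the fields used here.
SAMPLE = """
{
    "partnerId": 2,
    "users": [
        {"name": "Anna ROGOWSKA", "id_str": "855093368"},
        {"name": "Anna BARAŃSKA", "id_str": "855093368"}
    ]
}
"""


def explode_users(record_text):
    """Return one (partnerId, name) row per element of the users array.

    This mirrors the idea of CROSS APPLY EXPLODE: the outer value
    (partnerId) is repeated for every element of the inner array.
    """
    record = json.loads(record_text)
    rows = []
    for user in record["users"]:
        # "name" is looked up on the single user object, not on the array.
        rows.append((record["partnerId"], user["name"]))
    return rows


def to_csv_lines(rows):
    """Format rows as id,"name" lines, quoting the string column."""
    return ['{},"{}"'.format(partner_id, name) for partner_id, name in rows]


rows = explode_users(SAMPLE)
csv_lines = to_csv_lines(rows)
```

Printing each entry of csv_lines gives one line per user, each starting with the shared partnerId, which is the same row layout the U-SQL script writes to output.csv.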