Presto filter json array

How can I filter a JSON array in Presto?

WITH dataset AS (
  SELECT * FROM (VALUES
    (JSON '{"turnovers": [{"purpose": "Bla Bla", "amount": 3, "tag": "E"}, 
                          {"purpose": "Blub", "amount": 3, "tag": "F"}]}'),
    (JSON '{"turnovers": [{"purpose": "Palim", "amount": 3, "tag": "E"}, 
                          {"purpose": "Palim Palim", "amount": 3, "tag": "E"}]}')
  ) AS t (snapshot)
)
SELECT 
    json_extract(snapshot, '$.turnovers')
FROM 
    dataset

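I only want the turnovers with tag E, not all of them. In this example, the one turnover tagged F should be excluded.

Is this possible?

I was hoping to use something like this, but it does not work: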

WITH dataset AS (
  SELECT * FROM (VALUES
    (JSON '{"turnovers": [{"purpose": "Bla Bla", "amount": 3, "tag": "E"}, 
                          {"purpose": "Blub", "amount": 3, "tag": "F"}]}'),
    (JSON '{"turnovers": [{"purpose": "Palim", "amount": 3, "tag": "E"}, 
                          {"purpose": "Palim Palim", "amount": 3, "tag": "E"}]}')
  ) AS t (snapshot)
)
SELECT 
    filter(json_extract(snapshot, '$.turnovers'), x -> json_extract_scalar(x, '$.tag')='E')
FROM 
    dataset

You can try a JSON path with a filter expression, json_extract(snapshot, '$.turnovers[?(@.tag == "E")]'), but if it fails for you as it did for me, cast the data to an array of rows and filter that array:

WITH dataset AS (
  SELECT * FROM (VALUES
    (JSON '{"turnovers": [{"purpose": "Bla Bla", "amount": 3, "tag": "E"},
                          {"purpose": "Blub", "amount": 3, "tag": "F"}]}'),
    (JSON '{"turnovers": [{"purpose": "Palim", "amount": 3, "tag": "E"},
                          {"purpose": "Palim Palim", "amount": 3, "tag": "E"}]}')
  ) AS t (snapshot)
)
SELECT
    filter(CAST(json_extract(snapshot, '$.turnovers') as ARRAY(ROW(purpose VARCHAR, amount INTEGER, tag VARCHAR))), x -> x.tag = 'E')
FROM
    dataset

If needed, you can cast the result back to JSON after filtering.
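As a minimal sketch of that last step, the query below wraps the same filter in an outer CAST(... AS JSON). The column alias filtered_turnovers is only an illustrative name, and the dataset is trimmed to one row for brevity. Note that, depending on the engine and version, a ROW cast to JSON may come back as a JSON array of values rather than an object with field names, so check the output on your own setup.

```sql
WITH dataset AS (
  SELECT * FROM (VALUES
    (JSON '{"turnovers": [{"purpose": "Bla Bla", "amount": 3, "tag": "E"},
                          {"purpose": "Blub", "amount": 3, "tag": "F"}]}')
  ) AS t (snapshot)
)
SELECT
    -- outer cast: turn the filtered array of rows back into JSON
    CAST(
        filter(
            -- inner cast: JSON array -> typed array of rows so the lambda can use x.tag
            CAST(json_extract(snapshot, '$.turnovers')
                 AS ARRAY(ROW(purpose VARCHAR, amount INTEGER, tag VARCHAR))),
            x -> x.tag = 'E'
        ) AS JSON
    ) AS filtered_turnovers  -- illustrative alias, not from the original answer
FROM
    dataset
```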

This worked for me. ARRAY(MAP(VARCHAR, JSON)) is quite flexible and also allows nested JSON inside the JSON array.

WITH dataset AS (
  SELECT * FROM (VALUES
    (JSON '{"turnovers": [{"purpose": "Bla Bla", "amount": 3, "tag": "E"},
                          {"purpose": "Blub", "amount": 3, "tag": "F"}]}'),
    (JSON '{"turnovers": [{"purpose": "Palim", "amount": 3, "tag": "E"},
                          {"purpose": "Palim Palim", "amount": 3, "tag": "E"}]}')
  ) AS t (snapshot)
)
SELECT
    CAST(filter(CAST(json_extract(snapshot, '$.turnovers') AS ARRAY(MAP(VARCHAR, JSON))), x -> json_format(x['tag']) = '"E"') AS JSON)
FROM
    dataset
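
A possible variation on the same idea, offered as a sketch rather than something tested in this thread: casting to ARRAY(JSON) keeps each element as opaque JSON, so the lambda from the question (json_extract_scalar on '$.tag') can be reused almost unchanged. The alias filtered_turnovers is an illustrative name, and the dataset is trimmed to one row.

```sql
WITH dataset AS (
  SELECT * FROM (VALUES
    (JSON '{"turnovers": [{"purpose": "Bla Bla", "amount": 3, "tag": "E"},
                          {"purpose": "Blub", "amount": 3, "tag": "F"}]}')
  ) AS t (snapshot)
)
SELECT
    CAST(
        filter(
            -- filter() needs an ARRAY, so cast the extracted JSON first
            CAST(json_extract(snapshot, '$.turnovers') AS ARRAY(JSON)),
            -- each x is a JSON value; pull the tag out as VARCHAR and compare
            x -> json_extract_scalar(x, '$.tag') = 'E'
        ) AS JSON
    ) AS filtered_turnovers  -- illustrative alias
FROM
    dataset
```

This would also suggest why the attempt in the question fails: filter expects an ARRAY as its first argument, while json_extract returns a value of type JSON.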