Athena - How to query by nested json value?

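In Athena, we are trying to get all fruits that have seeds. The problem is that the seeds true/false value is nested inside the JSON, and we can't seem to filter on it in the WHERE clause. We can only see the value when we SELECT it from the nested JSON path as hasSeeds, but that returns every row, both true and false, rather than only the rows where it is true.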

-- This returns all seeds, but we want to filter in the WHERE clause by hasSeeds true
SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit'

-- 3 attempts to filter by nested json value seeds true (not working)
SELECT * FROM
(SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit')
WHERE hasSeeds = true

SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit' and hasSeeds = true

SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit' and json_extract(payload,'$.food.info.seeds') = true

Any idea how we can get the query to work as expected, so that it filters on hasSeeds = true in the nested JSON?

Updated with the JSON structure:

{
  "foodId": 1,
  "foodType": "fruit",
  "payload": {
    "food": {
      "name": "apple",
      "info": {
        "seeds": true,
        "calories": 95
      }
    }
  }
}

{
  "foodId": 2,
  "foodType": "fruit",
  "payload": {
    "food": {
      "name": "banana"
    }
  }
}

{
  "foodId": 3,
  "foodType": "vegetable",
  "payload": {
  }
}

{
  "foodId": 4,
  "foodType": "fruit",
  "payload": {
  }
}

We tried this query, but it returns incorrect results:

WITH dataset(jsn) AS (
    values (JSON '{"payload":{"food":{"info":{"seeds":true}}}}')
)

SELECT foodId, foodType, json_extract(payload,'$.food.info.seeds') AS hasSeeds, jsn
FROM "food_table", dataset
WHERE cast(json_extract_scalar(jsn, '$.payload.food.info.seeds') AS BOOLEAN) = true

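Basically, hasSeeds comes back as true where it should, which is correct, but other foods, such as those missing the children of payload, are still returned along with the jsn field, instead of the rows being filtered by the WHERE clause condition payload.food.info.seeds = true.

Any idea how to solve this?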

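Assuming you have the following JSON structure, you need to use json_extract_scalar. json_extract returns a value of type JSON, which cannot be compared directly to a boolean, whereas json_extract_scalar returns the value as a varchar that can be cast to BOOLEAN: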

WITH dataset(jsn) AS (
    values (JSON '{"food":{"info":{"seeds":true}}}')
)

SELECT *
FROM dataset
WHERE cast(json_extract_scalar(jsn, '$.food.info.seeds') AS BOOLEAN) = true
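To check the filter against the sample documents from the question, the rows can be inlined with VALUES. This is only a minimal sketch: it assumes the payload column holds the object stored under "payload" (so the path starts at $.food, as in the question's queries), and the dataset CTE with its column list is an illustrative stand-in for food_table.

```sql
-- Hypothetical inline copy of the four sample rows from the question.
WITH dataset(foodId, foodType, payload) AS (
    VALUES
        (1, 'fruit',     JSON '{"food":{"name":"apple","info":{"seeds":true,"calories":95}}}'),
        (2, 'fruit',     JSON '{"food":{"name":"banana"}}'),
        (3, 'vegetable', JSON '{}'),
        (4, 'fruit',     JSON '{}')
)
SELECT foodId, foodType
FROM dataset
WHERE foodType = 'fruit'
  -- A missing path yields NULL, and NULL = true is not true, so the row is dropped.
  AND cast(json_extract_scalar(payload, '$.food.info.seeds') AS BOOLEAN) = true
```

Among these rows only the apple (foodId 1) has seeds set to true, so it is the only row the filter is expected to keep. The banana and the rows with an empty payload have no value at that path and are filtered out.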

UPD

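Based on the provided JSON, the WHERE clause should look like this (try returns NULL instead of raising an error if the value cannot be cast to BOOLEAN):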

-- The path is relative to the payload column, as in the question's queries
WHERE try(cast(json_extract_scalar(payload, '$.food.info.seeds') AS BOOLEAN)) = true
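Put together with the rest of the question's query, the whole statement might look like the sketch below. It assumes that payload is a column of food_table containing the object stored under "payload", which is why the path starts at $.food as in the question's own SELECT; if the column instead holds the entire document, the path would need the $.payload prefix.

```sql
SELECT foodId,
       foodType,
       json_extract_scalar(payload, '$.food.info.seeds') AS hasSeeds
FROM "food_table"
WHERE foodType = 'fruit'
  -- Repeat the expression here: the hasSeeds alias is not visible in WHERE.
  AND try(cast(json_extract_scalar(payload, '$.food.info.seeds') AS BOOLEAN)) = true
```

Note that this filters on the table's own payload column. The earlier attempt joined a one-row dataset CTE and filtered on its jsn value, which is true for every joined row, so no rows from food_table were excluded.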