Athena/Presto data discovery query to recommend JSON schema?
I have an Athena table (raw) with just one column (json).
I have the following query, which outputs the frequency of each JSON key:
SELECT key, count(*)
FROM (
SELECT map_keys(cast(json_parse(json) AS map(varchar, json))) AS keys
FROM raw
)
CROSS JOIN UNNEST (keys) AS t (key)
GROUP BY key
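How can I extend this query so that it tells me whether a particular key has any values that contain non-numeric characters?
[Failed attempts removed after finding the answer]
This works: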
WITH dataset AS (
    SELECT cast(json_parse(json) AS map(varchar, varchar)) AS kv
    FROM raw
)
SELECT k,
       count(*) AS isPresent,
       sum(isNumber) AS isNumber,
       count(*) - sum(isNumber) AS notIsNumber
FROM (
    -- TRY(...) yields NULL when the cast fails, so non-numeric values score 0
    SELECT t.k, t.v,
           IF(TRY(cast(t.v AS double)) IS NULL, 0, 1) AS isNumber
    FROM dataset
    CROSS JOIN UNNEST(kv) AS t (k, v)
)
GROUP BY k
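The query above treats a value as numeric when it casts to double, so values such as 1.5, -3, or 1e5 count as numbers even though they contain characters other than digits. If the goal is the literal reading of the question (does the value contain any character outside 0-9?), a regular expression may be closer. The following is only a sketch under that assumption; the column aliases is_present, has_non_digit, and digits_only are made up for illustration, and it reuses the same raw table and json column:

```sql
-- Sketch: per key, count values that contain at least one non-digit character.
-- Assumes the same table (raw) and column (json) as the queries above.
WITH dataset AS (
    SELECT cast(json_parse(json) AS map(varchar, varchar)) AS kv
    FROM raw
)
SELECT t.k,
       count(*) AS is_present,
       -- regexp_like is true when the pattern matches anywhere in the string
       count_if(regexp_like(t.v, '[^0-9]')) AS has_non_digit,
       count_if(NOT regexp_like(t.v, '[^0-9]')) AS digits_only
FROM dataset
CROSS JOIN UNNEST(kv) AS t (k, v)
GROUP BY t.k
```

Note that an empty string lands in digits_only (it contains no non-digit character), and a NULL value is counted in is_present but in neither of the other two columns, because count_if ignores NULL conditions. Which of the two definitions of "numeric" is appropriate depends on the schema you intend to recommend.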