Querying variant data in Snowflake

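This is the variant source data I am using in my example. I want to write a query that parses this data from the variant column src into a Snowflake table.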

{
    "col1": bool,
    "col2": null,
    "col3": "datetime",
    "col4": int,
    "col5": "string",
    "col6": "string",
    "array": [
        {
            "x": bool,
            "y": null,
            "v": "datetime",
            "z": int,
           "w": "string",
            "q": "string",
            "obj": {
                    "a": "bool",
                     "b": "float"
                   },
    "col7": "datetime"
}
]
}

-- This is what I tried

SELECT 

     src:col1::string as col1,
     src:col2::string as col2,
     src:col3::string as col3,
     src:col4::string as col4,
     src:col5::string as col5,
     src:col6::string as col6,

     s.value:x::string as S_x,
     s.value:y::string as s_y,
     s.value:v::string as s_v,
     s.value:z::string as s_z,
     s.value:w::string as s_w,
     s.value:q::string as s_q,

     s.value:obj.value:a::string as s_obj_a,
     s.value:obj.value:b::string as s_obj_b,

     src:col7::string as col7 
FROM tblvariant
    , table(flatten(src:s)) s
    ;

Everything works except that the two columns (a, b) come back NULL, when they should contain data. Any suggestions? Thanks a lot!

Your sample JSON does not match your SQL. Where are "stages" and "metadata"? In any case, the problem seems to be the extra "value" keyword.

create or replace table tblvariant ( src variant )
as select parse_json (' 
{
    "col1": "bool",
    "col2": null,
    "col3": "datetime",
    "col4": "int",
    "col5": "string",
    "col6": "string",
    "stages": [
        {
            "x": "bool",
            "y": null,
            "v": "datetime",
            "z": "int",
           "w": "string",
            "q": "string",
            "obj": {
                    "a": "bool",
                     "b": "float"
                   },
    "col7": "datetime"
}
]
}' );

As you can see, I modified your sample JSON and renamed "array" to "stages" (to match your SQL). This SQL retrieves the values of a and b:

SELECT 
     src:col1::string as col1,
     src:col2::string as col2,
     src:col3::string as col3,
     src:col4::string as col4,
     src:col5::string as col5,
     src:col6::string as col6,
     s.value:x::string as S_x,
     s.value:y::string as s_y,
     s.value:v::string as s_v,
     s.value:z::string as s_z,
     s.value:w::string as s_w,
     s.value:q::string as s_q,

     s.value:obj.a::string as s_obj_a,
     s.value:obj.b::string as s_obj_b,

     src:col7::string as col7 
FROM tblvariant
   , table(flatten(src:stages)) s
   -- , table(flatten(s.value:metadata)) m
    ;
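
To see why the extra "value" matters, here is a minimal standalone sketch. It assumes only an inline JSON literal shaped like the obj field above; the CTE name t and the column name v are made up for illustration.

```sql
-- Standalone check; no table is needed.
WITH t AS (
    SELECT parse_json('{"obj": {"a": "bool", "b": "float"}}') AS v
)
SELECT
    -- Path obj -> value -> a: obj has no key named "value", so this is NULL.
    v:obj.value:a::string AS with_extra_value,
    -- Path obj -> a: returns the stored string.
    v:obj.a::string       AS without_value
FROM t;
```

In the flattened query, s.value is the VALUE column produced by FLATTEN, so it is needed exactly once, at the start of the path. Any later .value is treated as an ordinary key lookup inside the JSON.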

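The problem is the extra .value in these two lines:

s.value:obj.value:a::string as s_obj_a,

s.value:obj.value:b::string as s_obj_b,

Here .value is read as a key named "value" inside obj. No such key exists, so the path resolves to NULL. You can use dot (.) notation to access the keys of an object. You do not need another GET_PATH (:) operator to reach these fields: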

s.value:metadata.a::string as s_m_a,
s.value:metadata.b::string as s_m_b,

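You also do not need to run a second FLATTEN on the metadata object inside the stages array unless you really need one row per metadata key, assuming metadata is an object and not a nested array. If you only want to pull the values up to the same level as each array row, the expressions above are enough.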
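
For comparison, here is a rough sketch of what a second FLATTEN would give you. It assumes the tblvariant table created earlier and uses the obj field from that sample in place of metadata; the aliases m, stage_index, obj_key and obj_value are illustrative.

```sql
-- One output row per key of the nested object, per array element.
SELECT
    s.index         AS stage_index,  -- position of the element in the stages array
    m.key           AS obj_key,      -- key name inside obj, e.g. a or b
    m.value::string AS obj_value     -- value stored under that key
FROM tblvariant
    , table(flatten(src:stages)) s
    , table(flatten(s.value:obj)) m
;
```

This shape is useful when the set of keys is not known in advance. When the keys are fixed, as with a and b here, the direct path expressions keep everything on a single row per array element.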