Partial loading of data (columns) from a Parquet file into a relational table

I have a table with a column of data type VARIANT that contains the contents of a Parquet file.

I have a relational table with the following format:

> CREATE TABLE IF NOT EXISTS covid_data_relational
> (
>   id int identity(1,1),
>  date_dt DATE,
>  state string,
>  value int,
>  population_percent float,
>  change_from_prior_day int,
>  seven_day_change_percent float
>  );

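Inserting data into this table only populates the date and state columns. None of the columns that come from keys of the form cases.* are populated.

The query is below: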

    insert into covid_data_relational (date_dt, state, value, population_percent, change_from_prior_day, seven_day_change_percent)
    select covid_data_raw:date::date as date_dt,
           covid_data_raw:state::string as state,
           covid_data_raw:cases.value::int as value,
           covid_data_raw:cases.calculated.population_percent::float as population_percent,
           covid_data_raw:cases.calculated.change_from_prior_day::int as change_from_prior_day,
           covid_data_raw:cases.calculated.seven_day_change_percent::float as seven_day_change_percent
    from covid_data_parquet;

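Any help is appreciated! Thanks in advance.

It's better to post sample data as text rather than as an image, so people can copy and paste the sample for testing. As a result, this is untested, but it should work.

Look closely at the cases.calculated.change_from_prior_day key. It is not a key nested under a cases parent object. It is a single hard-coded key made up of one long string that happens to contain dots.

To extract it, you need to specify that the key is the entire string, dots included, by wrapping it in double quotes: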
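One way to check this, assuming the raw VARIANT column is covid_data_raw in covid_data_parquet as in the question, is to list the top-level keys of a single row with OBJECT_KEYS. This is an untested sketch; the interpretation in the comments is what I would expect given the symptoms, not something confirmed against your data.

```sql
-- List the top-level keys of one raw row.
-- If the result contains entries such as "cases.value" instead of a single
-- "cases" key, the keys are flat strings containing dots, not nested objects.
select object_keys(covid_data_raw) as top_level_keys
from covid_data_parquet
limit 1;
```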

    insert into covid_data_relational (date_dt, state, value, population_percent, change_from_prior_day, seven_day_change_percent)
    select covid_data_raw:date::date as date_dt,
           covid_data_raw:state::string as state,
           covid_data_raw:"cases.value"::int as value,
           covid_data_raw:"cases.calculated.population_percent"::float as population_percent,
           covid_data_raw:"cases.calculated.change_from_prior_day"::int as change_from_prior_day,
           covid_data_raw:"cases.calculated.seven_day_change_percent"::float as seven_day_change_percent
    from covid_data_parquet;
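If you want to see the difference between the two notations in isolation, here is a small sketch that does not touch your tables. The one-key object and the value 10 are made up for illustration; the only assumption is that your Parquet rows have flat keys of the same shape.

```sql
-- Build a one-row VARIANT whose only key is the literal string "cases.value".
select
    v:"cases.value"::int as quoted_key,   -- key is the whole string: returns 10
    v:cases.value::int   as dotted_path   -- looks for "value" inside "cases": returns NULL
from (
    select parse_json('{"cases.value": 10}') as v
) t;
```

The unquoted form is read as a path (an element named value inside an element named cases), which does not exist here, so it returns NULL without raising an error. That matches the empty columns you are seeing.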