Flatten a BigQuery table and copy the flattened data to a new BigQuery table
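I am new to BigQuery.
I have a table in which some columns are repeated records. I am trying to flatten the table so that it becomes relational, and then insert the flattened data into a new BigQuery table.
Is this possible? How should I do it?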
The following is for BigQuery Standard SQL.
Assume you have a simple table like the one below:
Row  id  repeated_record
---  --  ---------------
1    1   google
         facebook
         viant
2    2   dell
         hp
You can easily mimic it with the following query:
#standardSQL
WITH `table-with-repeated-record` AS (
SELECT 1 AS id, ['google', 'facebook', 'viant'] AS repeated_record UNION ALL
SELECT 2, ['dell', 'hp']
)
SELECT *
FROM `table-with-repeated-record`
Now, to flatten it, use the query below:
#standardSQL
WITH `table-with-repeated-record` AS (
SELECT 1 AS id, ['google', 'facebook', 'viant'] AS repeated_record UNION ALL
SELECT 2, ['dell', 'hp']
)
SELECT id, flatted_data
FROM `table-with-repeated-record`,
UNNEST(repeated_record) AS flatted_data
The result is as follows:
Row  id  flatted_data
---  --  ------------
1    1   google
2    1   facebook
3    1   viant
4    2   dell
5    2   hp
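One thing worth noting about the comma join used above: it behaves like a CROSS JOIN, so a row whose repeated_record is empty or NULL produces no output rows at all. If such rows need to be kept, a LEFT JOIN can be used instead. The sketch below assumes a hypothetical third row (id 3, with an empty array) added to the dummy data; it is not part of the original example.

```sql
#standardSQL
WITH `table-with-repeated-record` AS (
  SELECT 1 AS id, ['google', 'facebook', 'viant'] AS repeated_record UNION ALL
  SELECT 2, ['dell', 'hp'] UNION ALL
  -- Hypothetical row with an empty array, added only for illustration
  SELECT 3, ARRAY<STRING>[]
)
SELECT id, flatted_data
FROM `table-with-repeated-record`
-- LEFT JOIN keeps id 3 in the output, with flatted_data set to NULL
LEFT JOIN UNNEST(repeated_record) AS flatted_data
```

With the comma join, id 3 would simply be missing from the flattened output, which may or may not be what you want.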
Below is another example:
#standardSQL
WITH `table-with-repeated-record` AS (
SELECT 1 AS id, [STRUCT<line INT64, name STRING>(1, 'google'), (2, 'facebook'), (3, 'viant')] AS repeated_record UNION ALL
SELECT 2, [STRUCT<line INT64, name STRING>(5, 'dell'), (6, 'hp')]
)
SELECT *
FROM `table-with-repeated-record`
which mimics the table below:
Row  id  repeated_record.line  repeated_record.name
---  --  --------------------  --------------------
1    1   1                     google
         2                     facebook
         3                     viant
2    2   5                     dell
         6                     hp
The way to flatten it is:
#standardSQL
WITH `table-with-repeated-record` AS (
SELECT 1 AS id, [STRUCT<line INT64, name STRING>(1, 'google'), (2, 'facebook'), (3, 'viant')] AS repeated_record UNION ALL
SELECT 2, [STRUCT<line INT64, name STRING>(5, 'dell'), (6, 'hp')]
)
SELECT id, flatted_data.line, flatted_data.name
FROM `table-with-repeated-record`,
UNNEST(repeated_record) AS flatted_data
which ends up as:
Row  id  line  name
---  --  ----  ----
1    1   1     google
2    1   2     facebook
3    1   3     viant
4    2   5     dell
5    2   6     hp
Do you have any idea how to do it without specifying the data, such as ['google', 'facebook', 'viant']? The size of the table is not constant and changes from time to time, as does the data stored in it. The only thing I know for sure is the columns.
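You should just use the query below (without the dummy data, which was only there as an example so that you could play with the query):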
#standardSQL
SELECT id, flatted_data.line, flatted_data.name
FROM `yourProject.yourDataset.yourTable`,
UNNEST(repeated_record) AS flatted_data
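To cover the second half of the question (getting the flattened rows into a new table), one option is to wrap the same query in a CREATE TABLE ... AS SELECT statement. This is a minimal sketch, assuming the same column names as above; `yourProject.yourDataset.yourFlatTable` is a placeholder name for the new table, not something from the original post.

```sql
#standardSQL
-- `yourProject.yourDataset.yourFlatTable` is a placeholder for the destination table.
-- OR REPLACE overwrites the table if it already exists; drop it to fail instead.
CREATE OR REPLACE TABLE `yourProject.yourDataset.yourFlatTable` AS
SELECT id, flatted_data.line, flatted_data.name
FROM `yourProject.yourDataset.yourTable`,
UNNEST(repeated_record) AS flatted_data
```

If the destination table already exists with a matching schema, an INSERT INTO `yourProject.yourDataset.yourFlatTable` (id, line, name) followed by the same SELECT should append rows instead of replacing them. Setting a destination table in the query settings is another way to get the same result without any DDL.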