How to update a table in BigQuery where the names of the fields to update are values in another table
Hi all!
I need some ideas. The problem is as follows:
I have two tables:
Table 1:
+-------+------------+---------+
| ID | field_name | value |
+-------+------------+---------+
| 1 | usd | 10.08 |
| 1 | gross_amt | 52.0 |
| 1 | jpy | 30.05 |
| 2 | usd | 50.0 |
| 2 | eur | 50.0 |
| 3 | real_amt | 210.43 |
| 3 | total | 320 |
| 4 | jpy | 23.45 |
| 4 | name | john |
| 4 | city | utah |
+-------+------------+---------+
Table 2:
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| ID | name | last_name | date1 | country | city | usd | eur | jpy | gross_amt | real_amt | total | ... | field200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| 1 | jane | doe | 19900108 | usa | LA | 9.08 | 0.00 | 29.05 | 50.0 | 52.0 | 900.0 | ... | value200 |
| 2 | lane | smith | 19900108 | usa | LA | 40.8 | 40.0 | 0.00 | 100.0 | 70.0 | 290.0 | ... | value200 |
| 3 | mike | hoffa | 19900108 | usa | SF | 5.05 | 0.00 | 0.00 | 10.0 | 25.0 | 100.0 | ... | value200 |
| 4 | paul | doe | 19900108 | usa | NY | 1.00 | 0.00 | 29.05 | 45.0 | 55.0 | 110.0 | ... | value200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
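I need to update the fields in table 2 with the values from the value column of table 1. The fields to update are named in the field_name column of table 1, and the IDs are the same in both tables.
In addition, the value column in table 1 is of type string, while the columns to update in table 2 have different data types, mostly numeric ones (numeric, int64, float64).
The tables above are only an example. The real table 2 has 200 fields, a single ID in table 1 can modify up to 40 values, and thousands of records are modified every day.
Thanks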
I tried the following two solutions:
Solution 1 (works, but is very slow because there are many records):
DECLARE SQLSCRIPT STRING DEFAULT '';
DECLARE col, val, id STRING;
DECLARE n INT64;
DECLARE i INT64 DEFAULT 1;
SET n= (SELECT COUNT(*) FROM `project.dataset.table1`);
WHILE i <= n DO
SET col = (SELECT col FROM `project.dataset.table1` LIMIT 1);
SET val = (SELECT val FROM `project.dataset.table1` LIMIT 1);
SET id = (SELECT id FROM `project.dataset.table1` LIMIT 1);
SET SQLScript = (SELECT CONCAT('UPDATE `project.dataset.table2` SET ',col,' = ',val,' WHERE id = ','"',id,'"'));
SET i = i + 1;
END WHILE;
EXECUTE IMMEDIATE SQLSCRIPT;
Solution 2 (I can't get it to work; it gives me the following error):
[Error executing the BigQuery query][1]
[1]: https://i.stack.imgur.com/Pv44T.png
EXECUTE IMMEDIATE (SELECT STRING_AGG('UPDATE `project.dataset.table2` SET '||x.col||'="'||x.val||'" WHERE id = "'||x.id||'"', ';')
FROM UNNEST((SELECT ARRAY_AGG(STRUCT(id, col, val))
FROM `project.dataset.table1`)) AS x);
One approach is to create a pivoted (wide-column) view of table_1 and then update table_2 from it.
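To make the pivot step concrete, here is a minimal Python sketch (an illustration only, not BigQuery code) of turning the long rows of table 1 into one wide record per ID. The function name `pivot_rows` and the in-memory tuples are assumptions made for this example; in BigQuery the same reshaping is done with `MAX(IF(...))` and `GROUP BY id`.

```python
from collections import defaultdict


def pivot_rows(rows):
    """Turn (id, field_name, value) rows into {id: {field_name: value}}."""
    wide = defaultdict(dict)
    for row_id, field_name, value in rows:
        wide[row_id][field_name] = value
    return dict(wide)


# A few rows from the question's Table 1 (values are strings there).
table_1 = [
    (1, "usd", "10.08"),
    (1, "gross_amt", "52.0"),
    (1, "jpy", "30.05"),
    (2, "usd", "50.0"),
    (2, "eur", "50.0"),
]

pivoted = pivot_rows(table_1)
print(pivoted[1])
print(pivoted[2])
```

Each ID ends up with only the fields that table 1 lists for it, which is why the update step has to decide what to do with the fields that are missing.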
Create the tables with the sample data:
CREATE OR REPLACE TABLE `project.dataset.table_1` AS
SELECT 1 AS id, 'usd' AS field_name, 10.08 AS value UNION ALL
SELECT 1 AS id, 'gross_amt' AS field_name, 52.0 AS value UNION ALL
SELECT 1 AS id, 'jpy' AS field_name, 30.05 AS value UNION ALL
SELECT 2 AS id, 'usd' AS field_name, 50.0 AS value UNION ALL
SELECT 2 AS id, 'eur' AS field_name, 50.0 AS value UNION ALL
SELECT 3 AS id, 'real_amt' AS field_name, 210.43 AS value UNION ALL
SELECT 3 AS id, 'total' AS field_name, 320.66 AS value UNION ALL
SELECT 4 AS id, 'jpy' AS field_name, 23.45 AS value;
CREATE OR REPLACE TABLE `project.dataset.table_2` AS
SELECT 1 AS id, 9.08 as usd, 0.00 as eur, 29.05 AS jpy, 50.0 AS gross_amt,52.0 as real_amt, 900.0 AS total UNION ALL
SELECT 2, 40.8, 40.0, 0.00, 100.0, 70.0, 290.0 UNION ALL
SELECT 3, 5.05, 0.00, 0.00, 10.0, 25.0, 100.0 UNION ALL
SELECT 4, 1.00, 0.00, 29.05, 45.0, 55.0, 110.0;
Now create the pivoted view first, then use MERGE to update table_2 from it:
BEGIN
  -- Pivot table_1 (rows to columns) into a view.
  -- Fields that an id does not list stay NULL instead of defaulting to 0.0.
  EXECUTE IMMEDIATE '''
    CREATE OR REPLACE VIEW `project.dataset.view_1` AS
    SELECT id, ''' || (SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', value, NULL)) AS " || field_name)
                       FROM `project.dataset.table_1`
                      ) || '''
    FROM `project.dataset.table_1`
    GROUP BY 1
  ''';
  -- Update table_2 from the view.
  -- IFNULL keeps the current table_2 value when table_1 has no entry for that field.
  EXECUTE IMMEDIATE '''
    MERGE `project.dataset.table_2` AS table_2
    USING `project.dataset.view_1` AS view_1
    ON table_2.id = view_1.id
    WHEN MATCHED THEN
      UPDATE SET
  ''' || (SELECT STRING_AGG(DISTINCT field_name || ' = IFNULL(view_1.' || field_name || ', table_2.' || field_name || ')')
          FROM `project.dataset.table_1`
         );
END;
The following is for BigQuery Standard SQL:
EXECUTE IMMEDIATE '''
CREATE TEMP TABLE pivot1 AS
SELECT id, ''' || (
SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', CAST(value AS " || data_type || "), NULL)) AS " || field_name)
FROM `project.dataset.table1`
JOIN (
SELECT column_name, data_type
FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'table2'
) ON field_name = column_name
) || '''
FROM `project.dataset.table1`
GROUP BY id
''';
EXECUTE IMMEDIATE '''
MERGE `project.dataset.table2` AS t2
USING pivot1 AS t1
ON t2.id = t1.id
WHEN MATCHED THEN
UPDATE SET
''' || (
SELECT STRING_AGG(DISTINCT field_name || ' = IFNULL(t1.' || field_name || ', t2.' || field_name || ')')
FROM `project.dataset.table1`
);
SELECT * FROM `project.dataset.table2` ORDER BY id;
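Applied to the sample data in your question (Table 1 and Table 2), this updates only the fields that Table 1 lists for each ID and leaves every other column unchanged.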
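It can help to look at the SQL text that the two `STRING_AGG` expressions produce. The Python sketch below (an illustration only, not part of the BigQuery script) rebuilds the same fragments for two fields. The `column_types` dict is a hypothetical stand-in for the lookup against `INFORMATION_SCHEMA.COLUMNS`, the types in it are assumed, and the fragments are joined with `, ` purely for readability.

```python
def pivot_select_list(column_types):
    """Build the pivot column list, casting each value to its target type."""
    return ", ".join(
        f"MAX(IF(field_name = '{name}', CAST(value AS {dtype}), NULL)) AS {name}"
        for name, dtype in column_types.items()
    )


def merge_set_clause(field_names):
    """Build the UPDATE SET list; IFNULL keeps t2's value when t1 has none."""
    return ", ".join(
        f"{name} = IFNULL(t1.{name}, t2.{name})" for name in field_names
    )


# Hypothetical types; in BigQuery they would come from INFORMATION_SCHEMA.COLUMNS.
column_types = {"usd": "FLOAT64", "city": "STRING"}

select_list = pivot_select_list(column_types)
set_clause = merge_set_clause(column_types)

print(select_list)
print(set_clause)
```

Because the field names are pasted straight into the statement text, this pattern is only reasonable when the field_name values are known to be valid column names of table 2.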