Query and index a table with an unpredictable number of columns in BigQuery

Thank you very much for the answers to my previous question.

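However, when I tried to apply them to my actual task, it was still tricky for me. In this task my table looks like this, except that it may have not just 7 but 10, 1000, or N value columns and N time columns: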

My final result needs to be:

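The "value" column is taken from the columns whose names contain "value_", and the "time" column from the columns whose names contain "time_". The hardest part is the "position in which this value occurs" column: it must hold the position at which that value occurs for the corresponding id (1 for value_1, 2 for value_2, and so on).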

Is there any way to create this result table? Thanks in advance.

Code to create this sample table:

WITH my_dataset AS
 (SELECT '001' as id, 1 as value_1, 'a1' as time_1, 2 as value_2, 'a2' as time_2,3 as value_3, 'a3' as time_3, 4 as value_4, 
                    'a4' as time_4, 5 as value_5, 'a5' as time_5, 6 as value_6, 'a6' as time_6, 7 as value_7, 'a7' as time_7 
  UNION ALL
  SELECT '002', 8, 'a8', 9, 'a9', 10, 'a10', 11, 'a11', 12, 'a12', 13, 'a13', 14, 'a14' 
  UNION ALL
  SELECT '003', 15, 'a15', 16, 'a16', 17, 'a17', 18, 'a18', 19, 'a19', 20, 'a20', 21, 'a21' 
  UNION ALL
  SELECT '004', 22, 'a22', 23, 'a23', 24, 'a24', 25, 'a25', 26, 'a26', 27, 'a27', 28, 'a28'
  UNION ALL
  SELECT '005', 29, 'a29', 30, 'a30', 31, 'a31', 32, 'a32', 33, 'a33', 34, 'a34', 35, 'a35'
  UNION ALL
  SELECT '006', 36, 'a36', 37, 'a37', 38, 'a38', 39, 'a39', 40, 'a40', 41, 'a41', 42, 'a42'
  UNION ALL
  SELECT '007', 43, 'a43', 44, 'a44', 45, 'a45', 46, 'a46', 47, 'a47', 48, 'a48', 49, 'a49')

SELECT * FROM my_dataset 

Consider the approach below:

select * from (
  select id, val, split(col, '_')[offset(0)] as col, split(col, '_')[offset(1)] as pos
  from my_dataset t, unnest([to_json_string(t)]) json,
  unnest(`bqutil.fn.json_extract_keys`(json)) col with offset 
  join unnest(`bqutil.fn.json_extract_values`(json)) val with offset 
  using(offset)
  where starts_with(col, 'value_') or starts_with(col, 'time_')
)
pivot (min(val) for col in ('value', 'time'))         
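
To see what the pivot receives, it can help to run the inner query on its own. The sketch below is only an illustration: the two-pair CTE `two_pairs` and the output column names `column_name` and `kind` are made up for this example, and it assumes the public `bqutil.fn` UDFs used above are reachable from your project. Each row is serialized to JSON, the keys and values are unnested side by side and matched by offset, and the column name is split into its kind ('value' or 'time') and its position suffix.

```sql
-- Hypothetical two-pair sample, kept small so the intermediate rows stay readable.
with two_pairs as (
  select '001' as id, 1 as value_1, 'a1' as time_1, 2 as value_2, 'a2' as time_2
)
select
  id,
  col as column_name,                      -- original column name, e.g. value_1
  split(col, '_')[offset(0)] as kind,      -- part before the underscore: 'value' or 'time'
  split(col, '_')[offset(1)] as pos,       -- part after the underscore, still a string
  val                                      -- cell content as returned by the UDF
from two_pairs t, unnest([to_json_string(t)]) json,
unnest(`bqutil.fn.json_extract_keys`(json)) col with offset
join unnest(`bqutil.fn.json_extract_values`(json)) val with offset
using(offset)
where starts_with(col, 'value_') or starts_with(col, 'time_')
order by pos, kind                         -- pos sorts as text here, not as a number
```

Note that pos is a string at this stage, so with ten or more column pairs you would probably want to cast it to int64 before ordering by it.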

If the pivot query is applied to the sample data in your question, the output is:

Here is a scripting approach, so you can compare performance when querying a large table:

declare n int64;
declare query_str string;
-- Count the JSON keys of one row: 1 id column plus n value/time pairs, so n = (keys - 1) / 2.
set n = (select cast((array_length(regexp_extract_all(to_json_string(t),"\":"))-1)/2 as int64) total_columns from `<project>.<dataset>.<table>` t limit 1);
set query_str = '''SELECT id,value_1 as value, time_1 as time, 1 as position FROM `<project>.<dataset>.<table>`''';

-- Append one SELECT per remaining column pair, from n down to 2.
while n > 1 do
    set query_str = concat(query_str, " union all SELECT id,value_", n, " as value, time_", n ," as time, ", n ," as position FROM `<project>.<dataset>.<table>`");
    set n = (n-1);
end while;

EXECUTE IMMEDIATE query_str;
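
For reference, this is roughly the statement the loop would assemble for a table with three value/time pairs (n = 3). It is a sketch under assumptions: the one-row `my_dataset` CTE stands in for `<project>.<dataset>.<table>` only so the query can be run on its own. The real script needs an actual table, because the string passed to EXECUTE IMMEDIATE cannot see a CTE defined outside it.

```sql
-- Stand-in for the real table, with three value/time pairs (assumption for illustration).
WITH my_dataset AS (
  SELECT '001' AS id, 1 AS value_1, 'a1' AS time_1,
         2 AS value_2, 'a2' AS time_2,
         3 AS value_3, 'a3' AS time_3
)
-- The initial query_str covers position 1 ...
SELECT id, value_1 AS value, time_1 AS time, 1 AS position FROM my_dataset
-- ... and the loop appends the remaining positions from n down to 2.
UNION ALL SELECT id, value_3 AS value, time_3 AS time, 3 AS position FROM my_dataset
UNION ALL SELECT id, value_2 AS value, time_2 AS time, 2 AS position FROM my_dataset
```

Because every branch scans the table once, the generated statement reads the table n times. That is the main cost to weigh against the single-scan JSON approach when comparing the two on large data.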