Shuffle a specific column in a table on BigQuery

I have a table that looks like this:

    id  label
     1  A
     2  A
     3  A
     4  B
     5  C
     6  C
     7  A
     8  A
     9  C
    10  B

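I want to add another column, label_shuffled, that holds the values of the existing label column in shuffled order. It needs to be efficient and fast.

Desired output: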

    id  label  label_shuffled
     1  A      A
     2  A      B
     3  A      C
     4  B      A
     5  C      C
     6  C      A
     7  A      C
     8  A      A
     9  C      B
    10  B      A

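Any suggestions?

One option is to use the window function ROW_NUMBER to number the rows twice, once in their existing order and once in random order, and then self-join on those two numberings: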

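    WITH shuffle AS (
      SELECT
        id,
        label,
        -- Number the rows in whatever order they arrive
        ROW_NUMBER() OVER () AS row_number,
        -- Number the same rows again, this time in random order
        ROW_NUMBER() OVER (ORDER BY RAND()) AS row_number_shuffled
      FROM labels
    )
    SELECT
      l.id,
      l.label,
      s.label AS label_shuffled
    FROM shuffle AS l
    -- Pair each row with the row that drew its number in the random ordering
    JOIN shuffle AS s ON l.row_number = s.row_number_shuffled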
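
To try the ROW_NUMBER-and-join idea without an existing table, a minimal self-contained sketch might look like the following. It assumes the ten sample rows from the question, inlined with UNNEST. The aliases rn and rn_shuffled are illustrative names, and ORDER BY id is used here instead of an empty OVER () so that the base numbering is deterministic.

```sql
-- Sketch only: sample data from the question is inlined for illustration.
WITH labels AS (
  SELECT id, label
  FROM UNNEST([
    STRUCT(1 AS id, 'A' AS label),
    (2, 'A'), (3, 'A'), (4, 'B'), (5, 'C'),
    (6, 'C'), (7, 'A'), (8, 'A'), (9, 'C'), (10, 'B')
  ])
),
numbered AS (
  SELECT
    id,
    label,
    -- Stable numbering, 1..n in id order
    ROW_NUMBER() OVER (ORDER BY id) AS rn,
    -- Random numbering, a random permutation of 1..n
    ROW_NUMBER() OVER (ORDER BY RAND()) AS rn_shuffled
  FROM labels
)
SELECT
  l.id,
  l.label,
  s.label AS label_shuffled
FROM numbered AS l
-- Both numberings cover 1..n exactly once, so every row gets exactly one partner
JOIN numbered AS s ON l.rn = s.rn_shuffled
ORDER BY l.id;
```

Because both numberings are permutations of 1..n, label_shuffled contains exactly the same multiset of values as label (five A, two B, three C in this sample), only in a different order. The result changes on every run, since RAND() is not seeded.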