DeCorrelated SubQueries in Google BigQuery?

Question

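I've been struggling with a problem for a few hours now. I've gone down several rabbit holes and ended up in the realm of decorrelated subqueries, which is frankly beyond me...

I have two tables that I'm trying to pull from, but they have no common column to join on. I need to take a value from table 1, find the closest value in table 2 (that is, the closest lower value), and then pull the related data from table 2.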

table_1

id	score
1	99.983545
2	98.674359
3	97.832475
4	96.184545
5	93.658572
6	89.963544
7	87.427353
8	82.883345

table_2

average_level	percentile
99.743545	99
97.994359	98
97.212485	97
96.987545	96
95.998573	95
88.213584	94
87.837384	93
80.982147	92

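From the two tables above I need to:

Take the id and score
Find the closest average_level to the score
Include the related average_level and percentile

The desired output would look like this...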

id	score	average_level	percentile
1	99.983545	99.743545	99
2	98.674359	97.994359	98
3	97.832475	97.212485	97
4	96.184545	95.998573	95
5	93.658572	88.213584	94
6	89.963544	88.213584	94
7	87.427353	80.982147	92
8	82.883345	80.982147	92

Any help or advice is much appreciated.

Answer 1

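If we call the first table Score and the second average, you can try this.

-- Note: TOP(1) is SQL Server syntax; BigQuery uses LIMIT instead.
-- Abs() picks the nearest average_level in either direction, so the
-- match can lie above the score.
select *
from Score s
inner join average a on a.Percentile = (select top(1) al.Percentile from average al order by Abs(al.average_level - s.score))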
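For comparison, here is a possible BigQuery-flavored sketch of the same idea. Instead of a correlated subquery with TOP(1), it joins each score to every level at or below it and keeps the nearest one with ROW_NUMBER(). The inline CTEs only reproduce the question's sample data so the query can run on its own; the aliases t1, t2, ranked and rn are illustrative names, not part of the original answer.

```sql
-- Sample data from the question, inlined so the query is self-contained.
WITH table_1 AS (
  SELECT 1 AS id, 99.983545 AS score UNION ALL
  SELECT 2, 98.674359 UNION ALL
  SELECT 3, 97.832475 UNION ALL
  SELECT 4, 96.184545 UNION ALL
  SELECT 5, 93.658572 UNION ALL
  SELECT 6, 89.963544 UNION ALL
  SELECT 7, 87.427353 UNION ALL
  SELECT 8, 82.883345
),
table_2 AS (
  SELECT 99.743545 AS average_level, 99 AS percentile UNION ALL
  SELECT 97.994359, 98 UNION ALL
  SELECT 97.212485, 97 UNION ALL
  SELECT 96.987545, 96 UNION ALL
  SELECT 95.998573, 95 UNION ALL
  SELECT 88.213584, 94 UNION ALL
  SELECT 87.837384, 93 UNION ALL
  SELECT 80.982147, 92
),
ranked AS (
  -- Pair each score with every level at or below it, then number the
  -- candidates from nearest (rn = 1) to farthest.
  SELECT
    t1.id,
    t1.score,
    t2.average_level,
    t2.percentile,
    ROW_NUMBER() OVER (
      PARTITION BY t1.id
      ORDER BY t2.average_level DESC
    ) AS rn
  FROM table_1 AS t1
  JOIN table_2 AS t2
    ON t2.average_level <= t1.score
)
-- Keep only the nearest lower-or-equal level for each id.
SELECT id, score, average_level, percentile
FROM ranked
WHERE rn = 1
ORDER BY id;
```

On the sample data this should reproduce the desired output from the question. A score with no lower-or-equal level is dropped by the inner join; switching to a LEFT JOIN would keep it with NULLs in the table_2 columns.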


Answer 2

You can do this by joining the two tables on table_1.score >= table_2.average_level and then taking max(average_level) and max(percentile), which will be the closest lower-or-equal values in table_2, grouped by the fields in table_1:

SELECT TABLE_1.ID, TABLE_1.SCORE, 
MAX(TABLE_2.AVERAGE_LEVEL) AS AVERAGE_LEVEL, 
MAX(TABLE_2.PERCENTILE) AS PERCENTILE
FROM TABLE_1 INNER JOIN TABLE_2
ON TABLE_1.SCORE >= TABLE_2.AVERAGE_LEVEL
GROUP BY TABLE_1.ID, TABLE_1.SCORE
ORDER BY TABLE_1.ID

I've added a fiddle example here, which also includes @Ömer's answer.
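Taking the two MAX values separately returns a matching pair only because percentile rises together with average_level in the sample data. If that were not guaranteed, one way to keep both columns from the same row in BigQuery is to aggregate each candidate as a STRUCT and keep only the highest one. This is a sketch that assumes table_1 and table_2 exist as in the question; the aliases t1, t2, best and matched are names chosen for illustration.

```sql
SELECT
  id,
  score,
  best.average_level,
  best.percentile
FROM (
  SELECT
    t1.id,
    t1.score,
    -- Collect the candidate rows as structs, highest average_level first,
    -- and keep only the first one.
    ARRAY_AGG(
      STRUCT(t2.average_level AS average_level, t2.percentile AS percentile)
      ORDER BY t2.average_level DESC
      LIMIT 1
    )[OFFSET(0)] AS best
  FROM table_1 AS t1
  JOIN table_2 AS t2
    ON t1.score >= t2.average_level
  GROUP BY t1.id, t1.score
) AS matched
ORDER BY id;
```

The join condition and grouping are the same as in the query above; only the way the matching row is picked differs.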


sql

google-bigquery