BigQuery: Selecting the 2nd largest for each user_id

Table:

    | User_ID |  Red | Blue | Green |  Rating |
    |   a     |   23 |  33  |   42  |    99   |
    |   a     |   56 |  45  |   62  |    45   |
    |   a     |   23 |  49  |   28  |    67   |
    |   b     |   39 |  59  |   10  |    87   |
    |   b     |   18 |  28  |   59  |    38   |
    |   b     |   40 |  50  |   38  |    94   |

The schema looks like this. Red, Blue, and Green are RGB values. Rating is how much each user likes the colour.

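I need help with 3 queries:

  1. Identify each user's favourite colour (a: row 1, b: row 6)
  2. Identify each user's second favourite colour (a: row 3, b: row 4)
  3. The sum of the ratings of each user's top 2 favourite colours.

Thanks!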

//Edit:

I tried the following queries:

    SELECT distinct(User_ID), Red, Blue, Green, Rating 
    FROM `test_colour` 
    WHERE Rating = (SELECT MAX(RATING) FROM `test_colour` )
    Group by 1,2,3,4,5 

The above returns only the single row with the highest rating in the whole table, not the top row for each user.

    SELECT distinct(User_ID), Red, Blue, Green, MAX(Rating)
    FROM `test_colour` 
    Group by 1,2,3,4

The above returns all rows.

Below is for BigQuery Standard SQL and answers all 3 of your questions in one shot!

    #standardSQL
    WITH `project.dataset.table` AS (
      SELECT 'a' User_ID, 23 Red, 33 Blue, 42 Green, 99 Rating UNION ALL
      SELECT 'a', 56, 45, 62, 45 UNION ALL
      SELECT 'a', 23, 49, 28, 67 UNION ALL
      SELECT 'b', 39, 59, 10, 87 UNION ALL
      SELECT 'b', 18, 28, 59, 38 UNION ALL
      SELECT 'b', 40, 50, 38, 94
    )
    SELECT User_ID,
      favorites[SAFE_OFFSET(0)] first,
      favorites[SAFE_OFFSET(1)] second,
      favorites[SAFE_OFFSET(0)].Rating + favorites[SAFE_OFFSET(1)].Rating TotalRating
    FROM (
      SELECT User_ID, ARRAY_AGG(STRUCT(Red, Blue, Green, Rating) ORDER BY Rating DESC LIMIT 2) favorites
      FROM `project.dataset.table`
      GROUP BY User_ID
    )

Figuring out how it works can be a good exercise for you :o)
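As a starting point for that exercise, here is a sketch of the inner subquery run on its own. The CTE name `sample` is just a placeholder for your table. `ARRAY_AGG(... ORDER BY Rating DESC LIMIT 2)` builds, per user, an array holding that user's two highest-rated rows as structs; the outer query then picks elements 0 and 1 out of that array. `SAFE_OFFSET` is used so that a user with only one row yields NULL for the second favourite instead of an out-of-bounds error.

```sql
#standardSQL
-- `sample` is a hypothetical stand-in for the real table
WITH sample AS (
  SELECT 'a' User_ID, 23 Red, 33 Blue, 42 Green, 99 Rating UNION ALL
  SELECT 'a', 56, 45, 62, 45 UNION ALL
  SELECT 'a', 23, 49, 28, 67 UNION ALL
  SELECT 'b', 39, 59, 10, 87 UNION ALL
  SELECT 'b', 18, 28, 59, 38 UNION ALL
  SELECT 'b', 40, 50, 38, 94
)
-- One output row per user; favorites is an array of at most 2 structs,
-- sorted from highest to lowest Rating
SELECT User_ID,
  ARRAY_AGG(STRUCT(Red, Blue, Green, Rating) ORDER BY Rating DESC LIMIT 2) favorites
FROM sample
GROUP BY User_ID
```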

Running the full query on the sample data gives the following result:

    Row  User_ID  first.Red  first.Blue  first.Green  first.Rating  second.Red  second.Blue  second.Green  second.Rating  TotalRating
    1    a        23         33          42           99            23          49           28            67             166
    2    b        40         50          38           94            39          59           10            87             181
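If you would rather not work with arrays, the same ratings can also be obtained with a window function. This is only a sketch of an alternative, not the query above: the CTE names `sample` and `ranked` and the column names `rn`, `first_rating` and `second_rating` are made up for illustration. `ROW_NUMBER()` numbers each user's rows from highest to lowest Rating, and the outer query keeps only rows 1 and 2.

```sql
#standardSQL
-- `sample` is a hypothetical stand-in for the real table
WITH sample AS (
  SELECT 'a' User_ID, 23 Red, 33 Blue, 42 Green, 99 Rating UNION ALL
  SELECT 'a', 56, 45, 62, 45 UNION ALL
  SELECT 'a', 23, 49, 28, 67 UNION ALL
  SELECT 'b', 39, 59, 10, 87 UNION ALL
  SELECT 'b', 18, 28, 59, 38 UNION ALL
  SELECT 'b', 40, 50, 38, 94
),
ranked AS (
  -- rn = 1 is the user's favourite, rn = 2 the second favourite
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY Rating DESC) AS rn
  FROM sample
)
SELECT User_ID,
  MAX(IF(rn = 1, Rating, NULL)) AS first_rating,
  MAX(IF(rn = 2, Rating, NULL)) AS second_rating,
  SUM(Rating) AS TotalRating  -- sum over the top 2 rows only
FROM ranked
WHERE rn <= 2
GROUP BY User_ID
ORDER BY User_ID
```

On the sample data this should give 99, 67, 166 for user a and 94, 87, 181 for user b, matching the ratings in the result above. Note that when two rows of a user have the same Rating, `ROW_NUMBER()` breaks the tie arbitrarily unless you add a further column to the `ORDER BY`.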