Presto query: Find the key with maximum value in a map

I have a table:

Name  pets
--------------
Andy  {dog:2, cat:1, bird:4}
John  {tiger:3, elephant:1, fish:2}
Mary  {dog:2, pig:2}

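For each person, I want to find the pet type with the highest count. If there is a tie, the row should be duplicated for each tied pet. The result should look like this: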

Name  max_pet
------------------
Andy  bird
John  tiger
Mary  dog
Mary  pig

Currently, I export the table and do this in Python. But I'm wondering whether I can achieve it with a Presto/SQL query? Thanks!

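There are several ways to do this. One approach is to use UNNEST to convert the map into rows, with one row per map entry. You can then use the rank() window function to rank the pets for each name, and select only the top-ranked items. Because rank() gives tied counts the same rank, ties produce one row per tied pet.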

WITH people (name, pets) AS (
  VALUES
    ('Andy', map_from_entries(array[('dog', 2), ('cat', 1), ('bird', 4)])),
    ('John', map_from_entries(array[('tiger', 3), ('elephant', 1), ('fish', 2)])),
    ('Mary', map_from_entries(array[('dog', 2), ('pig', 2)]))
)
SELECT name, pet AS max_pet
FROM (
    SELECT name, pet, count,
           rank() OVER (PARTITION BY name ORDER BY count DESC) rnk
    FROM people
    CROSS JOIN UNNEST(pets) AS t (pet, count)
)
WHERE rnk = 1;
 name | max_pet 
------+---------
 Andy | bird    
 John | tiger   
 Mary | dog     
 Mary | pig     
(4 rows)

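Using UNNEST is easy to understand, but it works less well if you need to combine it with other operations, or if you have duplicate names: PARTITION BY name would then rank the pets of different people who share a name together.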

Another approach is to convert the map into an array using map_entries(), use filter() to select the pet(s) with a count that equals the maximum count, then use transform() to return only the pet name. At this point, you have an array of the maximum pets. You can then UNNEST it into multiple rows, or keep it as an array for further processing. filter() and transform() take a lambda expression, which is a Presto-specific extension to SQL.

WITH people (name, pets) AS (
  VALUES
    ('Andy', map_from_entries(array[('dog', 2), ('cat', 1), ('bird', 4)])),
    ('John', map_from_entries(array[('tiger', 3), ('elephant', 1), ('fish', 2)])),
    ('Mary', map_from_entries(array[('dog', 2), ('pig', 2)]))
)
SELECT
    name,
    transform(
        filter(
            map_entries(pets),
            e -> e[2] = array_max(map_values(pets))),
        e -> e[1]) AS max_pets
FROM people;
 name |  max_pets  
------+------------
 Andy | [bird]     
 John | [tiger]    
 Mary | [dog, pig] 
(3 rows)
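
If you want one row per pet, as in the question, rather than an array, the array from the second approach can be expanded with UNNEST. The following is a minimal sketch that reuses the sample data and the filter()/transform() expression from above; the aliases t and max_pet are just illustrative names, and the row order of the result is not guaranteed without an ORDER BY.

```sql
WITH people (name, pets) AS (
  VALUES
    ('Andy', map_from_entries(array[('dog', 2), ('cat', 1), ('bird', 4)])),
    ('John', map_from_entries(array[('tiger', 3), ('elephant', 1), ('fish', 2)])),
    ('Mary', map_from_entries(array[('dog', 2), ('pig', 2)]))
)
SELECT name, max_pet
FROM people
-- Expand the array of top pets into one row per pet.
-- A person whose array is empty (e.g. an empty map) produces no rows.
CROSS JOIN UNNEST(
    transform(
        filter(
            map_entries(pets),
            -- keep only entries whose count equals the largest count
            e -> e[2] = array_max(map_values(pets))),
        -- return just the pet name from each remaining entry
        e -> e[1])
) AS t (max_pet);
```

Since the maximum is computed per row from that row's own map, this variant does not depend on name being unique, unlike the rank() version that partitions by name.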