Hive SQL: count items and, if the count equals one, tell which item was used
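I am querying users who use desktop or mobile devices. I want to know whether, within a day, they used only one of the two categories or both, and if they used only one category, which one it was.
I put together the query below, but I am going in circles, because I can't use 'dev2' in the GROUP BY clause before I have computed 'dev_count'.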
SELECT day, user,
       COUNT(DISTINCT CASE
                          WHEN dev = "SmartPhone" OR dev = "Tablet"
                          THEN "Mobile"
                          ELSE "Desktop"
                      END) AS dev_count,
       CASE
           WHEN dev_count = 1
           THEN
               CASE
                   WHEN dev = "SmartPhone" OR dev = "Tablet"
                   THEN "Mobile"
                   ELSE "Desktop"
               END
           ELSE ""
       END AS dev2
FROM table1
WHERE
    ...
GROUP BY day, user, dev2
ORDER BY day;
Is there a way around this?
Can you try this?
I use a subquery to count the devices used, and then in the outer query I use CASE WHEN to determine whether it is mobile or desktop.
SELECT day, user,
       dev_count,
       CASE
           WHEN dev_count = 1
           THEN
               CASE
                   WHEN dev = "SmartPhone" OR dev = "Tablet"
                   THEN "Mobile"
                   ELSE "Desktop"
               END
           ELSE ""
       END AS dev2
FROM
    (SELECT day, user,
            COUNT(CASE
                      WHEN dev = "SmartPhone" OR dev = "Tablet"
                      THEN "Mobile"
                      ELSE "Desktop"
                  END) AS dev_count,
            dev
     FROM table1
     WHERE
         ...
     GROUP BY day, user, dev) rs
ORDER BY day;
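You can use the collect_set() aggregate function to get an array of the UNIQUE device types used. Then you can use array_contains() to check what the array contains and size() to check whether it holds more than one element. If it holds a single element, take array[0].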
select day, user,
       size(device_type_used_array) as dev_count,
       case when size(device_type_used_array) = 1
            then device_type_used_array[0]
            else NULL -- more than one category; put what you want instead of NULL
       end as one_category_used
from
(
    SELECT day, user,
           collect_set(CASE WHEN dev = "SmartPhone" OR
                                 dev = "Tablet"
                            THEN "Mobile"
                            ELSE "Desktop"
                       END) AS device_type_used_array
    FROM table1
    WHERE
        ...
    GROUP BY day, user
) s
order by day;
device_type_used_array contains the unique device types used per user per day. For an array that keeps duplicates, use collect_list() instead; you can use both if necessary.
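To illustrate using both aggregates together, here is a minimal sketch that computes collect_set() and collect_list() side by side over the same derived category. The alias cat and the output column names device_types_unique and device_types_all are made up for this example; table1, day, user and dev are taken from the question, and the elided WHERE filter is left for you to fill in.

```sql
-- Sketch only: cat, device_types_unique and device_types_all are hypothetical names.
SELECT day, user,
       collect_set(cat)       AS device_types_unique, -- each category appears at most once
       collect_list(cat)      AS device_types_all,    -- one entry per input row, duplicates kept
       size(collect_set(cat)) AS dev_count            -- number of distinct categories used
FROM (
    -- Derive the category once so both aggregates share the same expression.
    SELECT day, user,
           CASE WHEN dev = "SmartPhone" OR dev = "Tablet"
                THEN "Mobile"
                ELSE "Desktop"
           END AS cat
    FROM table1
    -- apply the same WHERE filter as in the queries above
) t
GROUP BY day, user
ORDER BY day;
```

Deriving cat in the inner query keeps the CASE expression in one place, and size(device_types_all) would then give the number of rows per user per day if you need that as well.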