SQL: get the number of unique clients per browser
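I am using AWS Athena to parse my Application Load Balancer logs.
I am trying to get a list of browsers together with the number of unique users for each browser.
I managed to get the list, but the user counts are wrong. I don't know how to group the users by IP.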
1 Google Chrome 9000000
2 Apple Safari 8000000
3 Unknown 5000000
4 Mozilla Firefox 2000000
5 Internet Explorer 10000
6 Outlook 10000
7 Opera 88
8 Edge 7
Here is the query:
SELECT DISTINCT
  CASE
    WHEN user_agent LIKE '%edge%' THEN 'Edge'
    WHEN user_agent LIKE '%MSIE%' THEN 'Internet Explorer'
    WHEN user_agent LIKE '%Firefox%' THEN 'Mozilla Firefox'
    WHEN user_agent LIKE '%Chrome%' THEN 'Google Chrome'
    WHEN user_agent LIKE '%Safari%' THEN 'Apple Safari'
    WHEN user_agent LIKE '%Opera%' THEN 'Opera'
    WHEN user_agent LIKE '%Outlook%' THEN 'Outlook'
    ELSE 'Unknown'
  END AS browser,
  COUNT(client_ip) AS Number
FROM alb_logs
WHERE parse_datetime(time,'yyyy-MM-DD''T''HH:mm:ss.SSSSSS''Z')
  BETWEEN parse_datetime('2018-01-01-00:00:00','yyyy-MM-DD-HH:mm:ss')
  AND parse_datetime('2018-07-18-00:00:00','yyyy-MM-DD-HH:mm:ss')
GROUP BY
  CASE
    WHEN user_agent LIKE '%edge%' THEN 'Edge'
    WHEN user_agent LIKE '%MSIE%' THEN 'Internet Explorer'
    WHEN user_agent LIKE '%Firefox%' THEN 'Mozilla Firefox'
    WHEN user_agent LIKE '%Chrome%' THEN 'Google Chrome'
    WHEN user_agent LIKE '%Safari%' THEN 'Apple Safari'
    WHEN user_agent LIKE '%Opera%' THEN 'Opera'
    WHEN user_agent LIKE '%Outlook%' THEN 'Outlook'
    ELSE 'Unknown'
  END
ORDER BY Number DESC
I seem to be missing some kind of GROUP BY client_ip, but the results I get are wrong...
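You need a COUNT(DISTINCT client_ip) aggregate, and you don't need SELECT DISTINCT. COUNT(client_ip) counts every log row with a non-null IP, so it returns the number of requests rather than the number of clients. Like this: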
SELECT CASE WHEN user_agent ... END AS browser, COUNT(DISTINCT client_ip) AS Number
FROM alb_logs
WHERE ...
GROUP BY 1
ORDER BY 2 DESC
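To see the difference between the two aggregates on a small scale, here is a minimal sketch that runs the same query shape against an in-memory SQLite table from Python. SQLite is only a stand-in for Athena here, and the sample rows, the shortened CASE expression, the `requests` and `unique_clients` column aliases and the `browser_counts` helper are all made up for illustration.

```python
import sqlite3

# Hypothetical sample rows (client_ip, user_agent); not real ALB log data.
ROWS = [
    ("10.0.0.1", "Mozilla/5.0 Chrome/67.0 Safari/537.36"),
    ("10.0.0.1", "Mozilla/5.0 Chrome/67.0 Safari/537.36"),
    ("10.0.0.1", "Mozilla/5.0 Chrome/67.0 Safari/537.36"),
    ("10.0.0.2", "Mozilla/5.0 Chrome/67.0 Safari/537.36"),
    ("10.0.0.3", "Mozilla/5.0 Gecko/20100101 Firefox/61.0"),
    ("10.0.0.3", "Mozilla/5.0 Gecko/20100101 Firefox/61.0"),
]

# Same shape as the answer's query, with a shortened CASE expression.
# COUNT(client_ip) counts rows; COUNT(DISTINCT client_ip) counts clients.
QUERY = """
SELECT CASE
         WHEN user_agent LIKE '%Firefox%' THEN 'Mozilla Firefox'
         WHEN user_agent LIKE '%Chrome%'  THEN 'Google Chrome'
         ELSE 'Unknown'
       END AS browser,
       COUNT(client_ip)          AS requests,
       COUNT(DISTINCT client_ip) AS unique_clients
FROM alb_logs
GROUP BY 1
ORDER BY 3 DESC
"""


def browser_counts(rows):
    """Load rows into an in-memory table and return the per-browser counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE alb_logs (client_ip TEXT, user_agent TEXT)")
    conn.executemany("INSERT INTO alb_logs VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for browser, requests, unique_clients in browser_counts(ROWS):
        print(browser, requests, unique_clients)
```

With this sample data the Chrome group has four requests from two distinct IPs, which is the kind of gap that inflated the numbers in the question.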