Determine customers that have spent money at the company for the first time, each month (MySQL)
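I have the table below, and I am trying to determine the number of users who spent money at the company for the first time in each month.
The result I want is a table with new users, month, and year as columns.
Before people downvote this post: I have looked through various posts and cannot find a similar solution to this problem. The code I include below is based on what I managed to piece together from related posts.
Here is the original table: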
+---------------------+-------------+-----------------+
| datetime | customer_id | amount |
+---------------------+-------------+-----------------+
| 2018-03-01 03:00:00 | 3786 | 14 |
+---------------------+-------------+-----------------+
| 2018-03-02 17:00:00 | 5678 | 25 |
+---------------------+-------------+-----------------+
| 2018-08-17 19:00:00 | 5267 | 45 |
+---------------------+-------------+-----------------+
| 2018-08-25 08:00:00 | 3456 | 78 |
+---------------------+-------------+-----------------+
| 2018-08-25 17:00:00 | 3456 | 25 |
+---------------------+-------------+-----------------+
| 2019-05-25 14:00:00 | 3456 | 15 |
+---------------------+-------------+-----------------+
| 2019-07-02 14:00:00 | 88889 | 45 |
+---------------------+-------------+-----------------+
| 2019-08-25 08:00:00 | 1234 | 88 |
+---------------------+-------------+-----------------+
| 2019-08-30 09:31:00 | 1234 | 30 |
+---------------------+-------------+-----------------+
| 2019-08-30 12:00:00 | 9876 | 55 |
+---------------------+-------------+-----------------+
| 2019-09-01 13:00:00 | 88889 | 23 |
+---------------------+-------------+-----------------+
Here is the CREATE statement:
CREATE TABLE IF NOT EXISTS `spend` ( `datetime` datetime NOT NULL, `customer_id` int(11) NOT NULL, `amount` int(11) NOT NULL, PRIMARY KEY (`datetime`)) DEFAULT CHARSET=utf8mb4;
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-01 03:00:00', 3786, 14);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-02 17:00:00', 5678, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-17 19:00:00', 5267, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 08:00:00', 3456, 78);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 17:00:00', 3456, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-05-25 14:00:00', 3456, 15);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-07-02 14:00:00', 88889, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-25 08:00:00', 1234, 88);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 09:31:00', 1234, 30);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 12:00:00', 9876, 55);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-09-01 13:00:00', 88889, 23);
Here is the code I came up with:
SELECT S.datetime, S.customer_id, S.amount
FROM spend S
INNER JOIN
(SELECT customer_id, MIN(datetime) AS first_occurence
FROM spend
GROUP BY customer_id) X
ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence
Here is the resulting table:
+------------------+-------------+-------+
| datetime | customer_id |amount |
+------------------+-------------+-------+
| 01/03/2018 03:00 | 3786 | 14 |
+------------------+-------------+-------+
| 02/03/2018 17:00 | 5678 | 25 |
+------------------+-------------+-------+
| 17/08/2018 19:00 | 5267 | 45 |
+------------------+-------------+-------+
| 25/08/2018 08:00 | 3456 | 78 |
+------------------+-------------+-------+
| 02/07/2019 14:00 | 88889 | 45 |
+------------------+-------------+-------+
| 25/08/2019 08:00 | 1234 | 88 |
+------------------+-------------+-------+
| 30/08/2019 12:00 | 9876 | 55 |
+------------------+-------------+-------+
Here is an example of the table I want:
+-----------+-------+------+
| new_users | month | year |
+-----------+-------+------+
| 2 | 3 | 2018 |
+-----------+-------+------+
| 3 | 8 | 2018 |
+-----------+-------+------+
| 1 | 5 | 2019 |
+-----------+-------+------+
| 1 | 7 | 2019 |
+-----------+-------+------+
| 3 | 8 | 2019 |
+-----------+-------+------+
| 1 | 9 | 2019 |
+-----------+-------+------+
You're off to the right start. Now use that as a subquery to get the counts by month.
SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
FROM (
SELECT S.datetime, S.customer_id, S.amount
FROM spend S
INNER JOIN
(SELECT customer_id, MIN(datetime) AS first_occurence
FROM spend
GROUP BY customer_id) X
ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence
) AS x
GROUP BY month, year
ORDER BY year, month
Actually, you don't even need the join in the subquery, since you are not using the amount of the first purchase in the final result.
SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
FROM (
SELECT customer_id, MIN(datetime) AS datetime
FROM spend
GROUP BY customer_id
) AS x
GROUP BY month, year
ORDER BY year, month
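You don't need a two-level-deep subquery. You can use MIN() to find when each customer first spent money, then extract YEAR() and MONTH() from that minimum datetime value to count the users: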
SELECT
YEAR(min_dt) y,
MONTH(min_dt) m,
COUNT(*) AS new_customers
FROM
(
SELECT customer_id, MIN(datetime) AS min_dt
FROM spend
GROUP BY customer_id
) t
GROUP BY y, m
ORDER BY y, m
Result:
| y | m | new_customers |
| ---- | --- | ------------- |
| 2018 | 3 | 2 |
| 2018 | 8 | 2 |
| 2019 | 7 | 1 |
| 2019 | 8 | 2 |
Using the ROW_NUMBER() window function (available in MySQL 8.0 and later):
select
count(*) new_users,
month(t.datetime) month,
year(t.datetime) year
from (
select *,
row_number() over (partition by customer_id order by datetime) rn
from spend
) t
where t.rn = 1
group by year, month
order by year, month
See the demo with the sample data.
Results:
| new_users | month | year |
| --------- | ----- | ---- |
| 2 | 3 | 2018 |
| 2 | 8 | 2018 |
| 1 | 7 | 2019 |
| 2 | 8 | 2019 |
You can also do this:
select
count(*) new_users,
month(datetime) month,
year(datetime) year
from spend
where datetime in (select min(datetime) from spend group by customer_id)
group by year, month
order by year, month;
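One caveat on the IN version: it matches on datetime alone, which is safe here only because datetime is the primary key of spend and therefore unique. As a minimal sketch, assuming a table where two customers could share a timestamp (not the case in the schema above), matching on the (customer_id, datetime) pair avoids counting a later purchase that happens to coincide with another customer's first one:

```sql
-- Sketch only: assumes `datetime` might not be unique across customers.
-- Pairing customer_id with its own MIN(datetime) keeps each match tied
-- to the customer it belongs to.
SELECT
    COUNT(DISTINCT customer_id) AS new_users,  -- DISTINCT guards against tied first rows
    MONTH(datetime) AS month,
    YEAR(datetime) AS year
FROM spend
WHERE (customer_id, datetime) IN (
    SELECT customer_id, MIN(datetime)
    FROM spend
    GROUP BY customer_id
)
GROUP BY YEAR(datetime), MONTH(datetime)
ORDER BY year, month;
```

On the sample data this returns the same four rows as the queries above.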