Find the time difference for every user under a condition in MySQL 5.7
Here is my fiddle: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=7c549a3de0c8002ec43381462ba6a801
Suppose I have data like this:
CREATE TABLE test (
ID INT,
user_id INT,
createdAt DATE,
status_id INT
);
INSERT INTO test VALUES
(1, 12, '2020-01-01', 4),
(2, 12, '2020-01-03', 7),
(3, 12, '2020-01-06', 7),
(4, 13, '2020-01-02', 5),
(5, 13, '2020-01-03', 6),
(6, 14, '2020-03-03', 8),
(7, 13, '2020-03-04', 4),
(8, 15, '2020-04-04', 7),
(9, 14, '2020-03-02', 6),
(10, 14, '2020-03-10', 5),
(11, 13, '2020-04-10', 8);
select * from test
order by createdAt;
This is the table after SELECT *:
+----+---------+------------+-----------+
| ID | user_id | createdAt | status_id |
+----+---------+------------+-----------+
| 1 | 12 | 2020-01-01 | 4 |
| 4 | 13 | 2020-01-02 | 5 |
| 2 | 12 | 2020-01-03 | 7 |
| 5 | 13 | 2020-01-03 | 6 |
| 3 | 12 | 2020-01-06 | 7 |
| 9 | 14 | 2020-03-02 | 6 |
| 6 | 14 | 2020-03-03 | 8 |
| 7 | 13 | 2020-03-04 | 4 |
| 10 | 14 | 2020-03-10 | 5 |
| 8 | 15 | 2020-04-04 | 7 |
| 11 | 13 | 2020-04-10 | 8 |
+----+---------+------------+-----------+
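ID is the id of the transaction, user_id is the id of the user who made the transaction, createdAt is the date the transaction happened, and status_id is the status of the transaction (if status_id is 7, the transaction was denied or not approved).
In this case, I want to find the time difference between the approved transactions of each repeat user within the time range from '2020-02-01' to '2020-04-01'. A repeat user is a user who made a transaction before the end of the time range and made at least one more transaction within the time range. Here, that means a user who made an approved transaction before '2020-04-01' and at least one more approved transaction between '2020-02-01' and '2020-04-01'.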
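To make the repeat-user rule concrete, here is a minimal Python sketch that applies it to the sample rows outside MySQL. It assumes one reading of the rule: at least two approved transactions before '2020-04-01', at least one of them on or after '2020-02-01'. The names ROWS, REJECTED and repeat_users are illustrative only and are not part of the schema.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (ID, user_id, createdAt, status_id).
ROWS = [
    (1, 12, date(2020, 1, 1), 4),
    (2, 12, date(2020, 1, 3), 7),
    (3, 12, date(2020, 1, 6), 7),
    (4, 13, date(2020, 1, 2), 5),
    (5, 13, date(2020, 1, 3), 6),
    (6, 14, date(2020, 3, 3), 8),
    (7, 13, date(2020, 3, 4), 4),
    (8, 15, date(2020, 4, 4), 7),
    (9, 14, date(2020, 3, 2), 6),
    (10, 14, date(2020, 3, 10), 5),
    (11, 13, date(2020, 4, 10), 8),
]

REJECTED = 7                     # status_id 7 means denied / not approved
RANGE_START = date(2020, 2, 1)
RANGE_END = date(2020, 4, 1)


def repeat_users(rows):
    """Return the sorted user_ids that satisfy the assumed repeat-user rule."""
    before_end = defaultdict(int)  # approved transactions before RANGE_END
    in_range = defaultdict(int)    # approved transactions inside the range
    for _id, user_id, created_at, status_id in rows:
        if status_id == REJECTED or created_at >= RANGE_END:
            continue
        before_end[user_id] += 1
        if created_at >= RANGE_START:
            in_range[user_id] += 1
    return sorted(u for u in before_end if before_end[u] >= 2 and in_range[u] >= 1)


print(repeat_users(ROWS))  # [13, 14]
```

User 12 drops out because only one of its transactions is approved, and user 15 has no approved transaction at all.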
Based on that explanation, I used this query:
SELECT SUM(transactions) AS transactions,
MIN(`MIN`) AS `MIN`,
MAX(`MAX`) AS `MAX`,
SUM(total) / SUM(transactions) AS `AVG`
FROM (
SELECT user_id,
COUNT(*) AS transactions,
MIN(diff) AS `MIN`,
MAX(diff) AS `MAX`,
SUM(diff) AS total
FROM (
SELECT user_id, DATEDIFF((SELECT MIN(t2.createdAt)
FROM test t2
WHERE t2.user_id = t1.user_id
AND t1.createdAt < t2.createdAt
AND t2.status_id in (4, 5, 6, 8)
), t1.createdAt) AS diff
FROM test t1
WHERE status_id in (4, 5, 6, 8)
HAVING SUM(status_id != 7 and createdAt < '2020-04-01') > 1
AND SUM(status_id != 7 AND createdAt BETWEEN '2020-02-01'
AND '2020-04-01')
) DiffTable
WHERE diff IS NOT NULL
GROUP BY user_id
) totals
It says:
In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'db_314931870.t1.user_id'; this is incompatible with sql_mode=only_full_group_by
Expected result:
+-----+-----+---------+
| MIN | MAX | AVG |
+-----+-----+---------+
| 1 | 61 | 21,6667 |
+-----+-----+---------+
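Explanation: the min (minimum) is a difference of 1 day, which comes from user_id 14 making an approved transaction on '2020-03-02' and another approved transaction on '2020-03-03'. The max (maximum) is 61 days, which comes from user_id 13 making an approved transaction on '2020-01-03' and another approved transaction on '2020-03-04'. The average time difference is the sum of all time differences in the time range divided by the count of transactions that happened in the time range.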
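The two gaps quoted in the explanation can be checked with plain date arithmetic. This is only a sanity check in Python, not part of the MySQL solution; days_between is a hypothetical helper that mirrors DATEDIFF(later, earlier).

```python
from datetime import date


def days_between(earlier, later):
    """Whole days from earlier to later, like MySQL DATEDIFF(later, earlier)."""
    return (later - earlier).days


min_gap = days_between(date(2020, 3, 2), date(2020, 3, 3))  # user_id 14
max_gap = days_between(date(2020, 1, 3), date(2020, 3, 4))  # user_id 13
print(min_gap, max_gap)  # 1 61
```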
SELECT MIN(DATEDIFF(t2.createdAt, t1.createdAt)) min_diff,
MAX(DATEDIFF(t2.createdAt, t1.createdAt)) max_diff,
AVG(DATEDIFF(t2.createdAt, t1.createdAt)) avg_diff
FROM test t1
JOIN test t2 ON t1.user_id = t2.user_id
AND t1.createdAt < t2.createdAt
AND 7 NOT IN (t1.status_id, t2.status_id)
JOIN (SELECT t3.user_id
FROM test t3
WHERE t3.status_id != 7
GROUP BY t3.user_id
HAVING SUM(t3.createdAt < '2020-04-01')
AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id
WHERE NOT EXISTS (SELECT NULL
FROM test t5
WHERE t1.user_id = t5.user_id
AND t5.status_id != 7
AND t1.createdAt < t5.createdAt
AND t5.createdAt < t2.createdAt)
The fiddle has a short explanation.
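As a cross-check of what the join query above computes, the following Python sketch reproduces its logic on the sample rows. It is a rough mirror rather than a translation: it assumes at most one approved transaction per user per day (true for the sample), and the names approved_dates, qualifying_users and consecutive_gaps are made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (ID, user_id, createdAt, status_id).
ROWS = [
    (1, 12, date(2020, 1, 1), 4),
    (2, 12, date(2020, 1, 3), 7),
    (3, 12, date(2020, 1, 6), 7),
    (4, 13, date(2020, 1, 2), 5),
    (5, 13, date(2020, 1, 3), 6),
    (6, 14, date(2020, 3, 3), 8),
    (7, 13, date(2020, 3, 4), 4),
    (8, 15, date(2020, 4, 4), 7),
    (9, 14, date(2020, 3, 2), 6),
    (10, 14, date(2020, 3, 10), 5),
    (11, 13, date(2020, 4, 10), 8),
]

REJECTED = 7
RANGE_START = date(2020, 2, 1)
RANGE_END = date(2020, 4, 1)


def approved_dates(rows):
    """Map user_id -> sorted list of approved transaction dates."""
    dates = defaultdict(list)
    for _id, user_id, created_at, status_id in rows:
        if status_id != REJECTED:
            dates[user_id].append(created_at)
    return {user: sorted(ds) for user, ds in dates.items()}


def qualifying_users(dates_by_user):
    """Mirror of subquery t4: an approved row before RANGE_END and one inside the range."""
    return {
        user
        for user, ds in dates_by_user.items()
        if any(d < RANGE_END for d in ds)
        and any(RANGE_START <= d <= RANGE_END for d in ds)  # BETWEEN is inclusive
    }


def consecutive_gaps(rows):
    """Day gaps between consecutive approved transactions of qualifying users."""
    dates_by_user = approved_dates(rows)
    gaps = []
    for user in sorted(qualifying_users(dates_by_user)):
        ds = dates_by_user[user]
        gaps.extend((later - earlier).days for earlier, later in zip(ds, ds[1:]))
    return gaps


gaps = consecutive_gaps(ROWS)
print(gaps)                                         # [1, 61, 37, 1, 7]
print(min(gaps), max(gaps), sum(gaps) / len(gaps))  # 1 61 21.4
```

On the sample data the minimum and maximum match the expected 1 and 61. The mean of these five gaps is 21.4, which is close to but not the same as the 21,6667 in the expected result, so which gaps belong in the average is worth confirming against the fiddle.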