Selecting columns from a subset based on the max of another column in the subset in MySQL
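I'm a biologist and a MySQL (version 5.7.13) beginner, and I'm currently facing a task I can't solve. I have a table that records sightings of individuals together with the time of each sighting. An excerpt of the data looks like this: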
Table "tblSightings"
+---------------+---------+-----------+---------------------+
| id_individual | project | id_survey | Surveydatetime      |
+---------------+---------+-----------+---------------------+
| A             | 1       | S1        | 2016-11-18 15:54:00 |
| B             | 1       | S1        | 2016-11-18 15:54:00 |
| C             | 1       | S1        | 2016-11-18 15:54:00 |
| A             | 1       | S2        | 2016-11-06 13:33:00 |
| B             | 1       | S2        | 2016-11-06 13:33:00 |
| X             | 1       | S2        | 2016-11-06 13:33:00 |
| A             | 2       | S3        | 2015-05-01 12:48:00 |
+---------------+---------+-----------+---------------------+
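What I want is a query that lists the most recent sighting of each individual (the highest Surveydatetime per id_individual + project), together with the corresponding id_survey and all other individuals that were sighted with it in that survey (GROUP_CONCAT(id_individual)). The expected result for the sample data here would be: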
+---------------+---------+---------------+------------+---------------------+
| id_individual | project | id_survey     | associates | latest              |
+---------------+---------+---------------+------------+---------------------+
| A             | 1       | S1            | B C        | 2016-11-18 15:54:00 |
| B             | 1       | S1            | A C        | 2016-11-18 15:54:00 |
| C             | 1       | S1            | A B        | 2016-11-18 15:54:00 |
| X             | 1       | S2            | A B        | 2016-11-06 13:33:00 |
| A             | 2       | S3            |            | 2015-05-01 12:48:00 |
+---------------+---------+---------------+------------+---------------------+
I did figure out how to get the latest Surveydatetime for each individual using:
SELECT
    id_individual,
    project,
    MAX(Surveydatetime) AS latest
FROM tblSightings
GROUP BY id_individual, project;
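But I can't figure out how to get the id_survey that corresponds to the "latest" column, and so I also can't work out how to GROUP_CONCAT all the id_individual values from that sighting into the associates column of the desired result. Including id_survey in the SELECT doesn't work, because I then have to put it in the GROUP BY as well, which again produces multiple rows per individual. Most answers I have found so far for "max of subsets" use an INNER JOIN with a SELECT statement, but I just can't get that to work...
Any help is much appreciated! Thanks!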
Try this:
Select
    t2.id_individual, t2.project, t2.survey id_survey,
    (
        -- all other individuals seen in the same project and survey
        Select GROUP_CONCAT(tt.id_individual)
        From tblsightings tt
        Where tt.project = t2.project and tt.id_survey = t2.survey and tt.id_individual <> t2.id_individual
    ) associates,
    t2.maxdate latest
From
(
    Select t1.project, t1.id_individual, maxdate,
        (
            -- survey in which the latest sighting was recorded
            Select id_survey
            From tblsightings tt
            Where tt.project = t1.project and tt.id_individual = t1.id_individual and tt.surveydatetime = t1.maxdate
        ) survey
    From
    (
        -- latest sighting per project and individual
        Select project, id_individual, max(surveydatetime) maxdate
        From tblsightings t1
        Group by project, id_individual
    ) t1
) t2
Order by t2.project, t2.id_individual;
The data I used:
CREATE TABLE tblsightings
(
    id_individual varchar(100),
    surveydatetime varchar(100),
    id_survey varchar(100),
    project varchar(100)
);
INSERT INTO tblsightings (id_individual,surveydatetime,id_survey,project) VALUES ("A","2016-11-18 15:54:00","S1","1");
INSERT INTO tblsightings (id_individual,surveydatetime,id_survey,project) VALUES ("B","2016-11-18 15:54:00","S1","1");
INSERT INTO tblsightings (id_individual,surveydatetime,id_survey,project) VALUES ("C","2016-11-18 15:54:00","S1","1");
INSERT INTO tblsightings (id_individual,surveydatetime,id_survey,project) VALUES ("A","2016-11-06 13:33:00","S2","1");
INSERT INTO tblsightings (id_individual,surveydatetime,id_survey,project) VALUES ("B","2016-11-06 13:33:00","S2","1");
INSERT INTO tblsightings (id_individual,surveydatetime,id_survey,project) VALUES ("X","2016-11-06 13:33:00","S2","1");
INSERT INTO tblsightings (id_individual,surveydatetime,id_survey,project) VALUES ("A","2015-05-01 12:48:00","S3","2");
Here is one way to write this query:
SELECT t1.id_individual, t1.project, ts.id_survey, t1.latest,
       GROUP_CONCAT(t2.id_individual) AS associates
FROM tblSightings ts
INNER JOIN
    ( SELECT
          id_individual,
          project, MAX(Surveydatetime) AS latest
      FROM tblSightings
      GROUP BY id_individual, project
    ) t1
    ON t1.id_individual = ts.id_individual
    AND t1.project = ts.project
    AND t1.latest = ts.Surveydatetime
LEFT JOIN tblSightings t2
    ON ts.id_survey = t2.id_survey
    AND ts.project = t2.project
    AND t1.latest = t2.Surveydatetime
    AND t1.id_individual != t2.id_individual
GROUP BY t1.id_individual, t1.project, ts.id_survey, t1.latest
ORDER BY t1.latest DESC, t1.project, t1.id_individual, ts.id_survey;
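Explanation:
To get the result in the given format, we use the same table three times, combined by two joins. The first is an INNER JOIN against the grouped subquery, which fetches the id_survey of the record with the highest timestamp per individual per project. The second determines whether a given individual has any associates. Since there may be no associates at all (as with S3), we use a LEFT JOIN here instead. We also make sure that this LEFT JOIN only matches id_individual values that differ from the individual whose record is being processed, but that belong to the same project and survey.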
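One detail worth noting: GROUP_CONCAT separates values with a comma by default, while the expected output in the question shows associates separated by spaces. The following is a minimal sketch of the same join with an explicit ORDER BY and SEPARATOR inside GROUP_CONCAT. It assumes the tblSightings table and column names from the question, and that a space-separated, alphabetically ordered list is what is wanted.

```sql
-- Sketch only: same joins as the query above, with the associates list
-- sorted and joined by a single space instead of the default comma.
SELECT t1.id_individual, t1.project, ts.id_survey,
       GROUP_CONCAT(t2.id_individual
                    ORDER BY t2.id_individual
                    SEPARATOR ' ') AS associates,
       t1.latest
FROM tblSightings ts
INNER JOIN
    ( -- latest sighting per individual and project
      SELECT id_individual, project, MAX(Surveydatetime) AS latest
      FROM tblSightings
      GROUP BY id_individual, project
    ) t1
    ON t1.id_individual = ts.id_individual
    AND t1.project = ts.project
    AND t1.latest = ts.Surveydatetime
LEFT JOIN tblSightings t2
    -- other individuals in the same survey, project and timestamp
    ON ts.id_survey = t2.id_survey
    AND ts.project = t2.project
    AND t1.latest = t2.Surveydatetime
    AND t1.id_individual != t2.id_individual
GROUP BY t1.id_individual, t1.project, ts.id_survey, t1.latest
ORDER BY t1.latest DESC, t1.project, t1.id_individual;
```

For an individual with no associates (A in project 2 in the sample data), GROUP_CONCAT returns NULL rather than an empty string. If an empty string is preferred, the call can be wrapped in COALESCE(..., '').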