SQLite 5 tables, Join, Sum, GroupBy
I'm struggling with a problem I can't solve. I have an SQLite database with five tables and the following columns:
Tab1 = {Job_ID, Company_ID, Source_ID}
Tab1_Category = {Job_ID, Category_ID}
Category = {ID, First_level, Second_level}
Tab2 = {Job_ID, Log_Date, Clicks, Applications}
Source = {ID, Name}
I created a sample database:
CREATE TABLE TAB1 (
`job_id` INTEGER,
`company_id` INTEGER,
`source_id` INTEGER
);
INSERT INTO TAB1
(`job_id`, `company_id`, `source_id`)
VALUES
('1', '222', '2'),
('2', '222', '1'),
('3', '222', '1'),
('4', '222', '1'),
('5', '255', '3');
CREATE TABLE TAB1_CATEGORY (
`job_id` INTEGER,
`category_id` INTEGER
);
INSERT INTO TAB1_CATEGORY
(`job_id`, `category_id`)
VALUES
('1', '31'),
('2', '36'),
('3', '33'),
('3', '35'),
('4', '32'),
('4', '31'),
('5', '34');
CREATE TABLE CATEGORY (
`id` INTEGER,
`first_level` VARCHAR(3),
`second_level` VARCHAR(3)
);
INSERT INTO CATEGORY
(`id`, `first_level`, `second_level`)
VALUES
('30', 'sss', 'aaa'),
('31', 'sss', 'aaa'),
('32', 'sss', 'bbb'),
('33', 'ggg', 'ccc'),
('34', 'ggg', 'ddd'),
('35', 'ggg', 'eee'),
('36', 'hhh', 'fff');
CREATE TABLE SOURCE (
`id` INTEGER,
`name` VARCHAR(3)
);
INSERT INTO SOURCE
(`id`, `name`)
VALUES
('1', 'mmm'),
('2', 'nnn'),
('3', 'ooo');
CREATE TABLE TAB2 (
`job_id` INTEGER,
`log_date` VARCHAR(10),
`clicks` INTEGER,
`applications` INTEGER
);
INSERT INTO TAB2
(`job_id`, `log_date`, `clicks`, `applications`)
VALUES
('1', '01-01-1999', '6', '2'),
('1', '02-01-1999', '7', '3'),
('1', '03-01-1999', '9', '1'),
('2', '02-01-1999', '4', '1'),
('2', '05-01-1999', '8', '2'),
('3', '03-01-1999', '9', '0'),
('4', '05-01-1999', '5', '3'),
('4', '06-01-1999', '4', '1'),
('5', '01-01-1999', '1', '0'),
('5', '03-01-1999', '3', '1');
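I need a single query that returns the following:
- a list of all Job_ID (Tab1) and Company_ID (Tab1) where First_level (from table Category) is "ggg" or "sss" and Name (from table Source) is "mmm"
- the sum of clicks and the sum of applications (Tab2) for each Job_ID
- the number of distinct Second_level values (from table Category)
- the sum of applications for each Company_ID (one Company_ID can have many Job_IDs)
This is what I have so far, but it doesn't work the way I want: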
SELECT t1.job_id, t1.company_id,
SUM(t2.clicks), SUM(t2.applications), COUNT(DISTINCT c.second_level)
FROM TAB1 t1
JOIN SOURCE s ON s.id = t1.source_id
JOIN TAB1_CATEGORY tc ON t1.job_id = tc.job_id
JOIN CATEGORY c ON tc.category_id = c.id
JOIN TAB2 t2 ON t1.job_id = t2.job_id
WHERE c.first_level IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.job_id
What I get looks like the sum of all clicks/applications rather than the sum per job_id:
job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level)
---|---|---|---|---
3 | 222 | 18 | 0 | 2
4 | 222 | 18 | 8 | 2
This is what I want to get:
job_id | company_id | SUM(t2.clicks) | SUM(t2.applications) | COUNT(DISTINCT c.second_level) | Total Appl per company
---|---|---|---|---|---
3 | 222 | 9 | 0 | 2 | 4
4 | 222 | 9 | 4 | 2 | 4
First, you must aggregate inside TAB2 and only then join (use INNER joins).
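To illustrate why the order matters, here is a minimal sketch using Python's standard-library sqlite3 module (Python is only a convenient way to run the SQL; it is not part of the question). It loads just the TAB1_CATEGORY and TAB2 rows for jobs 3 and 4 from the sample data and compares joining the raw TAB2 rows with aggregating TAB2 on its own. The variable names `fanned_out` and `pre_aggregated` are illustrative.

```python
import sqlite3

# In-memory database with only the rows needed to show the effect
# (jobs 3 and 4 from the sample data in the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TAB1_CATEGORY (job_id INTEGER, category_id INTEGER);
INSERT INTO TAB1_CATEGORY VALUES (3, 33), (3, 35), (4, 32), (4, 31);
CREATE TABLE TAB2 (job_id INTEGER, log_date VARCHAR(10),
                   clicks INTEGER, applications INTEGER);
INSERT INTO TAB2 VALUES
  (3, '03-01-1999', 9, 0),
  (4, '05-01-1999', 5, 3),
  (4, '06-01-1999', 4, 1);
""")

# Joining the raw TAB2 rows: each TAB2 row is repeated once per
# category row of the same job, so the sums are multiplied.
fanned_out = conn.execute("""
    SELECT tc.job_id, COUNT(*), SUM(t2.clicks), SUM(t2.applications)
    FROM TAB1_CATEGORY tc
    JOIN TAB2 t2 ON t2.job_id = tc.job_id
    GROUP BY tc.job_id
    ORDER BY tc.job_id
""").fetchall()

# Aggregating TAB2 on its own: one row per job, nothing is multiplied.
pre_aggregated = conn.execute("""
    SELECT job_id, SUM(clicks), SUM(applications)
    FROM TAB2
    GROUP BY job_id
    ORDER BY job_id
""").fetchall()

print(fanned_out)      # [(3, 2, 18, 0), (4, 4, 18, 8)]
print(pre_aggregated)  # [(3, 9, 0), (4, 9, 4)]
```

Jobs 3 and 4 each have two category rows, so every TAB2 row is counted twice in the first query. That reproduces the 18 / 8 from the question, while the pre-aggregated sums are the expected 9 / 4.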
You also need the SUM() window function for the column Total Appl per company:
SELECT t1.JOB_ID, t1.COMPANY_ID,
t2.total_clicks, t2.total_apps,
COUNT(DISTINCT c.SECOND_LEVEL) count_second_level,
SUM(t2.total_apps) OVER (PARTITION BY t1.COMPANY_ID) [Total Appl per company]
FROM TAB1 t1
INNER JOIN SOURCE s ON s.ID = t1.SOURCE_ID
INNER JOIN TAB1_CATEGORY tc ON t1.JOB_ID = tc.JOB_ID
INNER JOIN CATEGORY c ON tc.CATEGORY_ID = c.ID
INNER JOIN (
SELECT JOB_ID, SUM(CLICKS) total_clicks, SUM(APPLICATIONS) total_apps
FROM TAB2
GROUP BY JOB_ID
) t2 ON t1.JOB_ID = t2.JOB_ID
WHERE c.FIRST_LEVEL IN ('ggg', 'sss') AND s.NAME ='mmm'
GROUP BY t1.JOB_ID, t1.COMPANY_ID, t2.total_clicks, t2.total_apps
See the demo.
Results:
> job_id | company_id | total_clicks | total_apps | count_second_level | Total Appl per company
> -----: | ---------: | -----------: | ---------: | -----------------: | ---------------------:
> 3 | 222 | 9 | 0 | 2 | 4
> 4 | 222 | 9 | 4 | 2 | 4
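Window functions were added in SQLite 3.25.0. If you are stuck on an older build, the per-company total can be computed with a second aggregate instead. What follows is a sketch of that variant, not the query above: it wraps the per-job aggregation in a common table expression and joins it to a per-company sum. The names `per_job`, `per_company` and `total_appl_per_company` are my own, and Python's standard-library sqlite3 module is used only to run the SQL against the sample data.

```python
import sqlite3

# Sample data from the question, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TAB1 (job_id INTEGER, company_id INTEGER, source_id INTEGER);
INSERT INTO TAB1 VALUES (1,222,2),(2,222,1),(3,222,1),(4,222,1),(5,255,3);
CREATE TABLE TAB1_CATEGORY (job_id INTEGER, category_id INTEGER);
INSERT INTO TAB1_CATEGORY VALUES
  (1,31),(2,36),(3,33),(3,35),(4,32),(4,31),(5,34);
CREATE TABLE CATEGORY (id INTEGER, first_level VARCHAR(3),
                       second_level VARCHAR(3));
INSERT INTO CATEGORY VALUES
  (30,'sss','aaa'),(31,'sss','aaa'),(32,'sss','bbb'),(33,'ggg','ccc'),
  (34,'ggg','ddd'),(35,'ggg','eee'),(36,'hhh','fff');
CREATE TABLE SOURCE (id INTEGER, name VARCHAR(3));
INSERT INTO SOURCE VALUES (1,'mmm'),(2,'nnn'),(3,'ooo');
CREATE TABLE TAB2 (job_id INTEGER, log_date VARCHAR(10),
                   clicks INTEGER, applications INTEGER);
INSERT INTO TAB2 VALUES
  (1,'01-01-1999',6,2),(1,'02-01-1999',7,3),(1,'03-01-1999',9,1),
  (2,'02-01-1999',4,1),(2,'05-01-1999',8,2),(3,'03-01-1999',9,0),
  (4,'05-01-1999',5,3),(4,'06-01-1999',4,1),(5,'01-01-1999',1,0),
  (5,'03-01-1999',3,1);
""")

# per_job: same joins and filter as the answer, TAB2 aggregated first.
# per_company: sums total_apps over the jobs that survived the filter,
# which stands in for SUM(...) OVER (PARTITION BY company_id).
QUERY = """
WITH per_job AS (
    SELECT t1.job_id, t1.company_id, t2.total_clicks, t2.total_apps,
           COUNT(DISTINCT c.second_level) AS count_second_level
    FROM TAB1 t1
    INNER JOIN SOURCE s ON s.id = t1.source_id
    INNER JOIN TAB1_CATEGORY tc ON tc.job_id = t1.job_id
    INNER JOIN CATEGORY c ON c.id = tc.category_id
    INNER JOIN (
        SELECT job_id, SUM(clicks) AS total_clicks,
               SUM(applications) AS total_apps
        FROM TAB2
        GROUP BY job_id
    ) t2 ON t2.job_id = t1.job_id
    WHERE c.first_level IN ('ggg', 'sss') AND s.name = 'mmm'
    GROUP BY t1.job_id, t1.company_id, t2.total_clicks, t2.total_apps
),
per_company AS (
    SELECT company_id, SUM(total_apps) AS total_appl_per_company
    FROM per_job
    GROUP BY company_id
)
SELECT j.job_id, j.company_id, j.total_clicks, j.total_apps,
       j.count_second_level, pc.total_appl_per_company
FROM per_job j
INNER JOIN per_company pc ON pc.company_id = j.company_id
ORDER BY j.job_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# (3, 222, 9, 0, 2, 4)
# (4, 222, 9, 4, 2, 4)
```

As with the window-function version, the per-company total only covers the jobs that pass the WHERE filter. Job 2 belongs to company 222 but is in category 'hhh', so its applications are not included in the 4.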