Reducing row count of CROSS JOIN + LEFT JOIN where there is no linkage
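I have been struggling with this for a while. I have a SQLFiddle with approximately the same contents as this question.
I have three tables: items, profiles, and a linking table, posts, which holds the keys of the first two tables. Here is the schema with sample data: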
create table items (
item_id int unsigned primary key auto_increment,
title varchar(255)
);
insert into items (item_id, title) VALUES(1, 'Item One');
insert into items (item_id, title) VALUES(2, 'Item Two');
insert into items (item_id, title) VALUES(3, 'Item Three');
insert into items (item_id, title) VALUES(4, 'Item Four');
insert into items (item_id, title) VALUES(5, 'Item Five');
create table profiles (
profile_id int unsigned primary key auto_increment,
profile_name varchar(255)
);
insert into profiles (profile_id, profile_name) VALUES(1, 'Bob');
insert into profiles (profile_id, profile_name) VALUES(2, 'Mark');
insert into profiles (profile_id, profile_name) VALUES(3, 'Nancy');
create table posts (
post_id int unsigned primary key auto_increment,
item_id int unsigned, -- Relates to items.item_id
profile_id int unsigned, -- Relates to profiles.profile_id
post_date DATETIME
);
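-- Four posts, matching the result tables below: Bob on items 1 and 2, Mark on item 2, Nancy on item 3
insert into posts (item_id, profile_id, post_date) values(1, 1, NOW());
insert into posts (item_id, profile_id, post_date) values(2, 1, NOW());
insert into posts (item_id, profile_id, post_date) values(2, 2, NOW());
insert into posts (item_id, profile_id, post_date) values(3, 3, NOW());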
I am using the following query, which produces almost the right results:
SELECT
`items`.`item_id`,
`items`.`title`,
`profiles`.`profile_id`,
`profiles`.`profile_name`,
`posts`.`post_id`,
`posts`.`post_date`
FROM `items`
CROSS JOIN `profiles`
LEFT JOIN `posts` ON `items`.`item_id` = `posts`.`item_id`
AND `posts`.`profile_id` = `profiles`.`profile_id`;
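For my particular application, this is suboptimal. I get a lot of 'extra' rows that my particular implementation does not need. The final result looks something like this:
+------------+------------+---------+-----------+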
| Item Name | Profile ID | Post ID | Post Date |
+------------+------------+---------+-----------+
| Item One | 1 | 1 | 2015-... | -- Bob Posted this
| Item One | 2 | NULL | NULL | -- No one else did
| Item One | 3 | NULL | NULL |
| Item Two | 1 | 2 | 2015-... | -- Bob posted this
| Item Two | 2 | 3 | 2015-... | -- So did mark
| Item Two | 3 | NULL | NULL | -- Nancy didn't
| Item Three | 1 | NULL | NULL |
| Item Three | 2 | NULL | NULL |
| Item Three | 3 | 4 | 2015-... | -- Only nancy posted #3
| Item Four | 1 | NULL | NULL | -- No one posted #4
| Item Four | 2 | NULL | NULL |
| Item Four | 3 | NULL | NULL |
| Item Five | 1 | NULL | NULL | -- No one posted #5
| Item Five | 2 | NULL | NULL |
| Item Five | 3 | NULL | NULL |
+------------+------------+---------+-----------+
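This does exactly what I asked of it - every item is returned three times (once per profile). However, it would be ideal if items 4 and 5, which have no links, were returned only once with a NULL profile_id, like this:
+------------+------------+---------+-----------+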
| Item Name | Profile ID | Post ID | Post Date |
+------------+------------+---------+-----------+
| Item One | 1 | 1 | 2015-... | -- Bob Posted this
| Item One | 2 | NULL | NULL | -- No one else did
| Item One | 3 | NULL | NULL |
| Item Two | 1 | 2 | 2015-... | -- Bob posted this
| Item Two | 2 | 3 | 2015-... | -- So did mark
| Item Two | 3 | NULL | NULL | -- Nancy didn't
| Item Three | 1 | NULL | NULL |
| Item Three | 2 | NULL | NULL |
| Item Three | 3 | 4 | 2015-... | -- Nancy posted #3
| Item Four | NULL | NULL | NULL | -- No one posted #4 and #5
| Item Five | NULL | NULL | NULL | -- Only need #4 and #5 once
+------------+------------+---------+-----------+
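While in this example it only means 4 fewer rows, my actual application has a great many items but not many profiles or posts. So this small change could significantly reduce the processing done in the server-side language.
Can anyone point me in the right direction for restricting the cross join to only the places where I have some type of link?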
SELECT `items`.`item_id`,
`items`.`title`,
`profiles`.`profile_id`,
`profiles`.`profile_name`,
`posts`.`post_id`,
`posts`.`post_date`
FROM `items`
LEFT JOIN
`profiles`
ON EXISTS
(
SELECT NULL
FROM `posts`
WHERE `posts`.`item_id` = `items`.`item_id`
)
LEFT JOIN
`posts`
ON `items`.`item_id` = `posts`.`item_id`
AND `posts`.`profile_id` = `profiles`.`profile_id`
ORDER BY
`items`.`item_id`, `profiles`.`profile_id`
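The query above replaces the CROSS JOIN with a LEFT JOIN whose ON condition is an EXISTS check, so profiles are attached only to items that have at least one post. As a rough way to check the row count, here is a minimal sketch using Python's standard-library sqlite3 module instead of MySQL. That substitution is an assumption of this sketch: the schema is simplified to SQLite types, post_date is omitted, and the sample posts are the four shown in the desired result table.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (item_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE profiles (profile_id INTEGER PRIMARY KEY, profile_name TEXT);
CREATE TABLE posts (
    post_id INTEGER PRIMARY KEY AUTOINCREMENT,
    item_id INTEGER,     -- relates to items.item_id
    profile_id INTEGER   -- relates to profiles.profile_id
);
INSERT INTO items VALUES
    (1, 'Item One'), (2, 'Item Two'), (3, 'Item Three'),
    (4, 'Item Four'), (5, 'Item Five');
INSERT INTO profiles VALUES (1, 'Bob'), (2, 'Mark'), (3, 'Nancy');
INSERT INTO posts (item_id, profile_id) VALUES (1, 1), (2, 1), (2, 2), (3, 3);
""")

# Same shape as the query above: profiles are joined only when the item
# has at least one post; otherwise the item appears once with NULLs.
rows = conn.execute("""
SELECT items.title, profiles.profile_id, posts.post_id
FROM items
LEFT JOIN profiles
       ON EXISTS (SELECT 1 FROM posts WHERE posts.item_id = items.item_id)
LEFT JOIN posts
       ON posts.item_id = items.item_id
      AND posts.profile_id = profiles.profile_id
ORDER BY items.item_id, profiles.profile_id
""").fetchall()

for row in rows:
    print(row)
```

With this data, items 1 to 3 should each appear once per profile, while items 4 and 5 should appear once each with NULL (None in Python) for the profile and post columns.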