What is a semi-join in a database?
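I am having trouble understanding the concept of a semi-join and how it differs from a conventional join. I have already read a few articles but was not satisfied with the explanations. Could someone please help me understand it?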
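As far as I understand, a semi-join is a left join or a right join:
What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN?
So the difference between a left (semi) join and a "conventional" join is that you only retrieve the data of the left table (for the rows where your join condition matches). With a full (outer) join (I think that is what you mean by a conventional join), you retrieve the data of both tables where your condition matches.
A simple example. Let's select the students who have grades, using a left outer join: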
SELECT DISTINCT s.id
FROM students s
LEFT JOIN grades g ON g.student_id = s.id
WHERE g.student_id IS NOT NULL
Now the same with a left semi-join:
SELECT s.id
FROM students s
WHERE EXISTS (SELECT 1 FROM grades g
              WHERE g.student_id = s.id)
The latter is generally more efficient (depending on the specific DBMS and its query optimizer).
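To make the comparison above concrete, here is a minimal sketch that runs both queries, plus a plain join, against an in-memory SQLite database. The sample rows and the name and grade columns are assumptions made up for illustration; only the students/grades table names and the two queries come from the answer. It shows that a plain join repeats a student once per grade, while the semi-join returns each matching student exactly once without needing DISTINCT.

```python
import sqlite3

# Hypothetical sample data: student 1 has two grades, student 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grades (student_id INTEGER, grade INTEGER);
    INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO grades VALUES (1, 90), (1, 75), (3, 60);
""")

# Plain join: one output row per matching (student, grade) pair.
joined = conn.execute("""
    SELECT s.id
    FROM students s
    JOIN grades g ON g.student_id = s.id
    ORDER BY s.id
""").fetchall()

# Left outer join, filtered and de-duplicated, as in the first query.
outer_distinct = conn.execute("""
    SELECT DISTINCT s.id
    FROM students s
    LEFT JOIN grades g ON g.student_id = s.id
    WHERE g.student_id IS NOT NULL
    ORDER BY s.id
""").fetchall()

# Semi-join written with EXISTS, as in the second query.
semi = conn.execute("""
    SELECT s.id
    FROM students s
    WHERE EXISTS (SELECT 1 FROM grades g
                  WHERE g.student_id = s.id)
    ORDER BY s.id
""").fetchall()

print(joined)          # [(1,), (1,), (3,)]
print(outer_distinct)  # [(1,), (3,)]
print(semi)            # [(1,), (3,)]
```

Whether the EXISTS form is actually faster is up to the optimizer; this sketch only checks that the results agree.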
As far as I know, the SQL dialects that support SEMIJOIN/ANTISEMI are U-SQL and Cloudera Impala.
Semijoins are U-SQL’s way to filter a rowset based on the inclusion of its rows in another rowset. Other SQL dialects express this with the SELECT * FROM A WHERE A.key IN (SELECT B.key FROM B) pattern.
For more information, see Semi Join and Anti Join Should Have Their Own Syntax in SQL:
“Semi” means that we don’t really join the right hand side, we only check if a join would yield results for any given tuple.
-- IN
SELECT *
FROM Employee
WHERE DeptName IN (
  SELECT DeptName
  FROM Dept
)
-- EXISTS
SELECT *
FROM Employee
WHERE EXISTS (
  SELECT 1
  FROM Dept
  WHERE Employee.DeptName = Dept.DeptName
)
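As a quick check of the two forms above, the following sketch runs both against an in-memory SQLite database, together with the anti-join counterpart written with NOT EXISTS. The Name and Manager columns and all sample rows are assumptions for illustration; only the Employee/Dept tables and the DeptName column appear in the quoted queries.

```python
import sqlite3

# Hypothetical sample data: 'Executive' has no row in Dept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Name TEXT, DeptName TEXT);
    CREATE TABLE Dept (DeptName TEXT, Manager TEXT);
    INSERT INTO Employee VALUES
        ('Harry', 'Finance'), ('Sally', 'Sales'), ('Tim', 'Executive');
    INSERT INTO Dept VALUES ('Sales', 'Harriet'), ('Finance', 'Charles');
""")

# Semi-join expressed with IN.
semi_in = conn.execute("""
    SELECT *
    FROM Employee
    WHERE DeptName IN (SELECT DeptName FROM Dept)
    ORDER BY Name
""").fetchall()

# Semi-join expressed with EXISTS.
semi_exists = conn.execute("""
    SELECT *
    FROM Employee
    WHERE EXISTS (SELECT 1 FROM Dept
                  WHERE Employee.DeptName = Dept.DeptName)
    ORDER BY Name
""").fetchall()

# Anti-join: employees whose department has no match in Dept.
anti = conn.execute("""
    SELECT *
    FROM Employee
    WHERE NOT EXISTS (SELECT 1 FROM Dept
                      WHERE Employee.DeptName = Dept.DeptName)
    ORDER BY Name
""").fetchall()

print(semi_in)      # [('Harry', 'Finance'), ('Sally', 'Sales')]
print(semi_exists)  # [('Harry', 'Finance'), ('Sally', 'Sales')]
print(anti)         # [('Tim', 'Executive')]
```

Note that every result row carries only the Employee columns; Dept is consulted for the existence check but never contributes columns, which is the "semi" part.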
Edit:
Another dialect that supports SEMI/ANTISEMI joins is KQL:
kind=leftsemi (or kind=rightsemi)
Returns all the records from the left side that have matches from the right. The result table contains columns from the left side only.
let t1 = datatable(key:long, value:string)
[1, "a",
2, "b",
3, "c"];
let t2 = datatable(key:long)
[1,3];
t1 | join kind=leftsemi (t2) on key
Output:
key value
1 a
3 c
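For readers without a Kusto cluster at hand, the leftsemi behaviour described above can be imitated in a few lines of Python. This is only a sketch of the semantics using the same sample data; left_semi_join is a hypothetical helper name, and nothing here reflects how Kusto actually executes the join.

```python
def left_semi_join(left, right, key):
    """Return the rows of `left` whose `key` value also occurs in `right`.

    Only columns from `left` are kept, and each left row appears at most once,
    no matter how many matching rows `right` contains.
    """
    right_keys = {row[key] for row in right}  # membership check only
    return [row for row in left if row[key] in right_keys]


# Same data as the t1 / t2 datatables in the KQL example.
t1 = [
    {"key": 1, "value": "a"},
    {"key": 2, "value": "b"},
    {"key": 3, "value": "c"},
]
t2 = [{"key": 1}, {"key": 3}]

result = left_semi_join(t1, t2, "key")
print(result)  # [{'key': 1, 'value': 'a'}, {'key': 3, 'value': 'c'}]
```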