SQL optimization with anti join

Question

I have a recursive category table and a company table, with the following fields:

category(id, name, parent) // parent is foreign key to category id :)
company(id, category_1, category_2, category_3) // category_* is foreign key to category id

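The category tree has a maximum depth of 3:

category cx -> category cy -> category cz

Knowing that a company's categories always link to the last level of the tree (cz), I want all the categories a company is linked to (c1z, c2z, c3z, c1y, c2y, c3y, c1x, c2x, c3x) for my search engine. // c1y is the parent of category_1, c1x is the parent of c1y, and so on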

The best query I have come up with is:

SELECT
  ID,
  NAME
FROM category
WHERE ID IN (
  select category_1 from company where id=:companyId
  union
  select category_2 from company where id=:companyId
  union
  select category_3 from company where id=:companyId
  union
  select parent from category where id in (
    select category_1 from company where id=:companyId
    union
    select category_2 from company where id=:companyId
    union
    select category_3 from company where id=:companyId
  )
  union
  select parent from category where id in (
    select parent from category where id in (
      select category_1 from company where id=:companyId
      union
      select category_2 from company where id=:companyId
      union
      select category_3 from company where id=:companyId
    )
  )
);

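There is a lot of duplication in it: once because of the category_* columns in company, and again because the same subquery is repeated for every level of the tree.

Is there any way to remove all this duplication?

-- Update --

Suppose we use two tables to solve the category_* fields problem. What about the recursion over the three category levels?

For example, if there were only one category column, the query would look like this: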

SELECT
  ID,
  NAME
FROM category
WHERE ID IN (
  select category_1 from company where id=:companyId
  union
  select parent from category where id in (
    select category_1 from company where id=:companyId
  )
  union
  select parent from category where id in (
    select parent from category where id in (
      select category_1 from company where id=:companyId
    )
  )
);

Answer 1

If you want to join the data, use something like this (SQL Server example):

DECLARE @category TABLE (id INT IDENTITY(1,1), name VARCHAR(30), parent INT) -- parent is foreign key to category id :)
DECLARE @company TABLE (id INT IDENTITY(1,1), category_1 INT, category_2 INT, category_3 INT) --category_* is foreign key to category->id


INSERT INTO @category (name, parent )
VALUES('Top category', null), ('Cars', 1)

INSERT INTO @company (category_1, category_2 , category_3 )
VALUES(2, null, null), (2, 2, null), (2, 2, 2)


SELECT t1.*, t2.*
FROM @category AS t1 INNER JOIN @company AS t2 ON t1.id = t2.category_1 or t1.id = t2.category_2  or t1.id = t2.category_3

The code above produces:

id  name  parent  id  category_1  category_2  category_3
2   Cars  1       1   2           NULL        NULL
2   Cars  1       2   2           2           NULL
2   Cars  1       3   2           2           2

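However, this database structure is wrong!

Instead of a single table

company(id, category_1, category_2, category_3)

create two tables:

company(id, name)
comp_cat(id, comp_id, cat_id)

Why? I don't want to answer directly, so let me ask you instead: 1) What happens when a company is related to more than 3 categories? 2) If the second and third categories are not set, why store NULLs?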
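As a minimal sketch of the two-table layout above, the link table could be declared and queried roughly like this in standard SQL. The column types, the sample rows, and the company name 'Acme' are assumptions made for illustration only; they are not part of the original schema.

```sql
-- Hypothetical DDL: only the table and column names come from the answer.
CREATE TABLE category (
    id     INTEGER PRIMARY KEY,
    name   VARCHAR(30) NOT NULL,
    parent INTEGER REFERENCES category (id)   -- NULL for a root category
);

CREATE TABLE company (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(30) NOT NULL
);

-- One row per (company, category) link: no NULL placeholders
-- and no fixed limit of three categories per company.
CREATE TABLE comp_cat (
    id      INTEGER PRIMARY KEY,
    comp_id INTEGER NOT NULL REFERENCES company (id),
    cat_id  INTEGER NOT NULL REFERENCES category (id)
);

-- Made-up sample data.
INSERT INTO category (id, name, parent) VALUES (1, 'Top category', NULL);
INSERT INTO category (id, name, parent) VALUES (2, 'Cars', 1);
INSERT INTO company (id, name) VALUES (1, 'Acme');
INSERT INTO comp_cat (id, comp_id, cat_id) VALUES (1, 1, 2);

-- Categories directly linked to company 1.
SELECT c.id, c.name
FROM comp_cat cc
JOIN category c ON c.id = cc.cat_id
WHERE cc.comp_id = 1;
```

With this layout a fourth category is simply another row in comp_cat, and a company with a single category stores exactly one link row.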

If you are on SQL Server, you can use Common Table Expressions:

;WITH CTE AS
(
    SELECT id, category_1 AS cat_id
    FROM @company 
    WHERE NOT category_1 IS NULL
    UNION ALL
    SELECT id, category_2 AS cat_id
    FROM @company 
    WHERE NOT category_2 IS NULL
    UNION ALL
    SELECT id, category_3 AS cat_id
    FROM @company 
    WHERE NOT category_3 IS NULL
)
SELECT DISTINCT t1.*, t2.*
FROM CTE AS t1 INNER JOIN @category AS t2 ON t1.cat_id = t2.id

Cheers, Maciej

Answer 2

I used common table expressions for the query. This is the final query I came up with:

with cte as (
  select category_1 as id from company where id=:companyId and category_1 is not null
  union
  select category_2 as id from company where id=:companyId and category_2 is not null
  union
  select category_3 as id from company where id=:companyId and category_3 is not null
)
select id, name from category where id in (
  select id from cte
  union
  select parent from category where id in (select id from cte)
  union
  select parent from category where id in (
    select parent from category where id in (select id from cte)
  )
);

This is the best I could come up with. Thanks to @Maciej for pointing me in the right direction, and to @Nicholai for the information about DBMS support.
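If your DBMS supports recursive CTEs, the nested parent lookups could probably be replaced by a single recursive step. The following is only a sketch in PostgreSQL/SQLite syntax, not the query used above: the sample rows and the CTE name `linked` are made up for illustration, and the literal `1` stands in for `:companyId`. SQL Server and Oracle omit the `RECURSIVE` keyword and require `UNION ALL` inside a recursive CTE, so there the duplicates would have to be removed in the final SELECT (for example with DISTINCT).

```sql
-- Hypothetical sample data using the question's table layout.
CREATE TABLE category (
    id     INTEGER PRIMARY KEY,
    name   VARCHAR(30),
    parent INTEGER REFERENCES category (id)
);
CREATE TABLE company (
    id         INTEGER PRIMARY KEY,
    category_1 INTEGER REFERENCES category (id),
    category_2 INTEGER REFERENCES category (id),
    category_3 INTEGER REFERENCES category (id)
);
INSERT INTO category (id, name, parent) VALUES (1, 'Vehicles', NULL);
INSERT INTO category (id, name, parent) VALUES (2, 'Cars', 1);
INSERT INTO category (id, name, parent) VALUES (3, 'Sports cars', 2);
INSERT INTO category (id, name, parent) VALUES (4, 'Vans', 2);
INSERT INTO company (id, category_1, category_2, category_3)
VALUES (1, 3, 4, NULL);

WITH RECURSIVE linked (id, name, parent) AS (
    -- Anchor: categories the company references directly.
    -- A NULL category_* column never matches, so no IS NOT NULL filter is needed.
    SELECT c.id, c.name, c.parent
    FROM company co
    JOIN category c
      ON c.id IN (co.category_1, co.category_2, co.category_3)
    WHERE co.id = 1              -- stands in for :companyId
    UNION                        -- UNION (not UNION ALL) drops shared ancestors
    -- Recursive step: climb one level to the parent of each row found so far.
    SELECT p.id, p.name, p.parent
    FROM linked l
    JOIN category p ON p.id = l.parent
)
SELECT id, name
FROM linked
ORDER BY id;
```

For the sample company this returns categories 1 to 4: the two leaf categories plus their shared parent and grandparent. The recursion stops on its own once a row with a NULL parent is reached, so the depth of 3 does not have to be spelled out.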

If only there were a way to transpose rows into columns the way MATLAB does... :P
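For this particular query the useful direction is turning the three category_* columns into rows (often called an unpivot). One portable way to sketch that, without a vendor-specific UNPIVOT operator, is to cross join the table with a three-row list of slot numbers and pick the matching column with CASE. The alias names `slot` and `unpivoted` and the sample row are assumptions for illustration; on Oracle the inline number list would additionally need `FROM dual`.

```sql
-- Hypothetical sample row using the question's company layout.
CREATE TABLE company (
    id         INTEGER PRIMARY KEY,
    category_1 INTEGER,
    category_2 INTEGER,
    category_3 INTEGER
);
INSERT INTO company (id, category_1, category_2, category_3)
VALUES (1, 3, 4, NULL);

SELECT cat_id
FROM (
    -- Each company row is repeated three times, once per slot number,
    -- and CASE selects the column that belongs to that slot.
    SELECT co.id AS company_id,
           CASE slot.n
               WHEN 1 THEN co.category_1
               WHEN 2 THEN co.category_2
               ELSE co.category_3
           END AS cat_id
    FROM company co
    CROSS JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) slot
) unpivoted
WHERE company_id = 1           -- stands in for :companyId
  AND cat_id IS NOT NULL       -- skip unused category slots
ORDER BY cat_id;
```

The result has one row per non-NULL category of the company, which is the same set the three-branch UNION in the CTE produces.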


sql

anti-join

query-optimization