SQL optimization with anti join

I have a recursive category table and a company table, with the following fields:

category(id, name, parent) // parent is foreign key to category id :)
company(id, category_1, category_2, category_3) // category_* is foreign key to category id

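The category tree has a maximum depth of 3:

category cx -> category cy -> category cz

Given that a company's categories always link to the deepest level (c3, i.e. cz), I want all the categories the company is linked to (c1z, c2z, c3z, c1y, c2y, c3y, c1x, c2x, c3x) for my search engine. // c1y is the parent of category_1, c1x is the parent of c1y, and so on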

The best query I have come up with is:

SELECT
  ID,
  NAME
FROM category c3
WHERE ID IN (
  select category_1 from company where id=:companyId
  union
  select category_2 from company where id=:companyId
  union
  select category_3 from company where id=:companyId
  union
  select parent from category where id in (
    select category_1 from company where id=:companyId
    union
    select category_2 from company where id=:companyId
    union
    select category_3 from company where id=:companyId
  )
  union
  select parent from category where id in (
    select parent from category where id in (
      select category_1 from company where id=:companyId
      union
      select category_2 from company where id=:companyId
      union
      select category_3 from company where id=:companyId
    )
  )
)

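There is a lot of duplication in it: once for the category_* columns in company, and again for the parent lookup, which is repeated several times.

Is there any way to remove all this duplication?

--UPDATE--

Suppose we use two tables to solve the category_* columns problem. What about the recursion over the 3 levels of categories?

For example, if there were only one category column, the query would look like this: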

SELECT
  ID,
  NAME
FROM category
WHERE ID IN (
  select category_1 from company where id=:companyId
  union
  select parent from category where id in (
    select category_1 from company where id=:companyId
  )
  union
  select parent from category where id in (
    select parent from category where id in (
      select category_1 from company where id=:companyId
    )
  )
);

If you want to join the data, use something like this (SQL Server example):

DECLARE @category TABLE (id INT IDENTITY(1,1), name VARCHAR(30), parent INT) -- parent is foreign key to category id :)
DECLARE @company TABLE (id INT IDENTITY(1,1), category_1 INT, category_2 INT, category_3 INT) --category_* is foreign key to category->id


INSERT INTO @category (name, parent )
VALUES('Top category', null), ('Cars', 1)

INSERT INTO @company (category_1, category_2 , category_3 )
VALUES(2, null, null), (2, 2, null), (2, 2, 2)


SELECT t1.*, t2.*
FROM @category AS t1 INNER JOIN @company AS t2 ON t1.id = t2.category_1 or t1.id = t2.category_2  or t1.id = t2.category_3 

The code above produces:

id  name  parent  id  category_1  category_2  category_3
2   Cars  1       1   2           NULL        NULL
2   Cars  1       2   2           2           NULL
2   Cars  1       3   2           2           2

However, this database structure is wrong!

Instead of one table

company(id, category_1, category_2, category_3)

create two tables

company(id, name)
comp_cat(id, comp_id, cat_id)

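Why? I would rather not answer directly, so let me ask you: 1) What happens when a company is related to more than 3 categories? 2) Why store NULLs when the second and third categories are not set?

If you are on SQL Server, you can use Common Table Expressions: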
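To make the two-table design concrete, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module, which is an assumption made only so the example is runnable anywhere; the thread itself targets other DBMSs. The sample rows ('Trucks', 'Acme', 'Globex') and the helper name categories_of are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        parent INTEGER REFERENCES category(id)
    );
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    -- One row per (company, category) link: no fixed limit of 3,
    -- and no NULL placeholders for unused slots.
    CREATE TABLE comp_cat (
        id      INTEGER PRIMARY KEY,
        comp_id INTEGER NOT NULL REFERENCES company(id),
        cat_id  INTEGER NOT NULL REFERENCES category(id)
    );
    -- Hypothetical sample data.
    INSERT INTO category (id, name, parent)
        VALUES (1, 'Top category', NULL), (2, 'Cars', 1), (3, 'Trucks', 1);
    INSERT INTO company (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO comp_cat (comp_id, cat_id) VALUES (1, 2), (1, 3), (2, 2);
""")


def categories_of(company_id):
    """Return (id, name) for every category linked to one company."""
    rows = conn.execute(
        """
        SELECT c.id, c.name
        FROM comp_cat AS cc
        JOIN category AS c ON c.id = cc.cat_id
        WHERE cc.comp_id = :companyId
        ORDER BY c.id
        """,
        {"companyId": company_id},
    )
    return rows.fetchall()


print(categories_of(1))
```

With this layout the three-way union over category_1, category_2 and category_3 collapses into a single join, and a company can carry any number of categories.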

;WITH CTE AS
(
    SELECT id, category_1 AS cat_id
    FROM @company 
    WHERE NOT category_1 IS NULL
    UNION ALL
    SELECT id, category_2 AS cat_id
    FROM @company 
    WHERE NOT category_2 IS NULL
    UNION ALL
    SELECT id, category_3 AS cat_id
    FROM @company 
    WHERE NOT category_3 IS NULL
)
SELECT DISTINCT t1.*, t2.*
FROM CTE AS t1 INNER JOIN @category AS t2 ON t1.cat_id = t2.id 

Cheers, Maciej

I used common table expressions for the query. This is the final query I came up with:

with cte as (
  select category_1 as id from company where id=:companyId and category_1 is not null
  union
  select category_2 as id from company where id=:companyId and category_2 is not null
  union
  select category_3 as id from company where id=:companyId and category_3 is not null
)
select id, name FROM category WHERE id IN (
  select id from cte
  union
  select parent from category where id in (select id from cte)
  union
  select parent from category where id in (
    select parent from category where id in (select id from cte)
  )
);

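This is the best I could come up with. Thanks to @Maciej for pointing me in the right direction, and to @Nicholai for the information about DBMS support.

If only there were a way to transpose rows into columns like in MATLAB... :P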
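If the DBMS supports recursive common table expressions, the nested parent lookups can probably be folded into one recursive step. The sketch below is an assumption-laden illustration, not the query used above: it runs on SQLite through Python's built-in sqlite3 module, the sample categories are hypothetical, and the names leaf, chain and linked_categories are made up for this example. Syntax differs between systems (SQL Server, for instance, omits the RECURSIVE keyword and expects UNION ALL between the anchor and the recursive member), so treat it as a starting point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        parent INTEGER REFERENCES category(id)
    );
    CREATE TABLE company (
        id         INTEGER PRIMARY KEY,
        category_1 INTEGER,
        category_2 INTEGER,
        category_3 INTEGER
    );
    -- Hypothetical sample tree: Vehicles -> Cars -> Sedans,
    -- Vehicles -> Trucks -> Pickups, plus an unrelated root 'Boats'.
    INSERT INTO category (id, name, parent) VALUES
        (1, 'Vehicles', NULL), (2, 'Cars', 1), (3, 'Sedans', 2),
        (4, 'Trucks', 1), (5, 'Pickups', 4), (6, 'Boats', NULL);
    INSERT INTO company (id, category_1, category_2, category_3) VALUES
        (1, 3, 5, NULL), (2, 6, NULL, NULL);
""")

QUERY = """
WITH RECURSIVE
leaf(id) AS (
    -- The categories the company points at directly.
    SELECT category_1 FROM company WHERE id = :companyId AND category_1 IS NOT NULL
    UNION
    SELECT category_2 FROM company WHERE id = :companyId AND category_2 IS NOT NULL
    UNION
    SELECT category_3 FROM company WHERE id = :companyId AND category_3 IS NOT NULL
),
chain(id) AS (
    SELECT id FROM leaf
    UNION
    -- Walk one level up per iteration until a root (NULL parent) is reached.
    SELECT c.parent
    FROM category AS c
    JOIN chain ON c.id = chain.id
    WHERE c.parent IS NOT NULL
)
SELECT id, name FROM category WHERE id IN (SELECT id FROM chain) ORDER BY id
"""


def linked_categories(company_id):
    """Return (id, name) for the company's categories and all their ancestors."""
    return conn.execute(QUERY, {"companyId": company_id}).fetchall()


print(linked_categories(1))
```

In the sample data, company 1 is linked to Sedans and Pickups, so the query returns those two plus their ancestors Cars, Trucks and Vehicles, and leaves out Boats. Because the recursion stops at a NULL parent, the same query would also work if the tree grew deeper than 3 levels.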