SQL Query to Identify Missing Records (NOT IN subquery)

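I'm trying to write a query that returns every ID (a value shared between Document Types 6553 and 6554) for print jobs that were generated for Document Type 6553 but not for 6554. These two are meant to always be generated together.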

I tried the following, which takes quite a long time to run and produces counts that seem too high to me:

select ID from PrintQueueShadow
where DocumentType = '6553'
and CreateDate > getdate() - 7  --No more than 7 days old
and ID not in
(select ID from PrintQueueShadow
where CreateDate > getdate() - 7
and DocumentType = '6554')  --Checking against every ID for Doc Type 6554s sent in the last 7 days

Any help would be appreciated. Thanks!

Your logic looks correct, but we can rewrite it using EXISTS logic to improve on what you have. We can also suggest an index that might speed up the query.

SELECT pqs1.ID
FROM PrintQueueShadow pqs1
WHERE pqs1.DocumentType = '6553' AND pqs1.CreateDate > GETDATE() - 7 AND
      NOT EXISTS (
          SELECT 1
          FROM PrintQueueShadow pqs2
          WHERE pqs2.ID = pqs1.ID AND
                pqs2.CreateDate > GETDATE() - 7 AND
                pqs2.DocumentType = '6554'
      );

Using EXISTS might perform better than WHERE ... IN (...), because EXISTS lets the database stop searching as soon as it finds a matching record in the subquery.
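Whether NOT EXISTS actually beats NOT IN depends on your data and on the plan the optimizer picks, so it is worth measuring. The following is a minimal sketch, assuming SQL Server (which the use of GETDATE() suggests): it turns on I/O and timing statistics, runs both forms, and lets you compare logical reads and elapsed time in the Messages output.

```sql
-- Report logical reads and CPU/elapsed time for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Form 1: the original NOT IN query.
SELECT ID
FROM PrintQueueShadow
WHERE DocumentType = '6553'
  AND CreateDate > GETDATE() - 7
  AND ID NOT IN (
      SELECT ID
      FROM PrintQueueShadow
      WHERE CreateDate > GETDATE() - 7
        AND DocumentType = '6554'
  );

-- Form 2: the NOT EXISTS rewrite.
SELECT pqs1.ID
FROM PrintQueueShadow pqs1
WHERE pqs1.DocumentType = '6553'
  AND pqs1.CreateDate > GETDATE() - 7
  AND NOT EXISTS (
      SELECT 1
      FROM PrintQueueShadow pqs2
      WHERE pqs2.ID = pqs1.ID
        AND pqs2.CreateDate > GETDATE() - 7
        AND pqs2.DocumentType = '6554'
  );

-- Switch the extra reporting back off.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

One behavioral difference to keep in mind: if the subquery can return a NULL ID, NOT IN yields no rows at all, whereas NOT EXISTS is unaffected. If ID is nullable in your table, the two forms are not strictly interchangeable.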

The NOT EXISTS query above might benefit from the following index:

CREATE INDEX idx ON PrintQueueShadow (ID, DocumentType, CreateDate);

You can also try permuting the order of the three columns in this index.
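As a sketch of what permuting the columns could look like, here are two alternative orderings. The index names are hypothetical, and which ordering helps (if any) depends on your data distribution, so treat these as candidates to test rather than recommendations. A common rule of thumb is to lead with the column filtered by equality (DocumentType) before the one filtered by range (CreateDate).

```sql
-- Hypothetical index names; only the column order differs between them.

-- Equality filter first, then the correlation column, then the range filter.
CREATE INDEX idx_pqs_doctype_id_date
    ON PrintQueueShadow (DocumentType, ID, CreateDate);

-- Equality filter first, then the range filter, with ID last so the
-- index still covers the query.
CREATE INDEX idx_pqs_doctype_date_id
    ON PrintQueueShadow (DocumentType, CreateDate, ID);

-- After comparing execution plans, drop whichever index goes unused
-- (SQL Server syntax), for example:
-- DROP INDEX idx_pqs_doctype_date_id ON PrintQueueShadow;
```

Each extra index adds write overhead on PrintQueueShadow, so it is usually best to keep only the one the optimizer actually chooses.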

Alternatively, you can try a self-join:

SELECT DISTINCT PQS_1.ID 
FROM PrintQueueShadow AS PQS_1
LEFT JOIN PrintQueueShadow AS PQS_2
     ON  PQS_1.ID = PQS_2.ID
     AND PQS_2.CreateDate > GETDATE() - 7
     AND PQS_2.DocumentType = '6554' --Checking against every ID for Doc Type 6554s sent in the last 7 days
WHERE PQS_1.DocumentType = '6553'
  AND PQS_1.CreateDate > GETDATE() - 7  --No more than 7 days old
  AND PQS_2.ID IS NULL -- Keep only IDs with no matching Doc Type 6554 row

I also added DISTINCT in case there are duplicate IDs within a given DocumentType.
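To check whether such duplicates exist at all, and whether they might explain counts that look too high, a quick diagnostic along these lines may help. This is a sketch assuming the same table and seven-day window as the question; the RowsPerID alias is just an illustrative name.

```sql
-- List every (ID, DocumentType) pair that appears more than once
-- in the last 7 days for the two document types of interest.
SELECT ID,
       DocumentType,
       COUNT(*) AS RowsPerID   -- illustrative alias
FROM PrintQueueShadow
WHERE CreateDate > GETDATE() - 7
  AND DocumentType IN ('6553', '6554')
GROUP BY ID, DocumentType
HAVING COUNT(*) > 1
ORDER BY RowsPerID DESC;
```

If this returns rows for DocumentType 6553, a query without DISTINCT will list those IDs once per duplicate row, which inflates the result.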