SQL Server - Substitute for OR when JOINing on non-indexed field

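Background:

One of our clients has a system in which they record phone conversations between agents and the customers with whom they have various contracts. The recordings are stored on a server, and their locations are saved in the Recording table in the database. An agent can then "attach" a recording to a contract, which creates an entry in the ContractRecording table. I need to create a report showing which recordings have not been attached to a contract, but the way the tables are designed makes this harder than expected.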

Recording
------------------------------
recordingId : INT PK, IDENTITY
agentId : INT, FK
filename : NVARCHAR(255)


ContractRecording
----------------------------------
recordingId : INT PK, IDENTITY
contractNumber : INT
created : DATETIME
username : NVARCHAR(20)
note : NVARCHAR(MAX)
fileLocation : NVARCHAR(max)

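This would be easy if ContractRecording.recordingId were a foreign key reference to Recording.recordingId, but it isn't. It is its own identity key. The only link between the tables is the file location, but Recording.filename stores only the file name, while ContractRecording.fileLocation stores the full path. Yes, I know, but I didn't design these tables. Fortunately, there is a pattern: the full path is derived from the agent's name and the date of the recording, both of which we can work out from the data linked to the Recording table. Of course, there is another problem: the file path format changed about a year ago, so some recordings are stored in the old format and some in the new one.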

Old format: C:\John-Recordings15-0811.wav

New format: C:\Recordings\John Smith15-0811.wav
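
To make the two patterns concrete, here is a minimal sketch that applies the same path expressions used in the queries further down to a single set of sample values. The variable names, the timestamp and the file name `example.wav` are hypothetical; only the expressions themselves come from the question.

```sql
-- Hypothetical sample values; the path expressions mirror the question's query.
DECLARE @name       NVARCHAR(100) = N'John Smith';
DECLARE @recordtime DATETIME      = '2015-08-11T09:30:00';
DECLARE @filename   NVARCHAR(255) = N'example.wav';

SELECT
    -- New format: C:\Recordings\<full name>\<year>\<month>\<file>
    'C:\Recordings\' + @name + '\'
        + CONVERT(NVARCHAR(4), DATEPART(yyyy, @recordtime)) + '\'
        + CONVERT(NVARCHAR(2), DATEPART(m, @recordtime)) + '\'
        + @filename AS fullpathnew,   -- C:\Recordings\John Smith\2015\8\example.wav
    -- Old format: C:\<first name>-Recordings\<year>\<month>\<file>
    'C:\' + SUBSTRING(@name, 0, CHARINDEX(' ', @name, 0)) + '-Recordings\'
        + CONVERT(NVARCHAR(4), DATEPART(yyyy, @recordtime)) + '\'
        + CONVERT(NVARCHAR(2), DATEPART(m, @recordtime)) + '\'
        + @filename AS fullpathold;   -- C:\John-Recordings\2015\8\example.wav
```

Two details follow from the expressions: the month is not zero-padded, and SUBSTRING(name, 0, CHARINDEX(' ', name, 0)) yields an empty string when the agent's name contains no space.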

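Problem:

To link the two tables, I have to join them on the full path of the recording, which has to be built manually from the Recording table and can take either of two formats. I first tried using OR in the JOIN clause, but that took about 8 minutes to return roughly 15k rows, which is not acceptable. I then tried two LEFT OUTER JOINs, one for each condition, but that took ten minutes to pull what appeared to be the same data. I assume that is because I am joining on a computed field that is not indexed. Splitting it into two SELECTs and using UNION produces duplicate rows, because each query returns one row for every recording. Do I have any other options for getting this query down to a few seconds at most? Here is my original query using the OR clause.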

SELECT * FROM
    (SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
        ,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathold OR cr.filelocation = rec.fullpathnew) main
ORDER BY main.name, main.recordtime

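The report needs to show one row for every record in the Recording table (unless a recording is attached to more than one contract, in which case it should show one row for each pairing), with the data from ContractRecording shown if any row matches either of the file location formats.

If absolutely necessary, I am not opposed to pulling all the data from the tables and linking it in code, but that would be a last resort.

Update:

As requested, here is the UNION version of the query for analysis. As mentioned, it returns two rows for each pairing: one with data and one without. That is because at least one of the two JOINs will always have no match, but I only want to ignore those rows when the other JOIN does have a match. If neither JOIN matches, I also want to show the recording only once. I was less confident that UNION could give me the result I wanted than the other possibilities, so I did not pursue that approach.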

SELECT * FROM
    ((SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathold)
    UNION
    (SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathnew)) main
ORDER BY main.name, main.recordtime
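
The duplicate rows described in the update can be reproduced on a tiny data set. The following is only an illustrative sketch: the table variables @rec and @cr and their sample paths are made up, and stand in for the derived recording rows and ContractRecording.

```sql
-- Hypothetical stand-ins for the derived "rec" rows and ContractRecording.
DECLARE @rec TABLE (recordingid INT, fullpathold NVARCHAR(100), fullpathnew NVARCHAR(100));
DECLARE @cr  TABLE (recordingid INT, filelocation NVARCHAR(100));

INSERT INTO @rec VALUES (1, N'C:\old\a.wav', N'C:\new\a.wav');
INSERT INTO @cr  VALUES (10, N'C:\new\a.wav');   -- attached using the new format

-- Each branch keeps every recording, matched or not, so the branch whose
-- format does not match contributes a NULL-padded row for the same recording.
SELECT cr.recordingid AS attachedrecordingid, rec.recordingid AS rawrecordingid
FROM @rec rec
LEFT JOIN @cr cr ON cr.filelocation = rec.fullpathold
UNION
SELECT cr.recordingid, rec.recordingid
FROM @rec rec
LEFT JOIN @cr cr ON cr.filelocation = rec.fullpathnew;
-- Returns two rows for recording 1: (NULL, 1) and (10, 1)
```

UNION only removes rows that are identical in every column, so the NULL-padded row survives next to the matched one. Using inner joins inside the UNION and outer joining the combined result back to the recordings avoids this.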

You could try using LIKE

WITH AgentRecordings AS
(
    SELECT  
        a.name,
        r.recordingId AS rawrecordingid,
        r.filename,
        r.recordtime,
        CONCAT(
            'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename,
            'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
        ) AS filepaths
    FROM
        Agents a
        JOIN Recording r ON a.agentId = r.agentId
)
SELECT
    cr.recordingid AS "attachedrecordingid"
    ,rec.rawrecordingid
    ,cr.contractnumber
    ,cr.created
    ,rec.name
    ,cr.note
    ,cr.filelocation
    ,rec.filename
    ,rec.recordtime
FROM
    AgentRecordings rec
    LEFT JOIN ContractRecording cr ON rec.filepaths LIKE '%' + cr.filelocation + '%'

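See if this helps. I would also try creating a temp table instead of using the CTE to see whether that helps more.

You could also try splitting the two OR conditions into two CTEs and using a UNION to combine the record IDs that are found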
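
As a rough sketch of the temp table idea: the computed paths are materialized once, indexed, and then matched with plain equality joins. The temp table name, the index names and the NVARCHAR(450) casts are assumptions rather than part of the original schema; the cast keeps the columns within the index key size limit and would silently truncate any path longer than 450 characters.

```sql
-- Sketch only: #AgentRecordings, the index names and NVARCHAR(450) are assumptions.
IF OBJECT_ID('tempdb..#AgentRecordings') IS NOT NULL
    DROP TABLE #AgentRecordings;

-- Build both candidate paths once per recording.
SELECT
    r.recordingId AS rawrecordingid,
    a.name,
    r.filename,
    r.recordtime,
    CAST('C:\Recordings\' + a.name + '\'
         + CONVERT(NVARCHAR(4), DATEPART(yyyy, r.recordtime)) + '\'
         + CONVERT(NVARCHAR(2), DATEPART(m, r.recordtime)) + '\'
         + r.filename AS NVARCHAR(450)) AS fullpathnew,
    CAST('C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\'
         + CONVERT(NVARCHAR(4), DATEPART(yyyy, r.recordtime)) + '\'
         + CONVERT(NVARCHAR(2), DATEPART(m, r.recordtime)) + '\'
         + r.filename AS NVARCHAR(450)) AS fullpathold
INTO #AgentRecordings
FROM Agents a
JOIN Recording r ON a.agentId = r.agentId;

CREATE INDEX IX_AgentRecordings_new ON #AgentRecordings (fullpathnew);
CREATE INDEX IX_AgentRecordings_old ON #AgentRecordings (fullpathold);

-- Inner joins find the matches per format; the outer join keeps every recording.
SELECT
    m.attachedrecordingid,
    rec.rawrecordingid,
    m.contractnumber,
    m.created,
    rec.name,
    m.note,
    m.filelocation,
    rec.filename,
    rec.recordtime
FROM #AgentRecordings rec
LEFT JOIN (
    SELECT n.rawrecordingid, cr.recordingid AS attachedrecordingid,
           cr.contractnumber, cr.created, cr.note, cr.filelocation
    FROM #AgentRecordings n
    JOIN ContractRecording cr ON cr.filelocation = n.fullpathnew
    UNION
    SELECT o.rawrecordingid, cr.recordingid,
           cr.contractnumber, cr.created, cr.note, cr.filelocation
    FROM #AgentRecordings o
    JOIN ContractRecording cr ON cr.filelocation = o.fullpathold
) m ON m.rawrecordingid = rec.rawrecordingid
ORDER BY rec.name, rec.recordtime;
```

Whether the optimizer actually uses the indexes depends on the plan it picks, particularly since ContractRecording.fileLocation is NVARCHAR(MAX), so treat this as something to measure rather than a guaranteed improvement.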

-- Matches against the new path format
WITH fullpathnew AS
(
    SELECT  cr.recordingid AS "attachedrecordingid",
            r.recordingid AS "rawrecordingid",
            cr.contractnumber,
            cr.created,
            cr.note,
            cr.filelocation
    FROM    Agents a
            JOIN Recording r ON a.agentId = r.agentId
            JOIN ContractRecording cr ON cr.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
),
-- Matches against the old path format
fullpathold AS
(
    SELECT  cr.recordingid AS "attachedrecordingid",
            r.recordingid AS "rawrecordingid",
            cr.contractnumber,
            cr.created,
            cr.note,
            cr.filelocation
    FROM    Agents a
            JOIN Recording r ON a.agentId = r.agentId
            JOIN ContractRecording cr ON cr.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
),
-- All matches, regardless of format
combinedCtes AS
(
    SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathnew
    UNION SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathold
)
SELECT  cte.attachedrecordingid
        ,r.recordingid AS "rawrecordingid"
        ,cte.contractnumber
        ,cte.created
        ,a.name
        ,cte.note
        ,cte.filelocation
        ,r.filename
        ,r.recordtime
FROM    Agents a
        JOIN Recording r ON r.agentId = a.agentId
        LEFT JOIN combinedCtes cte ON r.recordingid = cte.rawrecordingid

Your UNION needs to be inside a subquery; you can then LEFT JOIN to that subquery

SELECT  j.attachedrecordingid
        ,r.recordingid AS rawrecordingid
        ,j.contractnumber
        ,j.created
        ,a.NAME
        ,j.note
        ,j.filelocation
        ,r.filename      
        ,r.recordtime
FROM    Agents a
        JOIN Recording r ON a.agentId = r.agentId
        LEFT JOIN (
            -- Matches against the old path format
            SELECT  cr.recordingid AS "attachedrecordingid"
                    ,r.recordingid AS "rawrecordingid"
                    ,cr.contractnumber
                    ,cr.created
                    ,cr.note
                    ,cr.filelocation
            FROM    Agents a
                    JOIN Recording r ON a.agentId = r.agentId
                    JOIN ContractRecording cr
                        ON cr.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
            UNION
            -- Matches against the new path format
            SELECT  cr.recordingid AS "attachedrecordingid"
                    ,r.recordingid AS "rawrecordingid"
                    ,cr.contractnumber
                    ,cr.created
                    ,cr.note
                    ,cr.filelocation
            FROM    Agents a
                    JOIN Recording r ON a.agentId = r.agentId
                    JOIN ContractRecording cr
                        ON cr.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
        ) j ON r.recordingId = j.rawrecordingid
ORDER BY a.name, r.recordtime