SQL Server - Substitute for OR when JOINing on non-indexed field
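Background:
One of our clients has a system in which they record phone conversations between their agents and the customers with whom they hold various contracts. The recordings are stored on a server, and their locations are saved in the Recording table in the database. An agent can then "attach" a recording to a contract, which creates an entry in the ContractRecording table. I need to create a report showing which recordings have not been attached to a contract, but the way the tables are designed makes this harder than expected.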
Recording
------------------------------
recordingId : INT PK, IDENTITY
agentId : INT, FK
filename : NVARCHAR(255)
ContractRecording
----------------------------------
recordingId : INT PK, IDENTITY
contractNumber : INT
created : DATETIME
username : NVARCHAR(20)
note : NVARCHAR(MAX)
fileLocation : NVARCHAR(max)
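This would be easy if ContractRecording.recordingId were a foreign key reference to Recording.recordingId, but it is not. It is its own identity key. The only link between the tables is the file location, but Recording.filename stores only the file name, while ContractRecording.fileLocation stores the full path. Yes, I know, but I didn't design these tables. Fortunately there is a pattern: the full path is derived from the agent's name and the date of the recording, both of which are available from the data in the Recording table. Of course, there is another problem: the file path format changed about a year ago, so some recordings are stored in the old format and some in the new one.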
Old format: C:\John-Recordings15-0811.wav
New format: C:\Recordings\John Smith15-0811.wav
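To make the two patterns concrete, here is a small sketch that builds both candidate paths with the same expressions the queries in this thread use. The variable names and sample values (agent name, timestamp, file name) are hypothetical and only serve to illustrate the string construction.

```sql
-- Hypothetical sample values; none of these come from the real data.
DECLARE @name       NVARCHAR(100) = N'John Smith';
DECLARE @recordtime DATETIME      = '2015-08-11T09:30:00';
DECLARE @filename   NVARCHAR(255) = N'example.wav';

SELECT
    -- New format: full agent name as a folder under C:\Recordings
    N'C:\Recordings\' + @name + N'\'
        + CONVERT(NVARCHAR(4), DATEPART(yyyy, @recordtime)) + N'\'
        + CONVERT(NVARCHAR(2), DATEPART(m, @recordtime)) + N'\'
        + @filename AS fullpathnew,   -- C:\Recordings\John Smith\2015\8\example.wav
    -- Old format: first name only, followed by "-Recordings"
    N'C:\' + SUBSTRING(@name, 0, CHARINDEX(' ', @name, 0)) + N'-Recordings\'
        + CONVERT(NVARCHAR(4), DATEPART(yyyy, @recordtime)) + N'\'
        + CONVERT(NVARCHAR(2), DATEPART(m, @recordtime)) + N'\'
        + @filename AS fullpathold;   -- C:\John-Recordings\2015\8\example.wav
```

Note that DATEPART(m, ...) returns an integer, so the month folder is not zero-padded by these expressions.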
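Problem:
To link these two tables, I have to join them on the full path of the recording, which has to be built by hand from the Recording table and can be in either of two formats. I first tried using an OR in the JOIN clause, but that took about 8 minutes to return roughly 15k rows, which is not acceptable. I then tried two LEFT OUTER JOINs, one per condition, but that took ten minutes to pull what appeared to be the same data. I assume that is because I am joining on a computed field that is not indexed. Splitting it into two SELECTs and using UNION results in duplicate rows, since each query returns one row for every recording. Do I have any other options for getting this query down to a few seconds? Here is my original query using the OR clause.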
SELECT * FROM
(SELECT
cr.recordingid AS "attachedrecordingid"
,rec.recordingid AS "rawrecordingid"
,cr.contractnumber
,cr.created
,rec.name
,cr.note
,cr.filelocation
,rec.filename
,rec.recordtime
FROM ContractRecording cr
RIGHT OUTER JOIN
(SELECT
recordingid
,a.name
,filename
,retain
,r.recordtime
,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
FROM Recording r
JOIN Agents a
ON r.agentid = a.agentid) rec
ON cr.filelocation = rec.fullpathold OR cr.filelocation = rec.fullpathnew) main
ORDER BY main.name, main.recordtime
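The report needs to show one row for every recording in the Recording table (unless a recording is attached to more than one contract, in which case it should show one row per pairing), with the data from ContractRecording shown if any row matches either file location format.
If absolutely necessary, I am not opposed to pulling all the data from both tables and linking it in code, but that would be a last resort.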
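Update:
As requested, here is the UNION version of the query for analysis. As mentioned, it returns two rows for every pairing: one with data and one without. That is because at least one of the two JOINs will always have no match, but I want those unmatched rows ignored only when the other JOIN does have a match. If neither JOIN matches, I still want the recording shown exactly once. I was less confident that UNION could produce the result I want than the other options, so I did not pursue this approach.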
SELECT * FROM
((SELECT
cr.recordingid AS "attachedrecordingid"
,rec.recordingid AS "rawrecordingid"
,cr.contractnumber
,cr.created
,rec.name
,cr.note
,cr.filelocation
,rec.filename
,rec.recordtime
FROM ContractRecording cr
RIGHT OUTER JOIN
(SELECT
recordingid
,a.name
,filename
,retain
,r.recordtime
,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
FROM Recording r
JOIN Agents a
ON r.agentid = a.agentid) rec
ON cr.filelocation = rec.fullpathold)
UNION
(SELECT
cr.recordingid AS "attachedrecordingid"
,rec.recordingid AS "rawrecordingid"
,cr.contractnumber
,cr.created
,rec.name
,cr.note
,cr.filelocation
,rec.filename
,rec.recordtime
FROM ContractRecording cr
RIGHT OUTER JOIN
(SELECT
recordingid
,a.name
,filename
,retain
,r.recordtime
,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
FROM Recording r
JOIN Agents a
ON r.agentid = a.agentid) rec
ON cr.filelocation = rec.fullpathnew)) main
ORDER BY main.name, main.recordtime
You could try using LIKE:
WITH AgentRecordings AS
(
SELECT
a.name,
r.recordingId AS rawrecordingid,
r.filename,
r.recordtime,
CONCAT(
'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + FILENAME,
'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
) AS filepaths
FROM
Agents a
JOIN Recording r ON a.agentId = r.agentId
)
SELECT
cr.recordingid AS "attachedrecordingid"
,rec.rawrecordingid AS "rawrecordingid"
,cr.contractnumber
,cr.created
,rec.name
,cr.note
,cr.filelocation
,rec.filename
,rec.recordtime
FROM
AgentRecordings rec
LEFT JOIN ContractRecording cr ON rec.filepaths LIKE '%' + cr.filelocation + '%'
If this helps, I would also try creating a temp table instead of using a CTE to see whether that helps further.
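The temp-table idea could look roughly like the following. This is only a sketch that has not been run against the real schema: the table name #RecordingPaths, the index name, and the NVARCHAR(400) cap on path length are all assumptions (the cap is there because an NVARCHAR(MAX) column cannot be used as an index key). It materializes both candidate paths once per recording, so the join to ContractRecording becomes a single equality comparison instead of an OR over computed expressions.

```sql
-- Sketch only: #RecordingPaths and NVARCHAR(400) are assumed, not part of the original schema.
-- One row per (recording, candidate path): new format first, then old format.
-- CAST silently truncates paths longer than 400 characters, so widen it if needed.
SELECT r.recordingId,
       CAST(p.fullpath AS NVARCHAR(400)) AS fullpath
INTO #RecordingPaths
FROM Recording r
JOIN Agents a ON a.agentId = r.agentId
CROSS APPLY (VALUES
    (N'C:\Recordings\' + a.name + N'\'
        + CONVERT(NVARCHAR(4), DATEPART(yyyy, r.recordtime)) + N'\'
        + CONVERT(NVARCHAR(2), DATEPART(m, r.recordtime)) + N'\' + r.filename),
    (N'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + N'-Recordings\'
        + CONVERT(NVARCHAR(4), DATEPART(yyyy, r.recordtime)) + N'\'
        + CONVERT(NVARCHAR(2), DATEPART(m, r.recordtime)) + N'\' + r.filename)
) AS p(fullpath);

-- Index the materialized path so the optimizer has statistics and an ordered structure to work with.
CREATE CLUSTERED INDEX IX_RecordingPaths_fullpath ON #RecordingPaths (fullpath);

-- Every recording appears at least once; contract columns are NULL when nothing is attached.
SELECT m.attachedrecordingid
      ,r.recordingId AS rawrecordingid
      ,m.contractNumber
      ,m.created
      ,a.name
      ,m.note
      ,m.fileLocation
      ,r.filename
      ,r.recordtime
FROM Recording r
JOIN Agents a ON a.agentId = r.agentId
LEFT JOIN (
    SELECT rp.recordingId AS rawrecordingid
          ,cr.recordingId AS attachedrecordingid
          ,cr.contractNumber
          ,cr.created
          ,cr.note
          ,cr.fileLocation
    FROM #RecordingPaths rp
    JOIN ContractRecording cr ON cr.fileLocation = rp.fullpath
) m ON m.rawrecordingid = r.recordingId
ORDER BY a.name, r.recordtime;

DROP TABLE #RecordingPaths;
```

Whether this is actually faster depends on the data and the plan the optimizer picks, so it is worth comparing execution plans before relying on it.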
You could also try splitting the two OR conditions into two CTEs and using a UNION to combine the matched recording IDs:
WITH fullpathnew AS
(
SELECT cr.recordingid AS "attachedrecordingid",
r.recordingid AS "rawrecordingid",
cr.contractnumber,
cr.created,
cr.note,
cr.filelocation
FROM Agents a
JOIN Recording r ON a.agentId = r.agentId
JOIN ContractRecording cr ON cr.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
),
fullpathold AS
(
SELECT cr.recordingid AS "attachedrecordingid",
r.recordingid AS "rawrecordingid",
cr.contractnumber,
cr.created,
cr.note,
cr.filelocation
FROM Agents a
JOIN Recording r ON a.agentId = r.agentId
JOIN ContractRecording cr ON cr.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
),
combinedCtes AS
(
SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathnew
UNION SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathold
)
SELECT cte.attachedrecordingid
,r.recordingid AS "rawrecordingid"
,cte.contractnumber
,cte.created
,a.name
,cte.note
,cte.filelocation
,r.filename
,r.recordtime
FROM Agents a
JOIN Recording r ON r.agentId = a.agentId
LEFT JOIN combinedCtes cte ON r.recordingid = cte.rawrecordingid
Your UNION needs to be inside a subquery, and then you can LEFT JOIN to that subquery:
SELECT j.attachedrecordingid
,r.recordingid AS rawrecordingid
,j.contractnumber
,j.created
,a.NAME
,j.note
,j.filelocation
,r.filename
,r.recordtime
FROM Agents a
JOIN Recording r ON a.agentId = r.agentId
LEFT JOIN(
SELECT cr.recordingid AS "attachedrecordingid"
,r.recordingid AS "rawrecordingid"
,cr.contractnumber
,cr.created
,cr.note
,cr.filelocation
FROM Agents a
JOIN Recording r ON a.agentId = r.agentId
JOIN ContractRecording cr
ON cr.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
UNION
SELECT cr.recordingid AS "attachedrecordingid"
,r.recordingid AS "rawrecordingid"
,cr.contractnumber
,cr.created
,cr.note
,cr.filelocation
FROM Agents a
JOIN Recording r ON a.agentId = r.agentId
JOIN ContractRecording cr
ON cr.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
) j ON r.recordingId = j.rawrecordingid
ORDER BY a.name, r.recordtime