Why is this SQL code sporadically producing orphaned records?

Question

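Disclaimer: I'm no SQL expert. I'm trying to insert a record into a child table before inserting a record into the parent table. (Having said that, I'm starting to wonder whether this is a good idea.) The parent table record contains a reference to the child table record, and said reference cannot be null. This requires me to insert into the child table first, then link to it from the parent table during the second insert.

Anyway, for some reason this code is sporadically producing orphaned records in the IdentifyingData (child) table, i.e. records that have no entry in the FraudScore (parent) table even though they should.

Here is what confuses me. To troubleshoot, I started dumping the contents of the @tempFraudScore table into a physical audit table so I could see exactly what was happening during the data transformation. When I switched the code below that inserts from @tempFraudScore into FraudScore to insert from the audit table instead, every child record successfully got a parent record. This makes no sense to me.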

insert into IdentifyingData (EntryDateTime, IdentifyingDataTypeId, Value, Source)
select distinct GETDATE(), tfs.IdentifyingDataTypeId, tfs.Value, 'SSIS'
from @tempFraudScore tfs
where not exists (
    select id.IdentifyingDataTypeId, id.Value
    from IdentifyingData id
    where tfs.IdentifyingDataTypeId = id.IdentifyingDataTypeId
        and tfs.Value = id.Value
);

update tfs
set tfs.IdentifyingDataId = id.Id
from @tempFraudScore tfs
    inner join IdentifyingData id on
        tfs.Value = id.Value and
        tfs.IdentifyingDataTypeId = id.IdentifyingDataTypeId;

insert into FraudScore (EntryDateTime, FraudCriteriaId, AccountId, IdentifyingDataId, Score, Source)
select distinct
    GETDATE() EntryDateTime,
    tfs.FraudCriteriaId,
    tfs.AccountId,
    tfs.IdentifyingDataId,
    tfs.Score,
    'SSIS'
from @tempFraudScore tfs
    inner join FraudCriteria fc on
        tfs.FraudCriteriaId = fc.Id
            and fc.UniqueEntryPeriod = 0
where not exists (
    select fs.AccountId, fs.FraudCriteriaId, fs.IdentifyingDataId
    from FraudScore fs
    where tfs.AccountId = fs.AccountId
        and tfs.FraudCriteriaId = fs.FraudCriteriaId
        and tfs.IdentifyingDataId = fs.IdentifyingDataId
);

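@tempFraudScore is pre-populated with all necessary fields except IdentifyingDataId; that one has to be created by first inserting into IdentifyingData and then updating the table variable with the generated Id. Here is the structure of the table variable: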

declare @tempFraudScore table(
    FraudCriteriaId int,
    AccountId bigint,
    IdentifyingDataId bigint,
    IdentifyingDataTypeId smallint,
    Value varchar(100),
    Score int
);

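Can anyone tell me what is causing these orphaned IdentifyingData records? Should I rethink how the relationship between these two tables is structured? I'm trying to set things up so that once a given IdentifyingData record is in the system, it is never duplicated; it is only referenced by newly created FraudScore records.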
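To make the "never duplicated" requirement concrete: one option would be to let the database enforce it instead of relying only on the NOT EXISTS check. The following is a minimal sketch, assuming the IdentifyingData columns have the same types as in @tempFraudScore; the index name is hypothetical, and the statement would fail if duplicate (IdentifyingDataTypeId, Value) pairs already exist in the table.

```sql
-- Sketch only: the index name UX_IdentifyingData_Type_Value is made up for illustration.
-- Enforces at most one IdentifyingData row per (IdentifyingDataTypeId, Value) pair.
create unique nonclustered index UX_IdentifyingData_Type_Value
    on IdentifyingData (IdentifyingDataTypeId, Value);
```

With something like this in place, a concurrent run that slips past the NOT EXISTS check would raise an error rather than silently insert a second copy.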

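Edit: Attached is a screenshot of the audit table showing the progress of the data transformation for a single value (the Value column holds the same value for all of these records; I blurred it for privacy). Note that despite the message "Post-FraudScore Insert", the record in question was never actually inserted into the FraudScore table.

Edit2 (2/6/2018): To troubleshoot this, I added the following code to the stored procedure. I have a value (99999) that shows up in the Value column of the _Audit table but not in the Value column of the second table, even though the code simply dumps all of the data into both tables from the same source! I'm not sure whether it matters, but this stored procedure is launched from an Execute SQL Task in an SSIS package with an isolation level of "Serializable". That said, I'm not explicitly using transactions anywhere in the code, and the TransactionOption of that Execute SQL Task is set to "Supported". I don't know whether this is related to the problem.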
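For anyone wanting to check for orphans directly in the tables rather than through the audit trail, a query along these lines should list them. It is a sketch that uses only the column names appearing in the code above and assumes IdentifyingData.Id is the key that FraudScore.IdentifyingDataId references.

```sql
-- List IdentifyingData rows that no FraudScore row references (orphans).
select id.Id, id.IdentifyingDataTypeId, id.Value, id.EntryDateTime, id.Source
from IdentifyingData id
where not exists (
    select 1
    from FraudScore fs
    where fs.IdentifyingDataId = id.Id
)
order by id.EntryDateTime desc;
```

Running it right after the load and again a little later may help show whether the parent rows were never inserted or were inserted and then removed.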

insert into FraudScoreIdentifyingData_Audit
select 'Post-IdentifyingData Update', GETDATE(), FraudCriteriaId, AccountId, IdentifyingDataId, IdentifyingDataTypeId, Value, Score
from @tempFraudScore;

insert into FraudScoreIdentifyingData
select GETDATE(), FraudCriteriaId, AccountId, IdentifyingDataId, IdentifyingDataTypeId, Value, Score, 1
from @tempFraudScore;
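Regarding the isolation level and transaction questions above, one way to see what the procedure actually runs under would be to log the session state from inside it. This is a sketch, not something from my procedure; it reads the current session's row from the `sys.dm_exec_sessions` view and the `@@TRANCOUNT` counter.

```sql
-- Report whether a transaction is open and which isolation level this session uses.
select
    @@TRANCOUNT as OpenTransactionCount,
    case s.transaction_isolation_level
        when 0 then 'Unspecified'
        when 1 then 'ReadUncommitted'
        when 2 then 'ReadCommitted'
        when 3 then 'RepeatableRead'
        when 4 then 'Serializable'
        when 5 then 'Snapshot'
    end as IsolationLevel
from sys.dm_exec_sessions s
where s.session_id = @@SPID;
```

If OpenTransactionCount is greater than 0 here, the inserts are part of a transaction started by the caller and could still be rolled back later.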

Here are the schemas of the two tables:

Answer 1

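I can't say what is causing the problem.

Parent table = FraudScore

Child table = IdentifyingData

How are they related? First you insert the records into FraudScore and, if you have multiple inserts, use the OUTPUT clause to insert the records into IdentifyingData.

Either way, this is an ideal situation for the OUTPUT clause, even if it does not solve the problem.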

-- Column types mirror IdentifyingData / @tempFraudScore
declare @tbl table (Id bigint, Value varchar(100), IdentifyingDataTypeId smallint);
declare @CurrentDateTime datetime = GETDATE();

begin try
begin transaction

insert into IdentifyingData (EntryDateTime, IdentifyingDataTypeId
, Value, Source)
OUTPUT INSERTED.Id, INSERTED.Value, INSERTED.IdentifyingDataTypeId  
        INTO @tbl  
select distinct @CurrentDateTime, tfs.IdentifyingDataTypeId
, tfs.Value, 'SSIS'
from @tempFraudScore tfs
where not exists (
    select id.IdentifyingDataTypeId, id.Value
    from IdentifyingData id
    where tfs.IdentifyingDataTypeId = id.IdentifyingDataTypeId
        and tfs.Value = id.Value
);


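-- Rows inserted just now: take the Id captured by OUTPUT
update tfs
set tfs.IdentifyingDataId = id.Id
from @tempFraudScore tfs
    inner join @tbl id on
        tfs.Value = id.Value and
        tfs.IdentifyingDataTypeId = id.IdentifyingDataTypeId;

-- Rows that already existed in IdentifyingData were not captured by OUTPUT,
-- so look up their Id in the base table
update tfs
set tfs.IdentifyingDataId = id.Id
from @tempFraudScore tfs
    inner join IdentifyingData id on
        tfs.Value = id.Value and
        tfs.IdentifyingDataTypeId = id.IdentifyingDataTypeId
where tfs.IdentifyingDataId is null;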

insert into FraudScore (EntryDateTime, FraudCriteriaId, AccountId, 
IdentifyingDataId, Score, Source)
select distinct
    @CurrentDateTime EntryDateTime,
    tfs.FraudCriteriaId,
    tfs.AccountId,
    tfs.IdentifyingDataId,
    tfs.Score,
    'SSIS'
from @tempFraudScore tfs
    inner join FraudCriteria fc on
        tfs.FraudCriteriaId = fc.Id
            and fc.UniqueEntryPeriod = 0
where not exists (
    select fs.AccountId, fs.FraudCriteriaId, fs.IdentifyingDataId
    from FraudScore fs
    where tfs.AccountId = fs.AccountId
        and tfs.FraudCriteriaId = fs.FraudCriteriaId
        and tfs.IdentifyingDataId = fs.IdentifyingDataId
);
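commit transaction;
end try
begin catch
    if (@@trancount > 0)
        rollback transaction;
    -- Re-raise so the caller (e.g. the SSIS task) sees the failure; THROW requires SQL Server 2012+
    throw;
end catch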

Answer 2

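It turns out that a delete statement hidden in one of my large stored procedures was written incorrectly and was causing the problem.

While looking for the cause, I also had a DBA sit with me, and he determined that part of my SSIS process was reorganizing indexes; but it was doing so while the package continued to run and populate all the necessary underlying tables (including the one with the orphaned records). According to him, reorganizing or rebuilding indexes on tables while also trying to add or remove records from those tables can cause this problem as well; although in my specific case, it was a single incorrectly written delete statement.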
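For anyone hunting a similar stray delete, a text search over the module definitions can narrow things down. This is a rough sketch; it assumes the delete targets FraudScore (adjust the pattern to whichever table is losing rows), and a plain LIKE match will also hit comments and longer names such as FraudScoreIdentifyingData.

```sql
-- Find procedures, functions, triggers and views whose source text mentions
-- a delete followed somewhere later by the table name.
select
    object_schema_name(m.object_id) as SchemaName,
    object_name(m.object_id) as ObjectName,
    o.type_desc as ObjectType
from sys.sql_modules m
    inner join sys.objects o on
        o.object_id = m.object_id
where m.definition like '%delete%FraudScore%'
order by SchemaName, ObjectName;
```

Each hit still has to be read by hand, but it is faster than opening every procedure in the package.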

tsql

database

sql-server

duplicates

orphan