tSQLt.AssertEqualsTable takes hours to complete on large data sets

EXEC tSQLt.AssertEqualsTable 'expected', 'actual';

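I am comparing an expected table with an actual table. The tables have more than 100k records. If 80k records match and 20k do not, tSQLt prints every mismatched row, as in the failure message below, and the test takes hours to complete. How can I stop tSQLt from displaying the mismatched records in the output?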

These tables produce the following failure message:

failed: unexpected/missing resultset rows!
|_m_|col1|col2|col3|
+---+----+----+----+
|<  |2   |B   |b   |
|<  |3   |C   |c   |
|=  |1   |A   |a   |
|>  |3   |X   |c   |

100,000 rows is far more than a unit test should reasonably need.

You should be able to test all scenarios with far fewer rows than that.
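As a rough illustration of a small-data test, here is a minimal sketch. The test class OrderTests, the table dbo.Orders (assumed to already exist with CustomerId and Amount columns), and the aggregation being tested are all hypothetical; only the tSQLt procedures are real.

```sql
EXEC tSQLt.NewTestClass 'OrderTests';
GO
CREATE PROCEDURE OrderTests.[test amounts are summed per customer]
AS
BEGIN
    -- Replace the (hypothetical) real table with an empty, constraint-free copy.
    EXEC tSQLt.FakeTable 'dbo.Orders';

    -- Three rows are enough to cover "several orders" and "single order".
    INSERT INTO dbo.Orders (CustomerId, Amount)
    VALUES (1, 10.00), (1, 5.00), (2, 7.50);

    -- Code under test (inlined here for brevity).
    SELECT CustomerId, SUM(Amount) AS Total
    INTO   #Actual
    FROM   dbo.Orders
    GROUP  BY CustomerId;

    -- Copy the structure of #Actual so both tables share column types.
    SELECT TOP (0) * INTO #Expected FROM #Actual;
    INSERT INTO #Expected (CustomerId, Total)
    VALUES (1, 15.00), (2, 7.50);

    EXEC tSQLt.AssertEqualsTable '#Expected', '#Actual';
END;
GO
EXEC tSQLt.Run 'OrderTests';
```

With a handful of rows, any failure message stays short enough to read at a glance.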

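But if tSQLt.AssertEqualsTable does not meet your needs because it prints a potentially huge string, you can do the checks and assertions yourself, for example as shown below (assuming Actual and Expected have the same column schema definition).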

DECLARE @expectedRows INT,
        @actualRows   INT,
        @expectedChk  INT,
        @actualChk    INT;

-- Row count and aggregate checksum of each table.
SELECT @expectedRows = COUNT(*),
       @expectedChk = CHECKSUM_AGG(BINARY_CHECKSUM(*))
FROM   Expected;

SELECT @actualRows = COUNT(*),
       @actualChk = CHECKSUM_AGG(BINARY_CHECKSUM(*))
FROM   Actual;

EXEC tSQLt.AssertEquals
  @expectedRows,
  @actualRows,
  'Mismatched rowcount between expected and actual';

EXEC tSQLt.AssertEquals
  @expectedChk,
  @actualChk,
  'Mismatched checksum between expected and actual';
-- Row count and checksum match, but checksums can collide, so do a more rigorous check.
-- [RowPresent] is a marker column; rename it if either table already has a column with that name.
IF EXISTS(SELECT *
          FROM   (SELECT 1 AS [RowPresent], *
                  FROM   Expected) E
                 FULL JOIN (SELECT 1 AS [RowPresent], *
                            FROM   Actual) A
                        ON EXISTS(SELECT A.*
                                  INTERSECT
                                  SELECT E.*)
          WHERE  A.[RowPresent] IS NULL
                  OR E.[RowPresent] IS NULL)
  EXEC tSQLt.Fail
    'Mismatched row content between expected and actual';
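
If a small sample of the differing rows would still be useful in the failure message, one possible variation is to collect a capped number of differences and assert that the result is empty. This is only a sketch: the #Diff table, the DiffType column (which must not clash with an existing column name), and the cap of 10 are illustrative choices, and because EXCEPT removes duplicates it will not notice rows that differ only in how many times they occur, so the row count check is still worth keeping.

```sql
-- Collect at most 10 differing rows, in arbitrary order (no ORDER BY).
SELECT TOP (10) d.*
INTO   #Diff
FROM   (SELECT 'missing from Actual' AS DiffType, e.*
        FROM   (SELECT * FROM Expected
                EXCEPT
                SELECT * FROM Actual) e
        UNION ALL
        SELECT 'unexpected in Actual' AS DiffType, a.*
        FROM   (SELECT * FROM Actual
                EXCEPT
                SELECT * FROM Expected) a) d;

-- Fails if any difference was found, printing only the sampled rows.
EXEC tSQLt.AssertEmptyTable '#Diff';
```

This keeps the output bounded regardless of how many rows mismatch, at the cost of showing only a sample rather than the full difference.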