Using multiple rows from multiple tables for Persisted Computed Column with a Scalar UDF

Question

I am trying to create a new field in the Order Transactions Table as a Persisted computed column, using the value of a scalar UDF as the value of the field.

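I understand that a Persisted column requires the value to be deterministic, which means the multi-table UDF I have is non-deterministic, because it does not use fields from the source table.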

Function:

USE [MyDatabase]
GO
/****** Object:  UserDefinedFunction [dbo].[fnCalcOutstandingBalance]    
Script Date: 08/10/2018 14:01:18 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[fnCalcOutstandingBalance](@ItemReferance int)

RETURNS INT
WITH SCHEMABINDING 
AS
Begin
DECLARE @AcceptedQty INT
DECLARE @SumOfQty INT
DECLARE @Result INT

SELECT @AcceptedQty = 

    ISNULL([Accepted Quantity],0)
    FROM 
    dbo.[Order Transactions Table]
    WHERE @ItemReferance = [Item Referance] 

SELECT @SumOfQty =
    ISNULL(sum(Quantity),0)
    FROM dbo.[Delivery Table]
    GROUP BY [Item Referance]
    HAVING @ItemReferance = [Item Referance]

    SET @Result = ISNULL(@AcceptedQty,0) - ISNULL(@SumOfQty,0)

return @Result

END

I am looking for a workaround that lets me use the value produced by the function above in the Order Transactions Table.

Adding the column:

ALTER TABLE [Order Transactions Table]
ADD CalcOB AS [dbo].[fnCalcOutstandingBalance]([Item Referance]) PERSISTED

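I have tested this function and it works fine as a standalone function call within a SELECT. The issue is that I need to use it in a calculated column rather than a virtual column.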

Answer 1

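You can try WITH SCHEMABINDING in the UDF.
This means you can't change the underlying tables without dropping the UDF (and the computed column, etc.).

Without it, PERSISTED would definitely be prevented.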
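To see how SQL Server classifies the function, its metadata properties can be queried. This is a minimal sketch, assuming the function from the question exists in the current database. A computed column can be persisted only if its expression is deterministic, and a function that reads user tables reports UserDataAccess = 1.

```sql
-- Inspect the properties SQL Server has derived for the scalar UDF.
-- Each property returns 1 (true), 0 (false), or NULL if the object is not found.
SELECT
    IsSchemaBound   = OBJECTPROPERTYEX(OBJECT_ID(N'dbo.fnCalcOutstandingBalance'), 'IsSchemaBound'),
    IsDeterministic = OBJECTPROPERTYEX(OBJECT_ID(N'dbo.fnCalcOutstandingBalance'), 'IsDeterministic'),
    IsPrecise       = OBJECTPROPERTYEX(OBJECT_ID(N'dbo.fnCalcOutstandingBalance'), 'IsPrecise'),
    UserDataAccess  = OBJECTPROPERTYEX(OBJECT_ID(N'dbo.fnCalcOutstandingBalance'), 'UserDataAccess');
```

If UserDataAccess comes back as 1, the column is unlikely to be accepted as PERSISTED even when the function is schema-bound.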

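Do you realise the huge effect on performance and concurrency of using a UDF like this, though?

It's a cursor (for each row, the aggregate is computed one at a time)
You have odd concurrent behaviours

After comment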

CREATE VIEW dbo.SomeView
AS
SELECT
   ott.Col1, ott.Col2, ...,
   OutstandingBalance = ISNULL(ott.[Accepted Quantity],0) - ISNULL(SUM(dt.Quantity),0)
FROM
   dbo.[Order Transactions Table] ott
   LEFT JOIN
   dbo.[Delivery Table] dt ON ott.[Item Referance] = dt.[Item Referance]
GROUP BY
   ott.Col1, ott.Col2, ott.[Accepted Quantity], ...

You can schema-bind the view, but you can't index it with the LEFT JOIN.
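One possible way around the LEFT JOIN restriction is to index only the aggregate over the delivery table and do the outer join at query time. The following is a sketch under assumptions, not a tested design: the view name dbo.DeliveredQtyByItem and the index name IX_DeliveredQtyByItem are hypothetical, and the session SET options required for indexed views are assumed to be in place.

```sql
-- Hypothetical names: dbo.DeliveredQtyByItem, IX_DeliveredQtyByItem.
CREATE VIEW dbo.DeliveredQtyByItem
WITH SCHEMABINDING
AS
SELECT
   dt.[Item Referance],
   DeliveredQty = SUM(ISNULL(dt.Quantity, 0)),  -- SUM must be over a non-nullable expression
   RowCnt       = COUNT_BIG(*)                  -- required when an indexed view uses GROUP BY
FROM
   dbo.[Delivery Table] dt
GROUP BY
   dt.[Item Referance];
GO

-- The unique clustered index is what materialises the view.
CREATE UNIQUE CLUSTERED INDEX IX_DeliveredQtyByItem
   ON dbo.DeliveredQtyByItem ([Item Referance]);
GO

-- The outer join happens outside the indexed view, so items with no deliveries still appear.
SELECT
   ott.[Item Referance],
   OutstandingBalance = ISNULL(ott.[Accepted Quantity], 0) - ISNULL(d.DeliveredQty, 0)
FROM
   dbo.[Order Transactions Table] ott
   LEFT JOIN
   dbo.DeliveredQtyByItem d WITH (NOEXPAND) ON ott.[Item Referance] = d.[Item Referance];
```

The NOEXPAND hint asks the optimizer to read the view's index directly; on some editions the index is not considered without it.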

Answer 2

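@gbn's answer is spot on, but allow me to add my $0.02. Because your scalar UDF accesses tables, I believe you will not be able to persist this column. That said, let's be 100% clear:

There is absolutely nothing to gain from adding a computed column the way you describe, and a lot to lose.

First, even if you could persist this column, any query that accesses this table would be slower, in some cases dramatically slower. A T-SQL scalar UDF used in a computed column, a constraint, or a default makes queries that reference the table non-parallelizable; serial execution only! On top of that, once a T-SQL scalar UDF is introduced, the available optimizations become very limited. Again: bad, bad idea.

As gbn said, an indexed view is the way to go (if you can lose the left join). Another option is to use an inline table-valued function whenever you need the value; it will perform better than the computed column (provided you add the appropriate indexes). The function would look like this: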

CREATE FUNCTION dbo.fnCalcOutstandingBalance(@ItemReferance int)
RETURNS TABLE WITH SCHEMABINDING  AS RETURN
-- accepted quantity minus delivered quantity, matching the scalar function in the question
SELECT   Result = 
         (
           SELECT ISNULL([Accepted Quantity],0)
           FROM   dbo.[Order Transactions Table]
           WHERE  @ItemReferance = [Item Referance] 
         ) - ISNULL(sum(Quantity),0)
FROM     dbo.[Delivery Table]
GROUP BY [Item Referance]
HAVING   @ItemReferance = [Item Referance];

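To take advantage of this function you need to understand APPLY. Here is some good reading on why T-SQL scalar UDFs are terrible for computed columns and constraints.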

A Computed Column with a [scalar udf] might Impact Query Performance – Kun Cheng (SQLCAT)

Another Hidden Parallelism Killer: Scalar UDFs In Check Constraints – Erik Darling

Another reason why scalar functions in computed columns is a bad idea – Erik Darling

Beware-row-row-operations-udf-clothing – Brian Moran

Be careful with constraints calling UDFs – Tibor Karaszi

Why does the Execution Plan include a scalar udf call for a persisted computed column? – Stack Overflow
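As a minimal sketch of the APPLY usage mentioned above, the inline function can be invoked once per order row like this. It assumes the inline table-valued version of dbo.fnCalcOutstandingBalance has been created in place of the scalar one; the column alias OutstandingBalance is only illustrative.

```sql
-- OUTER APPLY evaluates the inline function for each row of the outer table.
-- Because the function uses GROUP BY / HAVING, it returns no row for an item
-- with no deliveries; OUTER APPLY keeps that order row and Result is NULL.
SELECT
    ott.[Item Referance],
    ott.[Accepted Quantity],
    OutstandingBalance = ob.Result
FROM
    dbo.[Order Transactions Table] AS ott
    OUTER APPLY dbo.fnCalcOutstandingBalance(ott.[Item Referance]) AS ob;
```

Using CROSS APPLY instead would drop the order rows that have no matching deliveries, so the choice between the two depends on whether those rows are wanted.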

Answer 3

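For anyone interested, I have managed to work around this problem by using a cursor (thanks @gbn) to process the existing data and populate a new field (CalculatedOB) with the corresponding calculated value.

I have used triggers (on [Order Transactions Table].[Accepted Quantity] and [Delivery Table].[Quantity]) to handle any future changes to the outstanding balance.

Both the cursor and all of the triggers use the fnCalcOutstandingBalance() function to calculate the value.

Cursor to populate the existing data: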

declare @refid int;
declare @Result int;
declare refcursor cursor for
select [Item Referance] from [Order Transactions Table];

open refcursor

fetch next from refcursor into @refid

while @@FETCH_STATUS = 0
begin 
print @refid

-- compute and store the value for the row just fetched
set @Result = [dbo].[fnCalcOutstandingBalance](@refid)

update [Order Transactions Table] set CalculateOB = @Result 
    where [Item Referance] = @refid

-- fetch the next row last, so that the first row is not skipped
fetch next from refcursor into @refid
end 

close refcursor;
deallocate refcursor;

Example update trigger:

CREATE TRIGGER [dbo].[UPDATE_AcceptedQty]
ON [dbo].[Order Transactions Table]
for update
AS

DECLARE @ItemRef int;
declare @result int;

IF UPDATE ([Accepted Quantity])
Begin

SELECT @ItemRef=i.[Item Referance] from INSERTED i;

SET @result = [dbo].[fnCalcOutstandingBalance](@ItemRef)

UPDATE [Order Transactions Table] set CalculateOB = @Result 
    where [Item Referance] = @ItemRef

END

GO
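One caveat with the trigger above is that it reads a single [Item Referance] from INSERTED, so an UPDATE that touches several rows would only recalculate one of them. A set-based variant might look like the following sketch. The trigger name UPDATE_AcceptedQty_SetBased is hypothetical, and it is meant as a replacement for the trigger above rather than an addition to it.

```sql
-- Hypothetical name; intended to replace the single-row trigger, not to run alongside it.
CREATE TRIGGER [dbo].[UPDATE_AcceptedQty_SetBased]
ON [dbo].[Order Transactions Table]
FOR UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    IF UPDATE ([Accepted Quantity])
    BEGIN
        -- Recalculate every item touched by the statement, not just one row.
        UPDATE ott
        SET CalculateOB = [dbo].[fnCalcOutstandingBalance](ott.[Item Referance])
        FROM [dbo].[Order Transactions Table] AS ott
        WHERE ott.[Item Referance] IN (SELECT i.[Item Referance] FROM inserted AS i);
    END
END
GO
```

The inner UPDATE does not modify [Accepted Quantity], so the IF UPDATE guard keeps the trigger from repeating the work if it is fired again by its own statement.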

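Combining these two techniques has allowed me to simulate the functionality of a computed column without the determinism requirement or the performance impact.

Many thanks to @gbn and @Alan Burstein for their contributions!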
