U-SQL build error, equijoin have different types

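I'm trying to create a U-SQL job and define my columns from the CSV files they will be read from, but I always hit a problem in the JOIN section because the columns I'm matching are reported as being of different types. This is strange, because I defined them as the same type. See the screenshot of where the problem occurs: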

Here is the full U-SQL:

@guestCheck = 
    EXTRACT GuestCheckID int,
            POSCheckGUID Guid,
            POSCheckNumber int?,
            OwnerEmployeeID int,
            CreatedDateTime DateTime?,
            ClosedDateTime DateTime?,
            TicketReference string,
            CheckAmount decimal?,
            POSTerminalID int,
            CheckState string,
            LocationID int?,
            TableID int?,
            Covers int?,
            PostedDateTime DateTime?,
            OrderChannelID int?,
            MealPeriodID int?,
            RVCLocationID int?,
            ReopenedTerminalID int?,
            ReopenedEmployeeID int?,
            ReopenedDateTime DateTime?,
            ClosedBusDate int?,
            PostedBusDate int?,
            BusHour byte?,
            TaxExempt bool?,
            TaxExemptReference string
    FROM "/GuestCheck/GuestCheck-incomplete.csv"
    USING Extractors.Csv();

@guestCheckAncillaryAmount =
    EXTRACT CheckAncillaryAmountID int,
            GuestCheckID int,
            GuestCheckItemID int?,
            AncillaryAmountTypeID int,
            Amount decimal,
            FirstDetail int?,
            LastDetail int?,
            IsReturn bool?,
            ReturnReasonID int?,
            AncillaryReasonID int?,
            AncillaryNote string,
            ClosedBusDate int?,
            PostedBusDate int?,
            BusHour byte?,
            LocationID int?,
            RVCLocationID int?,
            IsDelisted bool?,
            Exempted bool?
    FROM "/GuestCheck/GuestCheckAncillaryAmount.csv"
    USING Extractors.Csv();

@ancillaryAmountType = 
    EXTRACT AncillaryAmountTypeID int,
            AncillaryAmountCategoryID int,
            CustomerID int,
            CheckTitle string,
            ReportTitle string,
            Percentage decimal,
            FixedAmount decimal,
            IncludeOnCheck bool,
            AutoCalculate bool,
            StoreAtCheckLevel bool?,
            DateTimeModified DateTime?,
            CheckTitleToken Guid?,
            ReportTitleToken Guid?,
            DeletedFlag bool,
            MaxUsageQty int?,
            ApplyToBasePriceOnly bool?,
            Exclusive bool,
            IsItem bool,
            MinValue decimal,
            MaxValue decimal,
            ItemGroupID int?,
            LocationID int,
            ApplicationOrder int?,
            RequiresReason bool,
            Exemptable bool?
    FROM "/GuestCheck/AncillaryAmountType.csv"
    USING Extractors.Csv();

@read =
    SELECT t.POSCheckGUID,
           t.POSCheckNumber,
           t.CheckAmount,
           aat.AncillaryAmountTypeID,
           aat.CheckTitle,
           gcd.Amount
    FROM @guestCheck AS t         
         LEFT JOIN
             @guestCheckAncillaryAmount AS gcd
         ON t.GuestCheckID == gcd.GuestCheckID
         LEFT JOIN
             @ancillaryAmountType AS aat
         ON gcd.AncillaryAmountTypeID == aat.AncillaryAmountTypeID
    WHERE aat.AncillaryAmountCategoryID IN(2, 4, 8);

OUTPUT @read
TO "/GuestCheckOutput/output.csv"
USING Outputters.Csv();

Indeed, U-SQL is strongly typed, and int and int? are different types. You need to cast in an intermediate rowset:

@ancillaryAmountType2 =
SELECT (int?) aat.AncillaryAmountTypeID AS AncillaryAmountTypeID,
       aat.AncillaryAmountCategoryID,
       aat.CheckTitle
FROM @ancillaryAmountType AS aat;
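
To show where the intermediate rowset fits, here is a minimal sketch of the question's @read statement joining to @ancillaryAmountType2 instead of @ancillaryAmountType. It assumes the three EXTRACT statements from the question are unchanged and that the cast above is the only fix needed for the type error; the WHERE clause is left exactly as the OP wrote it.

```usql
// Same query as in the question; only the second join source changes.
@read =
    SELECT t.POSCheckGUID,
           t.POSCheckNumber,
           t.CheckAmount,
           aat.AncillaryAmountTypeID,
           aat.CheckTitle,
           gcd.Amount
    FROM @guestCheck AS t
         LEFT JOIN
             @guestCheckAncillaryAmount AS gcd
         ON t.GuestCheckID == gcd.GuestCheckID
         LEFT JOIN
             @ancillaryAmountType2 AS aat   // AncillaryAmountTypeID is int? here
         ON gcd.AncillaryAmountTypeID == aat.AncillaryAmountTypeID
    WHERE aat.AncillaryAmountCategoryID IN(2, 4, 8);
```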

Or, better yet, follow dimensional modelling best practice and avoid nulls in dimensions, for the reasons described at http://blog.chrisadamson.com/2013/01/avoid-null-in-dimensions.html.

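This has nothing to do with the nullability of the columns specified in the EXTRACT definitions because, as the OP's code shows, neither join column is declared nullable (i.e. with ?) there. It has to do with multiple outer joins and the so-called null-supplying table.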

If you think about it logically, say you have three tables: TableA with 3 records, TableB with 2 records and TableC with 1 record, like this:

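If you start from tableA and left outer join to tableB, you instinctively know you will get three records, but tableB's column x will be null for the tableA record that has no match in tableB. tableB is your null-supplying table, and this is where the nullability comes from: after that join its column x is treated as int?, so joining it to tableC's plain int column produces the type mismatch.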

Thankfully, the fix is the same: change the nullability of the column earlier, or supply a substitute value, e.g. -1.

@t3 =
    SELECT (int?) x AS x, 2 AS a
    FROM dbo.tmpC;

// OR

// Use conditional operator to supply substitute values
@t3 =
    SELECT x == null ? -1 : x AS x, 2 AS a
    FROM dbo.tmpC;
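
As a rough end-to-end sketch of the cast approach, the script below chains two left outer joins over three small rowsets. The rowsets @a, @b and @c, their values and the output path are hypothetical stand-ins for the three tables described above, not the original data.

```usql
// Hypothetical stand-ins for TableA, TableB and TableC (one int column x each).
@a = SELECT * FROM (VALUES (1), (2), (3)) AS T(x);
@b = SELECT * FROM (VALUES (1), (2)) AS T(x);

// Cast x to int? up front so it matches the null-supplied b.x in the second join.
@c = SELECT (int?) x AS x FROM (VALUES (1)) AS T(x);

@result =
    SELECT a.x AS ax,
           b.x AS bx,
           c.x AS cx
    FROM @a AS a
         LEFT OUTER JOIN @b AS b
         ON a.x == b.x              // int == int
         LEFT OUTER JOIN @c AS c
         ON b.x == c.x;             // int? == int?

// Hypothetical output location.
OUTPUT @result
TO "/output/nullSupplying.csv"
USING Outputters.Csv();
```

Without the cast in @c, the second ON clause would compare an int? column with an int column, which is the mismatch the compiler reports.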

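But your particular query has another problem. In most relational databases, adding a WHERE clause on a table on the right side of a left outer join effectively converts the join into an inner join, and the same is true in U-SQL. You may want to think about the result you are really trying to get and consider rewriting your query.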
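
One possible rewrite, assuming the goal is to keep every guest check and only attach ancillary types from categories 2, 4 and 8, is to filter (and cast) the type rowset before joining rather than in the outer WHERE. The rowset name @ancillaryAmountTypeFiltered is made up for this sketch. If only checks with a matching type are wanted, plain inner joins would state that intent more directly.

```usql
// Filter and cast before the join so unmatched left-side rows are not discarded.
@ancillaryAmountTypeFiltered =
    SELECT (int?) aat.AncillaryAmountTypeID AS AncillaryAmountTypeID,
           aat.CheckTitle
    FROM @ancillaryAmountType AS aat
    WHERE aat.AncillaryAmountCategoryID IN(2, 4, 8);

@read =
    SELECT t.POSCheckGUID,
           t.POSCheckNumber,
           t.CheckAmount,
           aat.AncillaryAmountTypeID,
           aat.CheckTitle,
           gcd.Amount
    FROM @guestCheck AS t
         LEFT JOIN
             @guestCheckAncillaryAmount AS gcd
         ON t.GuestCheckID == gcd.GuestCheckID
         LEFT JOIN
             @ancillaryAmountTypeFiltered AS aat
         ON gcd.AncillaryAmountTypeID == aat.AncillaryAmountTypeID;
```

Note that this version still returns ancillary amount rows whose type falls outside the three categories, with the aat columns null, so it is only appropriate if that is the intended result.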

HTH