U-SQL user defined combiner returns wrong data

I am trying out Data Lake Analytics and created a simple U-SQL combiner, but as far as I can tell it returns the wrong data. It returns 3 rows:

2R

3R

3R

But I expected it to return 6 rows: 1L, 2L, 3L, 1R, 2R, 3R.

Here is the code:

@T1 = SELECT * FROM (VALUES ("1"), ("2"), ("3")) AS T(DummyValue);
@T2 = SELECT * FROM (VALUES ("1"), ("2"), ("3")) AS T(DummyValue);

@Result =
 COMBINE @T1 AS fis
 WITH @T2 AS frs
 ON fis.DummyValue == frs.DummyValue
 PRODUCE DummyValue string
 USING new Demo.MyCombiner();

OUTPUT @Result TO "/o.csv" USING Outputters.Csv();


 [SqlUserDefinedCombiner(Mode = CombinerMode.Full)]
 public class MyCombiner : ICombiner {

  public override IEnumerable<IRow> Combine(IRowset left, IRowset right, IUpdatableRow output) {
   var CopyLeft = left.Rows.ToList();
   var CopyRight = right.Rows.ToList();

   foreach (var Item in CopyLeft) {
    var X = Item.Get<string>("DummyValue");
    output.Set<string>("DummyValue", X + "L");
   }
   foreach (var Item in CopyRight) {
    var X = Item.Get<string>("DummyValue");
    output.Set<string>("DummyValue", X + "R");
   }

   yield return output.AsReadOnly();
  }

 }

What you are doing is effectively a UNION ALL, so it can be done more simply in base U-SQL, for example:

@Result =
    SELECT DummyValue + "L" AS DummyValue
    FROM @T1
    UNION ALL
    SELECT DummyValue + "R" AS DummyValue
    FROM @T2;

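Assuming you have a specific reason to use a custom COMBINER: your code calls yield return only once, after both loops have finished, so each call to Combine emits a single row holding whatever value was written to output last. That is why you only get three rows. Perhaps you can tell us more about what you are trying to do?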
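To make the yield return point concrete, here is a minimal standalone C# sketch that does not use the U-SQL runtime at all. `FakeRow`, `YieldDemo`, `YieldOnce` and `YieldPerRow` are hypothetical names invented for this illustration, and the sketch assumes the combiner is invoked once per join key with one row on each side. It only illustrates the row counts (3 versus 6); it does not attempt to reproduce the exact values reported in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the U-SQL output row: one mutable buffer.
class FakeRow
{
    public string Value;
    // Plays the role of AsReadOnly(): takes a snapshot of the current value.
    public string Snapshot() => Value;
}

static class YieldDemo
{
    // Same shape as the question's code: both loops overwrite the buffer,
    // then a single yield return emits only the last value written.
    public static IEnumerable<string> YieldOnce(IEnumerable<string> left, IEnumerable<string> right)
    {
        var output = new FakeRow();
        foreach (var v in left) output.Value = v + "L";
        foreach (var v in right) output.Value = v + "R";
        yield return output.Snapshot();
    }

    // Yielding inside each loop emits one row per input row.
    public static IEnumerable<string> YieldPerRow(IEnumerable<string> left, IEnumerable<string> right)
    {
        var output = new FakeRow();
        foreach (var v in right) { output.Value = v + "R"; yield return output.Snapshot(); }
        foreach (var v in left) { output.Value = v + "L"; yield return output.Snapshot(); }
    }

    public static void Main()
    {
        var keys = new[] { "1", "2", "3" };
        // Assumption: one call per join key, each seeing a single row per side.
        var once = keys.SelectMany(k => YieldOnce(new[] { k }, new[] { k })).ToList();
        var perRow = keys.SelectMany(k => YieldPerRow(new[] { k }, new[] { k })).ToList();
        Console.WriteLine(once.Count);   // 3
        Console.WriteLine(perRow.Count); // 6
    }
}
```

Under that assumption, the single yield return produces one row per key, while yielding inside the loops produces one row per input row on each side.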

However, if you really do need a UNION ALL through a custom combiner, this worked for me:

    using System.Collections.Generic;
    using Microsoft.Analytics.Interfaces;

    namespace Demo
    {
        [SqlUserDefinedCombiner]
        public class MyCombiner : ICombiner
        {
            public override IEnumerable<IRow> Combine(IRowset left, IRowset right, IUpdatableRow output)
            {
                // Emit one output row for every row on the right side.
                foreach (IRow rowR in right.Rows)
                {
                    output.Set<string>("NewValue", rowR.Get<string>("DummyValue") + "R");
                    yield return output.AsReadOnly();
                }

                // Emit one output row for every row on the left side.
                foreach (IRow rowL in left.Rows)
                {
                    output.Set<string>("NewValue", rowL.Get<string>("DummyValue") + "L");
                    yield return output.AsReadOnly();
                }
            }
        }
    }

The U-SQL that calls my custom combiner:

@Result =
    COMBINE @T1 AS fis WITH @T2 AS frs
    ON fis.DummyValue == frs.DummyValue
    PRODUCE NewValue string
    USING new Demo.MyCombiner();

My results: