U-SQL user defined combiner returns wrong data
I am trying out Data Lake Analytics and created a simple U-SQL combiner, but as far as I can tell it returns the wrong data. It returns 3 rows:
2R
3R
3R
But I expected it to return 6 rows: 1L, 2L, 3L, 1R, 2R, 3R.
Here is the code:
@T1 = SELECT * FROM (VALUES ("1"), ("2"), ("3")) AS T(DummyValue);
@T2 = SELECT * FROM (VALUES ("1"), ("2"), ("3")) AS T(DummyValue);
@Result =
COMBINE @T1 AS fis
WITH @T2 AS frs
ON fis.DummyValue == frs.DummyValue
PRODUCE DummyValue string
USING new Demo.MyCombiner();
OUTPUT @Result TO "/o.csv" USING Outputters.Csv();
[SqlUserDefinedCombiner(Mode = CombinerMode.Full)]
public class MyCombiner : ICombiner {
    public override IEnumerable<IRow> Combine(IRowset left, IRowset right, IUpdatableRow output) {
        var CopyLeft = left.Rows.ToList();
        var CopyRight = right.Rows.ToList();
        foreach (var Item in CopyLeft) {
            var X = Item.Get<string>("DummyValue");
            output.Set<string>("DummyValue", X + "L");
        }
        foreach (var Item in CopyRight) {
            var X = Item.Get<string>("DummyValue");
            output.Set<string>("DummyValue", X + "R");
        }
        yield return output.AsReadOnly();
    }
}
What you are doing is effectively a UNION ALL, so it can be done more simply in base U-SQL, for example:
@Result =
SELECT DummyValue + "L" AS DummyValue
FROM @T1
UNION ALL
SELECT DummyValue + "R" AS DummyValue
FROM @T2;
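Assuming you have some specific reason for wanting a custom COMBINER: you call yield return only once, after both loops have finished, so each invocation of Combine emits a single row no matter how many rows the loops visit. That is why you get only three rows. Perhaps you can tell us more about what you are trying to do?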
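To make the yield return point concrete, here is a minimal sketch in plain C# that does not use the U-SQL runtime at all. The names YieldDemo, YieldOnce and YieldPerRow are hypothetical, and a single string variable stands in for the reused IUpdatableRow. It only illustrates iterator behaviour: one yield return after the loops emits one item, while a yield return inside each loop emits one item per input row. It does not model how the engine decides when to invoke Combine.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Same shape as the question's combiner: both loops overwrite one buffer,
    // and the method yields only once, after the loops are done.
    public static IEnumerable<string> YieldOnce(IEnumerable<string> left, IEnumerable<string> right)
    {
        string buffer = string.Empty; // stand-in for the reused output row (assumption)
        foreach (var value in left) { buffer = value + "L"; }
        foreach (var value in right) { buffer = value + "R"; }
        yield return buffer; // only the last value written survives
    }

    // Same shape as a per-row combiner: yield inside each loop.
    public static IEnumerable<string> YieldPerRow(IEnumerable<string> left, IEnumerable<string> right)
    {
        foreach (var value in left) { yield return value + "L"; }
        foreach (var value in right) { yield return value + "R"; }
    }

    public static void Main()
    {
        var values = new[] { "1", "2", "3" };
        Console.WriteLine(string.Join(",", YieldOnce(values, values)));   // 3R
        Console.WriteLine(string.Join(",", YieldPerRow(values, values))); // 1L,2L,3L,1R,2R,3R
    }
}
```

In the real job three rows came back rather than one, so Combine evidently runs more than once there; the sketch shows only what a single call does.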
However, if you really need to do a UNION ALL with a custom combiner, this worked for me:
using System.Collections.Generic;
using Microsoft.Analytics.Interfaces;

namespace Demo
{
    [SqlUserDefinedCombiner]
    public class MyCombiner : ICombiner
    {
        public override IEnumerable<IRow> Combine(IRowset left, IRowset right, IUpdatableRow output)
        {
            // Emit one output row per right-hand row, tagged "R".
            foreach (IRow rowR in right.Rows)
            {
                output.Set<string>("NewValue", rowR.Get<string>("DummyValue") + "R");
                yield return output.AsReadOnly();
            }

            // Emit one output row per left-hand row, tagged "L".
            foreach (IRow rowL in left.Rows)
            {
                output.Set<string>("NewValue", rowL.Get<string>("DummyValue") + "L");
                yield return output.AsReadOnly();
            }
        }
    }
}
The U-SQL that calls my custom combiner:
@Result =
COMBINE @T1 AS fis WITH @T2 AS frs
ON fis.DummyValue == frs.DummyValue
PRODUCE NewValue string
USING new Demo.MyCombiner();
My results: