错误 2103:做多头交易
ERROR 2103: doing work on Longs
我有资料
store trn_date dept_id sale_amt
1 2014-12-15 101 10007655
1 2014-12-15 101 10007654
1 2014-12-15 101 10007544
6 2014-12-15 104 100086544
8 2014-12-14 101 1000000
8 2014-12-15 101 100865761
我正在尝试使用以下代码汇总数据 -
加载数据(尝试使用 HCatLoader() 和 PigStorage() 两种方式)
data = LOAD 'data' USING org.apache.hcatalog.pig.HCatLoader();
group_table = GROUP data BY (store, tran_date, dept_id);
group_gen = FOREACH grp_table GENERATE
FLATTEN(group) AS (store, tran_date, dept_id),
SUM(table.sale_amt) AS tota_sale_amt;
下面是我在运行作业时得到的错误堆栈跟踪
================================================================================
Pig Stack Trace
---------------
ERROR 2103: Problem doing work on Longs
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: grouped_all: Local Rearrange[tuple]{tuple}(false) - scope-1317 Operator Key: scope-1317): org.apache.pig.backend.executionengine.ExecException: ERROR 2103: Problem doing work on Longs
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:289)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:263)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:183)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:161)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1645)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1611)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2103: Problem doing work on Longs
at org.apache.pig.builtin.AlgebraicLongMathBase.doTupleWork(AlgebraicLongMathBase.java:84)
at org.apache.pig.builtin.AlgebraicLongMathBase$Intermediate.exec(AlgebraicLongMathBase.java:108)
at org.apache.pig.builtin.AlgebraicLongMathBase$Intermediate.exec(AlgebraicLongMathBase.java:102)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:369)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:333)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:281)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number
at org.apache.pig.builtin.AlgebraicLongMathBase.doTupleWork(AlgebraicLongMathBase.java:77)
================================================================================
当我在寻找解决方案时,许多人说这是由于使用 HCatalog Loader 加载数据造成的。所以我尝试使用 "PigStorage()".
加载数据
仍然出现相同的错误。
这可能是因为您在配置单元中存储数据的方式。如果要在任何列上发生任何聚合,请提及它是 数据类型整数或数字。
基本上每个聚合函数 returns 数据及其默认数据类型,
Like -
AVG returns DOUBLE
SUM returns DOUBLE
COUNT returns LONG
我不认为这是将它存储到配置单元时的问题,因为您已经尝试过 PigStore() 这意味着,这只是您将其传递给聚合时的数据类型问题。
在将数据传递给聚合之前尝试更改数据类型并尝试。
我有资料
store trn_date dept_id sale_amt
1 2014-12-15 101 10007655
1 2014-12-15 101 10007654
1 2014-12-15 101 10007544
6 2014-12-15 104 100086544
8 2014-12-14 101 1000000
8 2014-12-15 101 100865761
我正在尝试使用以下代码汇总数据 - 加载数据(尝试使用 HCatLoader() 和 PigStorage() 两种方式)
data = LOAD 'data' USING org.apache.hcatalog.pig.HCatLoader();
group_table = GROUP data BY (store, tran_date, dept_id);
group_gen = FOREACH grp_table GENERATE
FLATTEN(group) AS (store, tran_date, dept_id),
SUM(table.sale_amt) AS tota_sale_amt;
下面是我在运行作业时得到的错误堆栈跟踪
================================================================================
Pig Stack Trace
---------------
ERROR 2103: Problem doing work on Longs
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: grouped_all: Local Rearrange[tuple]{tuple}(false) - scope-1317 Operator Key: scope-1317): org.apache.pig.backend.executionengine.ExecException: ERROR 2103: Problem doing work on Longs
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:289)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:263)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:183)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:161)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1645)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1611)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2103: Problem doing work on Longs
at org.apache.pig.builtin.AlgebraicLongMathBase.doTupleWork(AlgebraicLongMathBase.java:84)
at org.apache.pig.builtin.AlgebraicLongMathBase$Intermediate.exec(AlgebraicLongMathBase.java:108)
at org.apache.pig.builtin.AlgebraicLongMathBase$Intermediate.exec(AlgebraicLongMathBase.java:102)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:369)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:333)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:281)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number
at org.apache.pig.builtin.AlgebraicLongMathBase.doTupleWork(AlgebraicLongMathBase.java:77)
================================================================================
当我在寻找解决方案时,许多人说这是由于使用 HCatalog Loader 加载数据造成的。所以我尝试使用 "PigStorage()".
加载数据
仍然出现相同的错误。
这可能是因为您在配置单元中存储数据的方式。如果要在任何列上发生任何聚合,请提及它是 数据类型整数或数字。
基本上每个聚合函数 returns 数据及其默认数据类型,
Like -
AVG returns DOUBLE
SUM returns DOUBLE
COUNT returns LONG
我不认为这是将它存储到配置单元时的问题,因为您已经尝试过 PigStore() 这意味着,这只是您将其传递给聚合时的数据类型问题。
在将数据传递给聚合之前尝试更改数据类型并尝试。