Sum of unnamed column in Pig
shipnode,delivery_method ,<unnamed>
(9935,PICK,2)
(9960,PICK,2)
(9969,PICK,1)
(9963,SHP,1)
(9989,SHP,1)
(9995,SHP,1)
(9965,SHP,1)
(9995,SHP,1)
This is the output of:
grunt> group_all_shipnode = GROUP
>> union_all
>> BY(
>> shipnode,delivery_method
>> )
>> ;
The last column is unnamed. Now I want to group by shipnode and delivery_method and get the sum of the third column for each group, like this:
(9935,PICK,2)
(9960,PICK,2)
(9969,PICK,1)
(9963,SHP,1)
(9989,SHP,1)
(9995,SHP,2) <<------- sum of similar
(9965,SHP,1)
I am trying to do it like this:
grunt> sum_group_all_shipnode =FOREACH group_all_shipnode
>> GENERATE FLATTEN(group) as(shipnode:chararray, delivery_method:chararray),
>> sum($1.$2);
This produces the error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve sum using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
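Instead of $1.$2, the argument needs to refer to the relation from your load statement, because after a GROUP the bag of original tuples carries that relation's name. Note also that Pig function names are case-sensitive, so the built-in must be written SUM; the lowercase sum is what ERROR 1070 fails to resolve.
For example, say you are loading the data into a relation A: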
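-- No schema is declared, so fields are referenced by position ($0, $1, $2).
A = LOAD 'data.csv' USING PigStorage(',');
-- Group on the first two columns: shipnode and delivery_method.
group_all_shipnode = GROUP A BY ($0, $1);
sum_group_all_shipnode = FOREACH group_all_shipnode
GENERATE
FLATTEN(group) AS (shipnode:chararray, delivery_method:chararray),
-- Sum the unnamed third column of the grouped bag A.
SUM(A.$2);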
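The same idea applied to the relation from the question might look like the sketch below. It assumes union_all already has shipnode and delivery_method as its first two named fields (the question's GROUP statement uses those names) and that the unnamed column sits at position $2. The alias total is hypothetical, added only to give the summed column a name.

```pig
-- Assumption: union_all is the relation from the question, with fields
-- (shipnode, delivery_method, <unnamed>), so the unnamed column is $2.
group_all_shipnode = GROUP union_all BY (shipnode, delivery_method);

-- After GROUP, each tuple holds the key ('group') and a bag named after
-- the grouped relation (union_all). Project $2 out of that bag and sum it.
sum_group_all_shipnode = FOREACH group_all_shipnode GENERATE
    FLATTEN(group) AS (shipnode, delivery_method),
    SUM(union_all.$2) AS total;  -- 'total' is a hypothetical alias

DUMP sum_group_all_shipnode;
```

Running DESCRIBE group_all_shipnode before writing the FOREACH shows the name of the bag, which is the name to use inside SUM.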