Maximum value of a column in Apache Pig
I am trying out Pig. I am running the script below to find the maximum value of the column ratingTime:
ratings = LOAD '/user/maria_dev/ml-100k/u.data' AS (userid:int,movieID:int,rating:int, ratingTime:int);
maxrating = MAX(ratings.ratingTime);
DUMP maxrating
Sample input data:
196 242 3 881250949
186 302 3 891717742
22 377 1 878887116
244 51 2 880606923
I get the following error:
2018-08-05 07:02:05,247 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook
2018-08-05 07:02:05,914 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. <file script.pi
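You need a preceding GROUP ALL before applying MAX. MAX is an aggregate function that operates on a bag, so it has to be called inside a FOREACH ... GENERATE over a grouped relation; it cannot be assigned directly to an alias.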
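-- Load the tab-separated ratings file with an explicit schema.
ratings = LOAD '/user/maria_dev/ml-100k/u.data' USING PigStorage('\t') AS (userid:int,movieID:int,rating:int, ratingTime:int);
-- Collect every row into a single group so MAX can see the whole column.
ratings_group = GROUP ratings ALL;
maxrating = FOREACH ratings_group GENERATE MAX(ratings.ratingTime);
DUMP maxrating;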
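To see why the grouping step matters, it can help to inspect the grouped relation before aggregating. The following is a minimal sketch that reuses the path and schema from the question; the output field name maxRatingTime is only an illustrative label and does not come from the original post.

```pig
-- Same tab-separated input and schema as in the question.
ratings = LOAD '/user/maria_dev/ml-100k/u.data' USING PigStorage('\t')
          AS (userid:int, movieID:int, rating:int, ratingTime:int);

-- GROUP ... ALL produces a single tuple: the constant 'all' plus a bag
-- named "ratings" that holds every row of the input.
ratings_group = GROUP ratings ALL;
DESCRIBE ratings_group;

-- MAX reduces the ratingTime column of that bag to one value.
-- maxRatingTime is a hypothetical field name chosen for readability.
maxrating = FOREACH ratings_group GENERATE MAX(ratings.ratingTime) AS maxRatingTime;
DUMP maxrating;
```

If the input contained only the four sample rows shown in the question, the final DUMP would print (891717742), the largest of the four ratingTime values.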