Maximum value of a column in Apache Pig

I am trying to find the maximum value of a column using Pig. I am running the script below to find the maximum value of the column ratingTime:

    ratings = LOAD '/user/maria_dev/ml-100k/u.data' AS (userid:int,movieID:int,rating:int, ratingTime:int);
    maxrating = MAX(ratings.ratingTime);
    DUMP maxrating

Sample input data:

    196 242 3   881250949
    186 302 3   891717742
    22  377 1   878887116
    244 51  2   880606923

I get the following error:

     2018-08-05 07:02:05,247 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook 

     2018-08-05 07:02:05,914 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. <file script.pi    

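You need a GROUP ALL before applying MAX. MAX is an aggregate function that operates on a bag, so it has to be called inside a FOREACH ... GENERATE over a grouped relation: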

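    -- Load the tab-separated ratings file with an explicit schema.
    ratings = LOAD '/user/maria_dev/ml-100k/u.data' USING PigStorage('\t') AS (userid:int,movieID:int,rating:int, ratingTime:int);
    -- Collect every row into a single group so MAX can see the whole column.
    ratings_group = GROUP ratings ALL;
    maxrating = FOREACH ratings_group GENERATE MAX(ratings.ratingTime);
    DUMP maxrating;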
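
If it helps to see why the grouping step matters, here is a rough plain-Python analogue of the GROUP ALL script above, run on the four sample rows from the question. It is only a sketch of the semantics, not how Pig executes the job; the helper names (`load`, `group_all`, `max_rating_time`) are made up for this illustration and are not part of any Pig API.

```python
# Sample rows from the question: userid, movieID, rating, ratingTime (tab-separated).
SAMPLE = """196\t242\t3\t881250949
186\t302\t3\t891717742
22\t377\t1\t878887116
244\t51\t2\t880606923"""


def load(text):
    """Rough stand-in for LOAD ... USING PigStorage('\\t') with an all-int schema."""
    return [tuple(int(field) for field in line.split("\t"))
            for line in text.splitlines()]


def group_all(rows):
    """Rough stand-in for GROUP ... ALL: a single tuple ('all', bag of every row)."""
    return [("all", rows)]


def max_rating_time(grouped):
    """Rough stand-in for FOREACH ... GENERATE MAX(ratings.ratingTime)."""
    # ratingTime is the fourth field (index 3) of each row in the bag.
    return [(max(row[3] for row in bag),) for _key, bag in grouped]


ratings = load(SAMPLE)
ratings_group = group_all(ratings)
maxrating = max_rating_time(ratings_group)
print(maxrating)  # prints [(891717742,)]
```

The point of the sketch is that the maximum is computed over a bag: grouping first turns the whole relation into one tuple holding a bag of rows, and only then is there something for an aggregate like MAX to consume.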