Printing a hash table in Pig Latin?

For example, I have this data:

x, 23
y, 492
v, 2034
x, 45
z, 25
v, 29

I want to convert it into:

x, 23, 45
y, 492
v, 2034, 29
z, 25

This is equivalent to printing a hash table.

This is my current script:

logs = LOAD 'tmp' using MyLoader (Parameters) as 
       (x:bytearray, y:bytearray, z, x1, y1:bytearray, z1:long, x2:bytearray,  
       z2:bytearray, z3:bytearray, z4:float, dataMap:map[], 
       recs:bag{(record:bytearray)}, key:bytearray, colo:bytearray);

filtered_logs = foreach logs { 
    info = FILTER records BY record MATCHES 'FIRST_REGEX';
    info_records = FOREACH info GENERATE GET_FIELDS($0) as 
                   rec:tuple(mClass:bytearray, rType:bytearray, 
                   rName:bytearray, rStatus:bytearray, rDuration:float, 
                   rData:bytearray, rDataMap:map[]);

    name = FOREACH info_records GENERATE rec.rName;

    matching_requests = FILTER records BY record MATCHES 'SECOND_REGEX'; 

    GENERATE FLATTEN(client_name) as client_name:chararray, 
    dataMap#'corr_id_', (SIZE(matching_requests) > 0 ? true : false) 
    as matched:boolean;
}

A = FILTER filtered_logs BY matched; 

key_corr_id = foreach A generate (chararray) $0 as key, (chararray) $1 as corr_id;

id_group = group key_corr_id by key; -- ERROR thrown when this line is included.

STORE id_group into '$output' using 
org.apache.pig.piggybank.storage.CSVExcelStorage(, 'YES_MULTILINE');

The error thrown:

java.lang.ClassCastException: org.apache.pig.data.DataByteArrayString cannot be cast to java.lang.String

There is no need to create a new relation and join. Just group by key and dump the relation.

key_corr_id = foreach A generate (chararray) $0 as key:chararray, (chararray) $1 as corr_id:chararray;
id_group = group key_corr_id by key;
dump id_group;

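Now, if you don't want the bag-of-tuples representation of each key, x, {(23),(45)}, but want the items separated like x,23,45, add another step that converts the corr_id bag in each group to a delimited string: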

final = foreach id_group generate group as key, BagToString(key_corr_id.corr_id, ',');
dump final;
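
To try the group-then-flatten idea in isolation, here is a minimal sketch that runs against the sample data from the question instead of the custom loader. The file name pairs.txt and the relation names (pairs, clean, by_key, flat) are assumptions made up for this illustration; GROUP, TRIM and BagToString are Pig built-ins (BagToString needs Pig 0.11 or later).

```pig
-- Hypothetical input file 'pairs.txt' holding the sample rows from the
-- question, one "key, value" pair per line (e.g. "x, 23").
pairs = LOAD 'pairs.txt' USING PigStorage(',')
        AS (key:chararray, corr_id:chararray);

-- The sample rows have a space after the comma, so trim both fields.
clean = FOREACH pairs GENERATE TRIM(key) AS key, TRIM(corr_id) AS corr_id;

-- After GROUP, each row has the shape (group, clean:{(key, corr_id)}).
by_key = GROUP clean BY key;

-- Project the corr_id column out of each bag and join its values with commas.
flat = FOREACH by_key GENERATE group AS key,
       BagToString(clean.corr_id, ',') AS corr_ids;

DUMP flat;
```

Note that after a GROUP the grouping column is exposed as group, and the bag takes the name of the grouped relation, which is why the last step projects clean.corr_id. Pig does not guarantee the order of tuples inside a bag, so the values for a key may not come out in input order.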