How can I convert a bag to an array of numeric values?

I'm trying to convert the following schema:

{
  id: chararray,
  v: chararray,
  paid: chararray,
  ts: {(ts: int)}
}

into the following JSON output:

{
  "id":   "abcdef123456",
  "v":    "some identifier",
  "paid": "another identifier",
  "ts":   [ 1,2,3,4,5,6 ]
}

I know how to generate JSON output, but I can't figure out how to convert the ts attribute in my Pig schema into an array of numeric values.

The number of items in the ts bag is not known in advance, but they all share the same schema (ts: int).

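Pig has no array data type. One option is to serialize the bag into a bracketed, comma-separated string with BagToString and CONCAT. Note that JsonStorage then writes ts as a JSON string rather than a true numeric array.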

Input:

1       1       100     {(1),(2),(3)}
2       2       200     {(4),(5)}
3       3       300     {(1),(2),(3),(4),(5),(6)}

Pig script:

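-- Load tab-delimited rows; ts is a bag of single-int tuples.
A = LOAD 'input' USING PigStorage() AS (id: chararray, v: chararray, paid: chararray, ts: {(ts: int)});
-- Join the bag's values with commas and wrap them in brackets, e.g. [1,2,3].
B = FOREACH A GENERATE id, v, paid, CONCAT('[', BagToString(ts, ','), ']') AS ts;
STORE B INTO 'output' USING JsonStorage();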

Output:

{"id":"1","v":"1","paid":"100","ts":"[1,2,3]"}
{"id":"2","v":"2","paid":"200","ts":"[4,5]"}
{"id":"3","v":"3","paid":"300","ts":"[1,2,3,4,5,6]"}
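
In the output above, ts is a quoted string such as "[1,2,3]", not a numeric array. If a real array is needed downstream, one possible approach is to decode that string after the Pig job finishes. The sketch below is only an illustration and assumes the output lines look exactly like the ones shown above; the function name parse_pig_json_line is hypothetical and not part of Pig.

```python
import json


def parse_pig_json_line(line: str) -> dict:
    """Parse one JsonStorage output line and decode the string-encoded ts field.

    Hypothetical helper: assumes ts was written as a bracketed string
    such as "[1,2,3]", as in the output shown above.
    """
    record = json.loads(line)
    ts = record.get("ts")
    # The bracketed string is itself valid JSON, so a second json.loads
    # turns it into a list of ints. Non-string values (e.g. null) are left as-is.
    if isinstance(ts, str):
        record["ts"] = json.loads(ts)
    return record


if __name__ == "__main__":
    sample_lines = [
        '{"id":"1","v":"1","paid":"100","ts":"[1,2,3]"}',
        '{"id":"2","v":"2","paid":"200","ts":"[4,5]"}',
    ]
    for sample in sample_lines:
        print(json.dumps(parse_pig_json_line(sample)))
    # prints:
    # {"id": "1", "v": "1", "paid": "100", "ts": [1, 2, 3]}
    # {"id": "2", "v": "2", "paid": "200", "ts": [4, 5]}
```

This keeps the Pig script unchanged and moves the string-to-array step to whatever consumes the files, which may or may not suit your pipeline.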