Error from Json Loader in Pig

I am getting the following error when running my JSON script. Please let me know how to write a JSON loader script in Pig.

Script:

x = LOAD 'hdfs://user/spanda20/pig/phone.dat' USING JsonLoader('id:chararray, phone:(home:{(num:chararray, city:chararray)})');

Data set: { "id": "12345", "phone":{ "home":[ { "zip": "23060", "city": "henrico" }, { "zip": "08902", "city": "northbrunswick" } ] } }

2015-03-18 14:24:10,917 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-03-18 14:24:10,918 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1426618756946_0028 has failed! Stop running all dependent jobs
2015-03-18 14:24:10,918 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-03-18 14:24:10,977 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backed error: AttemptID:attempt_1426618756946_0028_m_000000_3 Info:Error: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream@43c59008; line: 1, column: 0])
 at [Source: java.io.ByteArrayInputStream@43c59008; line: 1, column: 3]
        at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
        at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
        at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF(JsonParserMinimalBase.java:318)
        at org.codehaus.jackson.impl.JsonParserBase._handleEOF(JsonParserBase.java:354)
        at org.codehaus.jackson.impl.Utf8StreamParser._skipWSOrEnd(Utf8StreamParser.java:1841)
        at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:275)
        at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:180)
        at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:164)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

2015-03-18 14:24:10,977 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2015-03-18 14:24:10,978 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
2.5.0-cdh5.2.0  0.12.0-cdh5.2.0 spanda20        2015-03-18 14:23:02     2015-03-18 14:24:10     UNKNOWN

Regards, Sanjeeb

Sanjeeb - use this JSON:

{"id":"12345","phone":{"home":[{"zip":"23060","city":"henrico"},{"zip":"08902","city":"northbrunswick"}]}}

The output should be: (12345,({(23060,henrico),(08902,northbrunswick)}))

PS: Pig generally doesn't like "human readable" JSON. Get rid of the spaces and/or indentation and you should be fine.
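
If the file is pretty-printed, one way to strip the whitespace before loading is a small script outside Pig. As far as I know, JsonLoader reads its input line by line, so each record needs to be a complete JSON object on a single line. Below is a minimal sketch, assuming Python 3 and a file small enough to read into memory; `compact_json_records` is a hypothetical helper name, not part of Pig or Hadoop.

```python
import json


def compact_json_records(text):
    """Re-emit every JSON value found in `text` as compact JSON, one per line.

    Hypothetical helper (not part of Pig): it accepts pretty-printed or
    space-padded records and removes all optional whitespace.
    """
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        # separators=(",", ":") drops the spaces json.dumps adds by default
        records.append(json.dumps(obj, separators=(",", ":")))
    return "\n".join(records)


if __name__ == "__main__":
    # The data set from the question, spread over several lines
    pretty = """
    { "id": "12345",
      "phone": { "home": [ { "zip": "23060", "city": "henrico" },
                           { "zip": "08902", "city": "northbrunswick" } ] } }
    """
    print(compact_json_records(pretty))
```

Write the result to a new file, put it on HDFS, and point the LOAD statement at that file instead of the original.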