Error from Json Loader in Pig
I am getting the below error while writing the JSON script. Please let me know how to write a JsonLoader script in Pig.
Script:
x = load 'hdfs://user/spanda20/pig/phone.dat' using JsonLoader('id:chararray, phone:(home:{(num:chararray, city:chararray)})');
Dataset:
{
"id": "12345",
"phone":{
"home":[
{
"zip": "23060",
"city": "henrico"
},
{
"zip": "08902",
"city": "northbrunswick"
}
]
}
}
2015-03-18 14:24:10,917 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-03-18 14:24:10,918 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1426618756946_0028 has failed! Stop running all dependent jobs
2015-03-18 14:24:10,918 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-03-18 14:24:10,977 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backed error: AttemptID:attempt_1426618756946_0028_m_000000_3 Info:Error: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.ByteArrayInputStream@43c59008; line: 1, column: 0])
at [Source: java.io.ByteArrayInputStream@43c59008; line: 1, column: 3]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF(JsonParserMinimalBase.java:318)
at org.codehaus.jackson.impl.JsonParserBase._handleEOF(JsonParserBase.java:354)
at org.codehaus.jackson.impl.Utf8StreamParser._skipWSOrEnd(Utf8StreamParser.java:1841)
at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:275)
at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:180)
at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:164)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2015-03-18 14:24:10,977 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2015-03-18 14:24:10,978 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.5.0-cdh5.2.0 0.12.0-cdh5.2.0 spanda20 2015-03-18 14:23:02 2015-03-18 14:24:10 UNKNOWN
Regards,
Sanjeeb
Sanjeeb - use this JSON:
{"id":"12345","phone":{"home":[{"zip":"23060","city":"henrico"},{"zip":"08902","city":"northbrunswick"}]}}
The output should be:
(12345,({(23060,henrico),(08902,northbrunswick)}))
PS: Pig generally does not like "human readable" JSON. Get rid of the whitespace and/or indentation and you are good.
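Below is a minimal sketch (not from the original thread) of one way to compact a pretty-printed file into the one-record-per-line form that Pig's builtin JsonLoader expects. The local file names phone_pretty.json and phone.dat are hypothetical; adjust them to your setup and copy the result back to HDFS before running the load statement above.

import json

# Rewrite pretty-printed JSON records as one compact JSON object per line.
decoder = json.JSONDecoder()
with open("phone_pretty.json") as src, open("phone.dat", "w") as dst:  # hypothetical file names
    text = src.read()
    pos = 0
    while pos < len(text):
        # Skip whitespace between consecutive records.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        # separators=(",", ":") drops all spaces from the serialized output.
        dst.write(json.dumps(obj, separators=(",", ":")) + "\n")

After copying phone.dat back to HDFS (for example with hdfs dfs -put), the load statement from the question should produce the tuple shown above.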