TypeCast Exception while running Pig Script
Below is my Pig script. It is pretty simple: load some data, filter the data by a column, generate a schema with data types, and store the data into a Hive table.
emp = load '/root/emp.nulls' using PigStorage(',');
filt = filter emp by $0 is not null;
f = foreach filt generate $0 as id:int, $1 as bdate:chararray, $2 as fname:chararray, $3 as lname:chararray, $4 as gender:chararray, $5 as hdate:chararray;
store f into 'emp_null' using org.apache.hive.hcatalog.pig.HCatStorer();
When I execute it, the following error is thrown:
2017-09-15 11:21:04,523 [Thread-12] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1554819907_0001
java.lang.Exception: java.io.IOException: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Integer
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.IOException: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Integer
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
Can someone please help me?
Edit: If I generate the schema during the load itself, it works fine.
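For reference, here is a minimal sketch of what declaring the schema in the LOAD could look like. The column names and types are taken from the FOREACH above, and I am assuming the filter is on the first column (id):

-- assumed sketch: declare the schema in the LOAD instead of casting in a FOREACH
emp = load '/root/emp.nulls' using PigStorage(',')
      as (id:int, bdate:chararray, fname:chararray, lname:chararray,
          gender:chararray, hdate:chararray);
filt = filter emp by id is not null;
store filt into 'emp_null' using org.apache.hive.hcatalog.pig.HCatStorer();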
When you use the syntax $0 as id:int, you are not casting the field; you are only storing the value of $0 under a new field name. The correct way is to prefix the field with the data type. This may have been fixed in newer versions of Pig; there is an issue under discussion to fix it.
f = foreach filt generate (int)$0 as id,
    (chararray)$1 as bdate,
    (chararray)$2 as fname,
    (chararray)$3 as lname,
    (chararray)$4 as gender,
    (chararray)$5 as hdate;
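If you want to confirm that the casts actually changed the field types before the STORE, the standard Pig diagnostic statement DESCRIBE can be run on the relation (the output shown is only the expected shape, not a captured run):

describe f;
-- expected output, roughly:
-- f: {id: int,bdate: chararray,fname: chararray,lname: chararray,gender: chararray,hdate: chararray}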