Sqoop not able to map Oracle column types to Hive columns
I am sqooping a bunch of Oracle 10g tables into Hive. My cluster runs Hortonworks HDP 2.3.
One of the Oracle tables, which has more than 100 columns, has a ROW_ID column that turns out to be of type oracle.sql.ROWID.
Sqoop throws this error:
2016-02-04 16:09:19,746 ERROR - [main:] ~ Cannot resolve SQL type -8 (ClassWriter:645)
2016-02-04 16:09:19,747 ERROR - [main:] ~ Cannot resolve SQL type -8 (ClassWriter:645)
2016-02-04 16:09:19,747 ERROR - [main:] ~ No Java type for SQL type -8 for column ROW_ID (ClassWriter:718)
2016-02-04 16:09:19,748 ERROR - [main:] ~ No Java type for SQL type -8 for column ROW_ID (ClassWriter:718)
2016-02-04 16:09:19,749 ERROR - [main:] ~ No Java type for SQL type -8 for column ROW_ID (ClassWriter:798)
2016-02-04 16:09:19,756 ERROR - [main:] ~ Got exception running Sqoop: java.lang.NullPointerException (Sqoop:181)
java.lang.NullPointerException
at org.apache.sqoop.orm.ClassWriter.parseNullVal(ClassWriter.java:1377)
at org.apache.sqoop.orm.ClassWriter.parseColumn(ClassWriter.java:1402)
at org.apache.sqoop.orm.ClassWriter.myGenerateParser(ClassWriter.java:1528)
at org.apache.sqoop.orm.ClassWriter.generateParser(ClassWriter.java:1491)
at org.apache.sqoop.orm.ClassWriter.generateClassForColumns(ClassWriter.java:1920)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1736)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Is there a way to map this Oracle column type to any Hive column type, or to just treat it as a string?
OK, I am not sure this is the right way to do it, but anyway.
First I tried --map-column-hive ROW_ID=String, and it still failed with the same error.
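For context, the -8 in the log looks like the JDBC type code for ROWID: java.sql.Types.ROWID is defined as -8. That would explain why the failure sits in Sqoop's code generation (ClassWriter) and why a Hive-side mapping alone does not help, although that reading comes from the stack trace, not from Sqoop's source. A minimal standalone check, independent of Sqoop (the class name SqlTypeCode and the method nameOf are made up for illustration):

```java
import java.sql.JDBCType;
import java.sql.Types;

public class SqlTypeCode {
    // Hypothetical helper: returns the standard JDBC name for a java.sql.Types code.
    // JDBCType.valueOf(int) throws IllegalArgumentException for unknown codes.
    static String nameOf(int code) {
        return JDBCType.valueOf(code).getName();
    }

    public static void main(String[] args) {
        // The code reported in the Sqoop log.
        System.out.println(nameOf(-8));   // prints "ROWID"
        // The constant defined by the JDBC API.
        System.out.println(Types.ROWID);  // prints "-8"
    }
}
```

This only decodes the type number; it says nothing about how a particular Sqoop version handles that type.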
Then I tried --map-column-java ROW_ID=String. It happily pulled the data from Oracle, but then refused to load it into Hive:
2016-02-04 16:55:17,655 ERROR - [main:] ~ Encountered IOException running import job: java.io.IOException: Hive does not support the SQL type for column ROW_ID
at org.apache.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:181)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:188)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
(ImportTool:613)
Well, the combination of the two finally worked:
sqoop import .... --map-column-java ROW_ID=String --map-column-hive ROW_ID=String
Also, --split-by did not like ROW_ID either, so I had to use a different column for that.
Cheers.