Sqoop - Import max value query fails when order by and limit 1 are used
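I have a simple Sqoop query that I use to import the maximum value of a table's ID and store it in HDFS. Storing it in HDFS is a client requirement, so I have to do it this way for several reasons.
To get the maximum value I use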
sqoop import \
--connect jdbc:mysql://abc.com/sqoopemp \
--username root \
--password root \
--e 'select max(id) from emp WHERE $CONDITIONS' \
--target-dir sqooplastmax \
--m 1 \
--driver com.mysql.jdbc.Driver
The query above gives me the answer I need, but for performance reasons I am considering using the following instead
sqoop import \
--connect jdbc:mysql://abc.com/sqoopemp \
--username root \
--password root \
--query 'select id from emp oder by id limit 1 WHERE $CONDITIONS' \
--target-dir sqooplastmax1 \
--m 1 \
--driver com.mysql.jdbc.Driver
This query gives me an error; the error is below
Warning: /usr/hdp/2.4.0.0-169/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/06/05 15:50:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.0.0-169
16/06/05 15:50:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/06/05 15:50:06 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
16/06/05 15:50:06 INFO manager.SqlManager: Using default fetchSize of 1000
16/06/05 15:50:06 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/06/05 15:50:06 INFO manager.SqlManager: Executing SQL statement: select id from emp order by id desc limit 1 WHERE (1 = 0)
16/06/05 15:50:06 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE (1 = 0)' at line 1
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE (1 = 0)' at line 1
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2283)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:758)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:767)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:270)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForQuery(SqlManager.java:234)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:304)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1845)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
16/06/05 15:50:06 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1651)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
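The problem is obviously related to WHERE $CONDITIONS, but I can't see what I am missing. The first free-form query works, but when I use it with order by and limit, it does not. Any help would be appreciated.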
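The clauses in your query appear to be in the wrong order (and there is a typo, oder instead of order):
select id from emp oder by id limit 1 WHERE $CONDITIONS
It should read:
select id from emp WHERE $CONDITIONS order by id limit 1
Note that to return the maximum id, as max(id) does, the sort has to be descending (order by id desc), which is what the statement in your log already uses.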
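As a minimal sketch of how the corrected clause order might look in the full command, the second import from the question could be rewritten as below. The connection string, user, and target directory are copied from the question. The `desc` keyword is an assumption so that the single row returned is the maximum id, `-P` replaces `--password root` only because the log's own warning suggests it, and the `import_max_id` wrapper is a hypothetical name used so the command can be defined without being run.

```shell
# Corrected free-form query: WHERE $CONDITIONS comes before ORDER BY and LIMIT.
# Single quotes stop the shell from expanding $CONDITIONS; Sqoop substitutes it.
# "desc" is an assumption so that the one row returned is the maximum id.
QUERY='select id from emp WHERE $CONDITIONS order by id desc limit 1'

# Hypothetical wrapper: defines the import without running it.
import_max_id() {
  sqoop import \
    --connect jdbc:mysql://abc.com/sqoopemp \
    --username root \
    -P \
    --query "$QUERY" \
    --target-dir sqooplastmax1 \
    -m 1 \
    --driver com.mysql.jdbc.Driver
}
```

Calling `import_max_id` runs the import and prompts for the password on the console.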
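Also, if $CONDITIONS is set externally, this looks unsafe: anyone could insert arbitrary code into $CONDITIONS through SQL injection. (In the log above, Sqoop itself fills in the placeholder, as (1 = 0) during code generation.)
The best way to handle SQL injection is to split $CONDITIONS into two parts:
1) the column name
2) the value
If Sqoop does not allow a parameterized query such as:
select id from emp WHERE some_column=:columnValue order by id limit 1
then there are two directions to choose from:
A) add validation code before the sqoop call
or
B) create a stored procedure in MySQL that checks the validity of the query before executing it.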
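Option A could be sketched roughly as follows, assuming the column name and value arrive as two separate shell arguments. The helper names (`is_safe_identifier`, `is_integer`, `build_query`) and the `dept_id` column used in the usage note are hypothetical and not part of Sqoop. The sketch only accepts plain identifiers and non-negative integers, which is stricter than a real schema may need.

```shell
# Hypothetical helpers; the names are illustrative only and not part of Sqoop.

# Accept only plain identifiers: letters, digits, underscore, no leading digit.
is_safe_identifier() {
  case "$1" in
    ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Accept only non-negative integers.
is_integer() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Print the free-form query, or fail without printing it if an input is rejected.
build_query() {
  is_safe_identifier "$1" || { echo "rejected column name: $1" >&2; return 1; }
  is_integer "$2" || { echo "rejected value: $2" >&2; return 1; }
  # The single-quoted format keeps $CONDITIONS literal for Sqoop to fill in.
  printf 'select id from emp WHERE %s = %s AND $CONDITIONS order by id limit 1\n' "$1" "$2"
}
```

A caller would then run something like `QUERY=$(build_query dept_id 10)` and pass `"$QUERY"` to `--query` only when that assignment succeeds, so a rejected input never reaches the database.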