Exception: Job Failed with status:3 when copying data from Oracle to HDFS through sqoop2
I am trying to use Sqoop2 to copy data from an Oracle 11g2 server to HDFS.
The link to Oracle seems to work, because it complains if I use invalid credentials. It is defined as follows:
link with id 14 and name OLink (Enabled: true, Created by xxx at 2/9/16 2:48 PM, Updated by xxx at 2/11/16 10:08 AM)
Using Connector generic-jdbc-connector with id 4
Link configuration
JDBC Driver Class: oracle.jdbc.driver.OracleDriver
JDBC Connection String: jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=5999)))(CONNECT_DATA=(SERVER=DEDICATED)(SID=abc)))
Username: xxx
Password:
JDBC Connection Properties:
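(The odd port number is there because I currently have to reach the database through a reverse tunnel. This will be fixed soon.)
The job is defined as follows: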
Job with id 2 and name Test OLink (Enabled: true, Created by xxx at 2/9/16 2:56 PM, Updated by xxx at 2/11/16 10:58 AM)
Using link id 14 and Connector id 4
From database configuration
Schema name: xxx
Table name: t_name
Table SQL statement:
Table column names:
Partition column name: CL_ID
Null value allowed for the partition column: false
Boundary query:
Throttling resources
Extractors: 3
Loaders: 3
ToJob configuration
Override null value:
Null value:
Output format: TEXT_FILE
Compression format: DEFAULT
Custom compression format:
Output directory: /tmp
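When I start the job (with verbose mode set to true), it lists all the column names and types of the remote table (which means the connection to Oracle works), but then the job fails: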
2016-02-11 10:44:42 UTC: BOOTING - Progress is not available
2016-02-11 10:44:59 UTC: RUNNING - 0.00 %
2016-02-11 10:45:09 UTC: RUNNING - 0.00 %
2016-02-11 10:45:19 UTC: RUNNING - 0.00 %
2016-02-11 10:45:29 UTC: FAILED
Exception: Job Failed with status:3
Stack trace: Task failed task_1450719316904_0239_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
The log shows the following:
2016-02-11 10:44:59,651 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1450719316904_0239: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:56320, vCores:37> knownNMs=5
2016-02-11 10:45:04,775 INFO [Socket Reader #1 for port 54706] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1450719316904_0239 (auth:SIMPLE)
2016-02-11 10:45:04,803 INFO [IPC Server handler 5 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1450719316904_0239_m_000002 asked for a task
2016-02-11 10:45:04,803 INFO [IPC Server handler 5 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1450719316904_0239_m_000002 given task: attempt_1450719316904_0239_m_000000_0
2016-02-11 10:45:06,494 INFO [IPC Server handler 5 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1450719316904_0239_m_000000_0 is : 0.0
2016-02-11 10:45:06,503 FATAL [IPC Server handler 6 on 54706] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1450719316904_0239_m_000000_0 - exited : org.apache.sqoop.common.SqoopException: MAPRED_EXEC_0017:Error occurs during extractor run
at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:99)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.sqoop.common.SqoopException: GENERIC_JDBC_CONNECTOR_0001:Unable to get a connection
at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.<init>(GenericJdbcExecutor.java:59)
at org.apache.sqoop.connector.jdbc.GenericJdbcExtractor.extract(GenericJdbcExtractor.java:50)
at org.apache.sqoop.connector.jdbc.GenericJdbcExtractor.extract(GenericJdbcExtractor.java:38)
at org.apache.sqoop.job.mr.SqoopMapper.run(SqoopMapper.java:95)
... 7 more
Caused by: java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:489)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:553)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:254)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:528)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:215)
at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.<init>(GenericJdbcExecutor.java:51)
... 10 more
Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:439)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:454)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:693)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:251)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1140)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:340)
... 17 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:149)
at oracle.net.nt.ConnOption.connect(ConnOption.java:133)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:405)
... 22 more
The software versions are:
- Sqoop 1.99.5-cdh5.4.8, source revision 5d69aef6c630a68db47724e4541e02983ade3d67, compiled by jenkins on Thu Oct 15 08:50:55 PDT 2015
- java version "1.7.0_67"
- Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
- Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
Any clues on how to solve this?
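When the Sqoop job starts, the machine where you run the Sqoop command connects to the Oracle machine to query the table and build the Sqoop job.
When the map-reduce phase starts running, every data node in the cluster that runs a map-reduce task needs to connect to the database. From these errors, it looks like your data nodes cannot connect to Oracle, although the machine you launch the job from can.
Can you confirm connectivity from all data nodes to Oracle?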
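One quick way to check this is to run a plain TCP test on each data node. Note that the link's connection string uses HOST=localhost and PORT=5999, so each map task will try port 5999 on its own node. If the reverse tunnel only listens on the machine where the link was created (an assumption, the question does not say), the other nodes would get exactly the "Connection refused" seen in the trace. The sketch below is only a connectivity probe, not part of Sqoop. The function name `can_connect` is made up for illustration, and the host and port are the values from the question.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Values taken from the JDBC connection string in the question.
    host, port = "localhost", 5999
    status = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{socket.gethostname()}: {host}:{port} is {status}")
```

Run it on every node that can host a map task. If some nodes report the port as not reachable, the link probably needs a host name and port that all nodes can reach, rather than localhost.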