Hive Activity failing with ADF using BYOC HDInsight cluster (Linux)

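I'm getting the error below when trying to run a Hive Activity in ADF. When I run the Hive query directly on the HDInsight cluster, it works fine, but it fails when run from the ADF Hive Activity. I've done a lot of trial and error, but the problem persists. Does anyone know what the issue might be?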

WARNING: Use "yarn jar" to launch YARN applications.
17/02/09 06:09:45 INFO impl.TimelineClientImpl: Timeline service address: http://headnodehost:8188/ws/v1/timeline/
17/02/09 06:09:46 INFO impl.TimelineClientImpl: Timeline service address: http://headnodehost:8188/ws/v1/timeline/
17/02/09 06:09:46 WARN ipc.Client: Failed to connect to server: hn0-xxxxx.xxxxx133hn2u3kb1xxx0vlmsre.jx.internal.cloudapp.net/10.0.0.17:8050: retries get failed due to exceeded maximum allowed retries number: 0
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:649)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
    at org.apache.hadoop.ipc.Client$Connection.access00(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1431)
    at org.apache.hadoop.ipc.Client.call(Client.java:1392)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy19.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:221)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy20.getNewApplication(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:220)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:228)
    at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:188)
    at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:231)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:153)
    at org.apache.hadoop.mapreduce.Job.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapred.JobClient.run(JobClient.java:575)
    at org.apache.hadoop.mapred.JobClient.run(JobClient.java:570)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
    at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:1014)
    at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:135)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
17/02/09 06:09:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
17/02/09 06:09:50 INFO mapred.FileInputFormat: Total input paths to process : 1
17/02/09 06:09:51 INFO mapreduce.JobSubmitter: number of splits:1
17/02/09 06:09:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1486536013226_0707
17/02/09 06:09:51 INFO mapreduce.JobSubmitter: Kind: mapreduce.job, Service: job_1486536013226_0706, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@7a399295)
17/02/09 06:09:53 INFO impl.YarnClientImpl: Submitted application application_1486536013226_0707
17/02/09 06:09:53 INFO mapreduce.Job: The url to track the job: http://hn1-xxxxx.xxxxx133hn2u3kb1xxx0vlmsre.jx.internal.cloudapp.net:8088/proxy/application_1486536013226_0707/
17/02/09 06:09:53 INFO mapreduce.Job: Running job: job_1486536013226_0707
17/02/09 06:10:11 INFO mapreduce.Job: Job job_1486536013226_0707 running in uber mode : false
17/02/09 06:10:11 INFO mapreduce.Job:  map 0% reduce 0%
17/02/09 06:10:31 INFO mapreduce.Job: Task Id : attempt_1486536013226_0707_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

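I got it working. It was actually the .NET Activity in the Data Factory that was failing; it used the HDInsight cluster as its linked service (I'm using a Linux-based cluster). I changed the linked service to an Azure Batch account linked service, and the pipeline now works as expected.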
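
For anyone wanting to see roughly what that change looks like, here is a minimal sketch that builds the two JSON fragments involved: an Azure Batch linked service and a .NET activity that points at it. It assumes the ADF v1-style JSON layout (type "AzureBatch" with accountName, accessKey, poolName and a storage linkedServiceName). Every name and placeholder value here is hypothetical, not taken from the original pipeline, and the activity stub shows only the fields relevant to the fix.

```python
import json


def batch_linked_service(name, account, pool, storage_ls):
    """Build an ADF v1-style Azure Batch linked service definition (assumed layout)."""
    return {
        "name": name,
        "properties": {
            "type": "AzureBatch",
            "typeProperties": {
                "accountName": account,
                # Placeholder only; never commit a real key.
                "accessKey": "<batch-account-key>",
                "poolName": pool,
                # Storage linked service associated with the Batch account.
                "linkedServiceName": storage_ls,
            },
        },
    }


def dotnet_activity_stub(name, compute_ls):
    """Only the relevant fields: the activity type and the compute it runs on."""
    return {
        "name": name,
        "type": "DotNetActivity",
        "linkedServiceName": compute_ls,
    }


# Hypothetical names, for illustration only.
linked_service = batch_linked_service(
    "AzureBatchLinkedService",
    "<batch-account-name>",
    "<batch-pool-name>",
    "<storage-linked-service-name>",
)
# The activity references the Batch linked service instead of the HDInsight one.
activity = dotnet_activity_stub("MyDotNetActivity", linked_service["name"])

print(json.dumps(linked_service, indent=2))
print(json.dumps(activity, indent=2))
```

The key point is the activity's linkedServiceName: it names the Azure Batch linked service rather than the Linux-based HDInsight one.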