Flink on YARN: Amazon S3 wrongly used instead of HDFS
I followed Flink on YARN's setup documentation. But when I run ./bin/yarn-session.sh -n 2 -jm 1024 -tm 2048
while authenticated to Kerberos, I get the following error:
2016-06-16 17:46:47,760 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-06-16 17:46:48,518 INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: https://**host**:8190/ws/v1/timeline/
2016-06-16 17:46:48,814 INFO org.apache.flink.yarn.FlinkYarnClient - Using values:
2016-06-16 17:46:48,815 INFO org.apache.flink.yarn.FlinkYarnClient - TaskManager count = 2
2016-06-16 17:46:48,815 INFO org.apache.flink.yarn.FlinkYarnClient - JobManager memory = 1024
2016-06-16 17:46:48,815 INFO org.apache.flink.yarn.FlinkYarnClient - TaskManager memory = 2048
Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated
at java.util.ServiceLoader.fail(ServiceLoader.java:224)
at java.util.ServiceLoader.access$100(ServiceLoader.java:181)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:377)
at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2623)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2634)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
at org.apache.flink.yarn.FlinkYarnClientBase.deployInternal(FlinkYarnClientBase.java:531)
at org.apache.flink.yarn.FlinkYarnClientBase.run(FlinkYarnClientBase.java:342)
at org.apache.flink.yarn.FlinkYarnClientBase.run(FlinkYarnClientBase.java:339)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.flink.yarn.FlinkYarnClientBase.deploy(FlinkYarnClientBase.java:339)
at org.apache.flink.client.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:419)
at org.apache.flink.client.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:362)
Caused by: java.lang.NoClassDefFoundError: com/amazonaws/AmazonServiceException
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2532)
at java.lang.Class.getConstructor0(Class.java:2842)
at java.lang.Class.newInstance(Class.java:345)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
... 18 more
Caused by: java.lang.ClassNotFoundException: com.amazonaws.AmazonServiceException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 23 more
I set the following properties in my ./flink-1.0.3/conf/flink-conf.yaml:
fs.hdfs.hadoopconf: /etc/hadoop/conf/
fs.hdfs.hdfssite: /etc/hadoop/conf/hdfs-site.xml
How can I use HDFS instead of Amazon's S3?
Thanks.
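I think the problem is that Flink is not picking up your configuration files.
Can you remove the line starting with fs.hdfs.hdfssite from your configuration? It is not needed if fs.hdfs.hadoopconf is set.
Also, can you check whether fs.defaultFS in core-site.xml is set to something starting with hdfs://?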
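The check suggested above can be scripted. What follows is only a minimal sketch: it assumes core-site.xml uses the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children, and the helper names and the sample host `namenode.example.com` are hypothetical, not taken from the question.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of the named Hadoop <property>, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def default_fs_is_hdfs(xml_text):
    """True if fs.defaultFS is present and starts with hdfs://."""
    value = read_hadoop_property(xml_text, "fs.defaultFS")
    return value is not None and value.startswith("hdfs://")


# Hypothetical core-site.xml content, used only for illustration.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(default_fs_is_hdfs(SAMPLE_CORE_SITE))
```

To run it against a real cluster configuration, read the file from the directory given in fs.hdfs.hadoopconf (here /etc/hadoop/conf/) and pass its text to `default_fs_is_hdfs`.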
I actually had to set the environment variable HADOOP_CLASSPATH, as suggested in a deleted answer.
@rmetzger: fs.defaultFS is set.
Resulting command:
HADOOP_CLASSPATH=... ./bin/yarn-session.sh -n 2 -jm 1024 -tm 2048
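The stack trace ends in a ClassNotFoundException for com.amazonaws.AmazonServiceException, so one way to decide what belongs on HADOOP_CLASSPATH is to look for the jar that ships that class. This is a rough sketch, not the method used in this thread: the function names are hypothetical, it only scans plain directories of .jar files, and the directories to scan depend on your Hadoop installation.

```python
import os
import zipfile


def class_to_entry(class_name):
    """Convert a dotted class name to its path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_containing(class_name, directories):
    """Return the paths of all .jar files in the given directories that contain the class."""
    entry = class_to_entry(class_name)
    matches = []
    for directory in directories:
        if not os.path.isdir(directory):
            continue  # skip paths that do not exist on this machine
        for file_name in sorted(os.listdir(directory)):
            if not file_name.endswith(".jar"):
                continue
            jar_path = os.path.join(directory, file_name)
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    if entry in jar.namelist():
                        matches.append(jar_path)
            except zipfile.BadZipFile:
                continue  # not a readable jar; ignore it
    return matches
```

Calling `jars_containing("com.amazonaws.AmazonServiceException", [some_lib_dir])` with the library directories of your Hadoop distribution lists candidate jars; an empty list means the class is not in any jar directly inside those directories.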