java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider
java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider
我有以下工具
- Spark 2.4.3
- Scala 2.11.12
- OS : Windows 10
这是我导入库的 sbt 代码
libraryDependencies ++= Seq(
"javassist" % "javassist" % "3.12.1.GA" ,
"com.typesafe" % "config" % "1.3.4",
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion ,
"com.datastax.spark" %% "spark-cassandra-connector" % "2.4.1",
"com.twitter" % "jsr166e" % "1.1.0",
"com.amazonaws" % "aws-java-sdk" % "1.11.592"
"org.apache.hadoop" % "hadoop-aws" % "2.7.3",
"org.apache.spark" %% "spark-catalyst" % sparkVersion
)
我的scala代码如下
val rdd = sparkSession.sparkContext.parallelize(
Seq(
("first", Array(2.0, 1.0, 2.1, 5.4)),
("test", Array(1.5, 0.5, 0.9, 3.7)),
("choose", Array(8.0, 2.9, 9.1, 2.5))
)
)
val dfWithoutSchema = sparkSession.createDataFrame(rdd)
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", "XXXXXX")
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", "XXXXXXX")
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
dfWithoutSchema.write
.mode("overwrite")
.parquet("s3a://test-daily-extracts/sample2")
当我通过 SBT 编译时,我没有收到任何错误。但是当我 运行 代码时,我得到的错误是
java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider
我的堆栈跟踪如下
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access0(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:113)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:88)
at org.apache.parquet.hadoop.ParquetOutputCommitter.<init>(ParquetOutputCommitter.java:43)
at org.apache.parquet.hadoop.ParquetOutputFormat.getOutputCommitter(ParquetOutputFormat.java:442)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupCommitter(HadoopMapReduceCommitProtocol.scala:100)
at org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol.setupCommitter(SQLHadoopMapReduceCommitProtocol.scala:40)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupTask(HadoopMapReduceCommitProtocol.scala:217)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:229)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:170)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.amazonaws.auth.AWSCredentialsProvider
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 30 more
在此先感谢您的帮助。
编辑:2019-07-17
我将我的 SBT 代码更新到下面。
libraryDependencies ++= Seq(
"javassist" % "javassist" % "3.12.1.GA" ,
"com.typesafe" % "config" % "1.3.4",
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion ,
"com.datastax.spark" %% "spark-cassandra-connector" % "2.4.1",
"com.twitter" % "jsr166e" % "1.1.0",
"com.amazonaws" % "aws-java-sdk" % "1.7.4",
"net.java.dev.jets3t" % "jets3t" % "0.9.4",
"org.apache.hadoop" % "hadoop-aws" % "2.7.3",
"org.apache.hadoop" % "hadoop-client" % "2.7.3",
"org.apache.hadoop" % "hadoop-hdfs" % "2.7.3",
"org.apache.spark" %% "spark-catalyst" % sparkVersion
)
在驱动程序中添加了以下代码。
val urls = urlsinclasspath(getClass.getClassLoader).foreach(println)
def urlsinclasspath(cl: ClassLoader): Array[java.net.URL] = cl match {
case null => Array()
case u: java.net.URLClassLoader => u.getURLs() ++ urlsinclasspath(cl.getParent)
case _ => urlsinclasspath(cl.getParent)
}
我现在可以看到 aws-java-sdk-1.7.4 正在 运行 加载,并且其中包含 AWSCredentialsProvider class。但我仍然收到以下错误。
我的完整攻略如下
19/07/17 17:02:25 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, XX.XX.XX.XX, executor 0): java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access0(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:113)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:88)
at org.apache.parquet.hadoop.ParquetOutputCommitter.<init>(ParquetOutputCommitter.java:43)
at org.apache.parquet.hadoop.ParquetOutputFormat.getOutputCommitter(ParquetOutputFormat.java:442)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupCommitter(HadoopMapReduceCommitProtocol.scala:100)
at org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol.setupCommitter(SQLHadoopMapReduceCommitProtocol.scala:40)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupTask(HadoopMapReduceCommitProtocol.scala:217)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:229)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:170)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.amazonaws.auth.AWSCredentialsProvider
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 30 more
此依赖项有 com/amazonaws/auth/AWSCredentialsProvider
class 是您所缺少的。
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.11.592"
我建议您使用 uber jar,即使用 SBT 将所有具有依赖项的 jar 打包为一个 jar,这样就不会遗漏或遗漏任何内容。
如何制作 uber jar
同时将此代码添加到您的 驱动程序 ...了解您的 class 路径中有哪些 jar。
val urls = urlsinclasspath(getClass.getClassLoader).foreach(println)
def urlsinclasspath(cl: ClassLoader): Array[java.net.URL] = cl match {
case null => Array()
case u: java.net.URLClassLoader => u.getURLs() ++ urlsinclasspath(cl.getParent)
case _ => urlsinclasspath(cl.getParent)
}
经过对 google 的大量研究,我发现我的代码没有错误,即使所有的 jar 都在加载,我的 spark 安装丢失了 hadoop.dll 在 C:\winutils\bin & C:\Windows\System32 我从这个 link https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin 并放在两个目录中。它工作得很好。我不确定错误为何具有误导性。
感谢大家的帮助。
我有以下工具
- Spark 2.4.3
- Scala 2.11.12
- OS : Windows 10
这是我导入库的 sbt 代码
libraryDependencies ++= Seq(
"javassist" % "javassist" % "3.12.1.GA" ,
"com.typesafe" % "config" % "1.3.4",
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion ,
"com.datastax.spark" %% "spark-cassandra-connector" % "2.4.1",
"com.twitter" % "jsr166e" % "1.1.0",
"com.amazonaws" % "aws-java-sdk" % "1.11.592"
"org.apache.hadoop" % "hadoop-aws" % "2.7.3",
"org.apache.spark" %% "spark-catalyst" % sparkVersion
)
我的scala代码如下
val rdd = sparkSession.sparkContext.parallelize(
Seq(
("first", Array(2.0, 1.0, 2.1, 5.4)),
("test", Array(1.5, 0.5, 0.9, 3.7)),
("choose", Array(8.0, 2.9, 9.1, 2.5))
)
)
val dfWithoutSchema = sparkSession.createDataFrame(rdd)
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", "XXXXXX")
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", "XXXXXXX")
sparkSession.sparkContext.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
dfWithoutSchema.write
.mode("overwrite")
.parquet("s3a://test-daily-extracts/sample2")
当我通过 SBT 编译时,我没有收到任何错误。但是当我 运行 代码时,我得到的错误是
java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider
我的堆栈跟踪如下
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access0(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:113)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:88)
at org.apache.parquet.hadoop.ParquetOutputCommitter.<init>(ParquetOutputCommitter.java:43)
at org.apache.parquet.hadoop.ParquetOutputFormat.getOutputCommitter(ParquetOutputFormat.java:442)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupCommitter(HadoopMapReduceCommitProtocol.scala:100)
at org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol.setupCommitter(SQLHadoopMapReduceCommitProtocol.scala:40)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupTask(HadoopMapReduceCommitProtocol.scala:217)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:229)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:170)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.amazonaws.auth.AWSCredentialsProvider
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 30 more
在此先感谢您的帮助。
编辑:2019-07-17
我将我的 SBT 代码更新到下面。
libraryDependencies ++= Seq(
"javassist" % "javassist" % "3.12.1.GA" ,
"com.typesafe" % "config" % "1.3.4",
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion ,
"com.datastax.spark" %% "spark-cassandra-connector" % "2.4.1",
"com.twitter" % "jsr166e" % "1.1.0",
"com.amazonaws" % "aws-java-sdk" % "1.7.4",
"net.java.dev.jets3t" % "jets3t" % "0.9.4",
"org.apache.hadoop" % "hadoop-aws" % "2.7.3",
"org.apache.hadoop" % "hadoop-client" % "2.7.3",
"org.apache.hadoop" % "hadoop-hdfs" % "2.7.3",
"org.apache.spark" %% "spark-catalyst" % sparkVersion
)
在驱动程序中添加了以下代码。
val urls = urlsinclasspath(getClass.getClassLoader).foreach(println)
def urlsinclasspath(cl: ClassLoader): Array[java.net.URL] = cl match {
case null => Array()
case u: java.net.URLClassLoader => u.getURLs() ++ urlsinclasspath(cl.getParent)
case _ => urlsinclasspath(cl.getParent)
}
我现在可以看到 aws-java-sdk-1.7.4 正在 运行 加载,并且其中包含 AWSCredentialsProvider class。但我仍然收到以下错误。 我的完整攻略如下
19/07/17 17:02:25 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, XX.XX.XX.XX, executor 0): java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentialsProvider
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access0(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:113)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:88)
at org.apache.parquet.hadoop.ParquetOutputCommitter.<init>(ParquetOutputCommitter.java:43)
at org.apache.parquet.hadoop.ParquetOutputFormat.getOutputCommitter(ParquetOutputFormat.java:442)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupCommitter(HadoopMapReduceCommitProtocol.scala:100)
at org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol.setupCommitter(SQLHadoopMapReduceCommitProtocol.scala:40)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupTask(HadoopMapReduceCommitProtocol.scala:217)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:229)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:170)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write.apply(FileFormatWriter.scala:169)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.amazonaws.auth.AWSCredentialsProvider
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 30 more
此依赖项有 com/amazonaws/auth/AWSCredentialsProvider
class 是您所缺少的。
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.11.592"
我建议您使用 uber jar,即使用 SBT 将所有具有依赖项的 jar 打包为一个 jar,这样就不会遗漏或遗漏任何内容。
如何制作 uber jar
同时将此代码添加到您的 驱动程序 ...了解您的 class 路径中有哪些 jar。
val urls = urlsinclasspath(getClass.getClassLoader).foreach(println)
def urlsinclasspath(cl: ClassLoader): Array[java.net.URL] = cl match {
case null => Array()
case u: java.net.URLClassLoader => u.getURLs() ++ urlsinclasspath(cl.getParent)
case _ => urlsinclasspath(cl.getParent)
}
经过对 google 的大量研究,我发现我的代码没有错误,即使所有的 jar 都在加载,我的 spark 安装丢失了 hadoop.dll 在 C:\winutils\bin & C:\Windows\System32 我从这个 link https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin 并放在两个目录中。它工作得很好。我不确定错误为何具有误导性。
感谢大家的帮助。