胶水作业失败,随机出现“找不到 JohnSnowLabs spark-nlp 依赖项”错误

Glue job failed with `JohnSnowLabs spark-nlp dependency not found` error randomly

我正在使用 AWS Glue 运行 一些 pyspark python 代码,有时成功但有时失败并出现依赖错误:Resource Setup Error: Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: JohnSnowLabs#spark-nlp;2.5.4: not found],这是错误日志:

:: problems summary ::
:::: WARNINGS
        module not found: JohnSnowLabs#spark-nlp;2.5.4

    ==== local-m2-cache: tried

      file:/root/.m2/repository/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      file:/root/.m2/repository/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.jar

    ==== local-ivy-cache: tried

      /root/.ivy2/local/JohnSnowLabs/spark-nlp/2.5.4/ivys/ivy.xml

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      /root/.ivy2/local/JohnSnowLabs/spark-nlp/2.5.4/jars/spark-nlp.jar

    ==== central: tried

      https://repo1.maven.org/maven2/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      https://repo1.maven.org/maven2/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.jar

    ==== spark-packages: tried

      https://dl.bintray.com/spark-packages/maven/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom

      -- artifact JohnSnowLabs#spark-nlp;2.5.4!spark-nlp.jar:

      https://dl.bintray.com/spark-packages/maven/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.jar

        ::::::::::::::::::::::::::::::::::::::::::::::

        ::          UNRESOLVED DEPENDENCIES         ::

        ::::::::::::::::::::::::::::::::::::::::::::::

        :: JohnSnowLabs#spark-nlp;2.5.4: not found

        ::::::::::::::::::::::::::::::::::::::::::::::



:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: JohnSnowLabs#spark-nlp;2.5.4: not found]

从成功 运行 的日志中,我可以看到 glue 能够从 https://dl.bintray.com/spark-packages/maven/JohnSnowLabs/spark-nlp/2.5.4/spark-nlp-2.5.4.pom 下载依赖项,失败的作业也尝试从那里下载依赖项,但失败了。

这个问题上周似乎自行解决了,但最近几天又出现了,而且至今没有自行解决。有没有人见过这个奇怪的问题?谢谢

spark-packages 于 2021 年 5 月 1 日移动。在我的 scala 项目中,我不得不像这样添加一个不同的解析器。它在 java.

中必须相似
resolvers in ThisBuild ++= Seq(
  "SparkPackages" at "https://repos.spark-packages.org"
 ## remove -> "MVNRepository"  at "https://dl.bintray.com/spark-packages/maven"
)

你自己去看看,那个包不在你要找的那个解析器上。我的也不是。

https://dl.bintray.com/spark-packages/