[AWS Glue]: org.apache.thrift.TApplicationException: Internal error processing createInterpreter

Question

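I am trying to connect to an AWS Glue development endpoint using zeppelin-0.8.0, but the error below occurs when I execute a cell, and there is no helpful information to indicate what the problem might be. Any leads are appreciated.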

172318_1906434757 is finished, status: ERROR, exception: java.lang.RuntimeException: org.apache.thrift.TApplicationException: Internal error processing createInterpreter, result: %text org.apache.thrift.TApplicationException: Internal error processing createInterpreter
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_createInterpreter(RemoteInterpreterService.java:209)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.createInterpreter(RemoteInterpreterService.java:192)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.call(RemoteInterpreter.java:169)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.call(RemoteInterpreter.java:165)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:165)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:132)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:299)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:407)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

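Update: From the answer below, it looks like 0.8.0 does not work with Glue yet. I also had problems running 0.7.x: under Java 8 there were a bunch of MethodNotFoundException errors for the javax.ws.rs package (updating alternatives to Java 7 did not help either). When run in a JDK 7 docker container, however, it works fine and I am able to connect to my Dev endpoint. If anyone can clarify the root cause, it would be much appreciated.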

Answer 1

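Could you please provide more information, such as where the Zeppelin instance is located? Is it running on your desktop/laptop, or as an AWS Notebook server? Have you also tried connecting with Zeppelin 0.7.3, as described in this AWS forum link: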

https://forums.aws.amazon.com/thread.jspa?threadID=285128

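According to the link above, dated July 2018, AWS Glue does not yet support Zeppelin 0.8. I am assuming that all other configuration and environment settings have been done as required. If you can provide more information, I can help further.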

Update: In any case, see here and setting up zeppelin on windows for help with setting up a local development environment and a Zeppelin notebook.

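After setting up the Zeppelin notebook, establish an SSH connection (using the AWS Glue DevEndpoint URL) so that you can access the data catalog, crawlers, etc., as well as the S3 bucket where your data resides. You can then create Python scripts in the Zeppelin notebook and run them from Zeppelin.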
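As a rough sketch of the SSH step, the small Python helper below only assembles the tunnel command so you can inspect it before running it. The forwarding target `169.254.76.1:9007`, the local port `9007`, and the `glue` user are taken from the AWS Glue documentation as I remember it, not from this thread, so please verify them against the current docs. The key path and endpoint DNS name are placeholders, and `build_tunnel_command` is just a name I made up.

```python
def build_tunnel_command(key_path, endpoint_dns, local_port=9007,
                         remote_host="169.254.76.1", remote_port=9007):
    """Return the ssh argument list for a local port forward to a dev endpoint.

    The default host and ports are assumptions; check the AWS Glue docs.
    """
    forward = "{}:{}:{}".format(local_port, remote_host, remote_port)
    # -N: run no remote command, -T: no pseudo-terminal, -L: local port forward
    return ["ssh", "-i", key_path, "-NTL", forward, "glue@" + endpoint_dns]


if __name__ == "__main__":
    # Placeholder values; substitute your own key file and endpoint address.
    cmd = build_tunnel_command("path/to/private-key.pem", "your-dev-endpoint-public-dns")
    print(" ".join(cmd))
```

Run the printed command in a separate terminal and leave it open while you work in Zeppelin, with the interpreter pointed at the forwarded local port.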

You can use the development instance provided by Glue, but you may incur additional costs for it (EC2 instance charges).

Environment settings (updated based on comments):

JAVA_HOME=E:\Java7\jre7
Path=E:\Python27;E:\Python27\Lib;E:\Python27\Scripts;
PYTHONPATH=E:\spark-2.1.0-bin-hadoop2.7\python;E:\spark-2.1.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip;E:\spark-2.1.0-bin-hadoop2.7\python\lib\pyspark.zip
SPARK_HOME=E:\spark-2.1.0-bin-hadoop2.7

Change the drive names/folders accordingly. Let me know if you need any help.
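If it helps, here is a minimal sketch for sanity-checking those settings before starting Zeppelin. It only verifies that JAVA_HOME, SPARK_HOME and PYTHONPATH are set and that each path they name exists on disk. The function name `check_environment` and the exact list of variables checked are my own choices for illustration, not something Zeppelin or Glue requires.

```python
import os

# Variables from the settings above that should point at existing paths.
REQUIRED_VARS = ("JAVA_HOME", "SPARK_HOME", "PYTHONPATH")


def check_environment(env=None, path_exists=os.path.exists):
    """Return a list of problems found; an empty list means none were found."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append("{} is not set".format(name))
            continue
        # PYTHONPATH may hold several entries; the other variables hold one path.
        entries = value.split(os.pathsep) if name == "PYTHONPATH" else [value]
        for entry in entries:
            if entry and not path_exists(entry):
                problems.append("{} entry does not exist: {}".format(name, entry))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

It prints nothing when all three variables are set and their paths exist. It does not check the Java version itself, so you still need to confirm that JAVA_HOME points at a Java 7 install.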

apache-zeppelin

aws-glue