python-archives Not A Directory Exception 而 运行 Flink Job - PyFlink
python-archives Not A Directory Exception While Running Flink Job - PyFlink
我在 运行 一个 pyflink 应用程序时遇到以下异常:
- 我正在使用
start-cluster.sh
启动flink集群
- 我正在使用 Python 虚拟环境来 运行 flink 作业 (
/root/Python3.6/venv.zip
)
- 我在应用程序中设置了存档路径 (
t_env.add_python_archive(archive_path="/root/Python3.6/venv.zip", target_dir=None)
)
- 我正在使用 UDF,如果我将 UDF 取出,我将无法获得此异常和作业 运行s 成功
Caused by: java.io.IOException: Cannot run program "/root/Python3.6/venv.zip/venv/bin/python" (in directory "/tmp/python-dist-ffa89c4c-527b-49d8-bae3-fd2fd6d3cd67/python-archives"): error=20, Not a directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.flink.python.util.PythonEnvironmentManagerUtils.execute(PythonEnvironmentManagerUtils.java:193)
at org.apache.flink.python.util.PythonEnvironmentManagerUtils.getPythonUdfRunnerScript(PythonEnvironmentManagerUtils.java:154)
at org.apache.flink.python.env.beam.ProcessPythonEnvironmentManager.createEnvironment(ProcessPythonEnvironmentManager.java:156)
at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createPythonExecutionEnvironment(BeamPythonFunctionRunner.java:395)
at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.lambda$open[=12=](BeamPythonFunctionRunner.java:243)
at org.apache.flink.runtime.memory.MemoryManager.lambda$getSharedMemoryResourceForManagedMemory(MemoryManager.java:539)
at org.apache.flink.runtime.memory.SharedResources.createResource(SharedResources.java:126)
at org.apache.flink.runtime.memory.SharedResources.getOrAllocateSharedResource(SharedResources.java:72)
at org.apache.flink.runtime.memory.MemoryManager.getSharedMemoryResourceForManagedMemory(MemoryManager.java:555)
at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:246)
at org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.open(AbstractPythonFunctionOperator.java:131)
at org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:110)
at org.apache.flink.table.runtime.operators.python.scalar.AbstractPythonScalarFunctionOperator.open(AbstractPythonScalarFunctionOperator.java:100)
at org.apache.flink.table.runtime.operators.python.scalar.PythonScalarFunctionOperator.open(PythonScalarFunctionOperator.java:62)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:110)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:711)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor.call(StreamTaskActionExecutor.java:55)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:687)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: error=20, Not a directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)```
这是一个无知的错误。我为 python 虚拟环境提供了错误的路径。而且我不需要在我的情况下设置 python-archive 路径,因为我没有在代码中使用它。
此外,我必须将 python.client.executable
属性 指向相同的路径:
t_env.get_config().get_configuration().set_string( "python.client.executable", "/root/Python3.6/venv/bin/python")
t_env.get_config().set_python_executable( "/root/Python3.6/venv/bin/python")
我在 运行 一个 pyflink 应用程序时遇到以下异常:
- 我正在使用
start-cluster.sh
启动flink集群 - 我正在使用 Python 虚拟环境来 运行 flink 作业 (
/root/Python3.6/venv.zip
) - 我在应用程序中设置了存档路径 (
t_env.add_python_archive(archive_path="/root/Python3.6/venv.zip", target_dir=None)
) - 我正在使用 UDF,如果我将 UDF 取出,我将无法获得此异常和作业 运行s 成功
Caused by: java.io.IOException: Cannot run program "/root/Python3.6/venv.zip/venv/bin/python" (in directory "/tmp/python-dist-ffa89c4c-527b-49d8-bae3-fd2fd6d3cd67/python-archives"): error=20, Not a directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.flink.python.util.PythonEnvironmentManagerUtils.execute(PythonEnvironmentManagerUtils.java:193)
at org.apache.flink.python.util.PythonEnvironmentManagerUtils.getPythonUdfRunnerScript(PythonEnvironmentManagerUtils.java:154)
at org.apache.flink.python.env.beam.ProcessPythonEnvironmentManager.createEnvironment(ProcessPythonEnvironmentManager.java:156)
at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createPythonExecutionEnvironment(BeamPythonFunctionRunner.java:395)
at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.lambda$open[=12=](BeamPythonFunctionRunner.java:243)
at org.apache.flink.runtime.memory.MemoryManager.lambda$getSharedMemoryResourceForManagedMemory(MemoryManager.java:539)
at org.apache.flink.runtime.memory.SharedResources.createResource(SharedResources.java:126)
at org.apache.flink.runtime.memory.SharedResources.getOrAllocateSharedResource(SharedResources.java:72)
at org.apache.flink.runtime.memory.MemoryManager.getSharedMemoryResourceForManagedMemory(MemoryManager.java:555)
at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:246)
at org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.open(AbstractPythonFunctionOperator.java:131)
at org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:110)
at org.apache.flink.table.runtime.operators.python.scalar.AbstractPythonScalarFunctionOperator.open(AbstractPythonScalarFunctionOperator.java:100)
at org.apache.flink.table.runtime.operators.python.scalar.PythonScalarFunctionOperator.open(PythonScalarFunctionOperator.java:62)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:110)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:711)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor.call(StreamTaskActionExecutor.java:55)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:687)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: error=20, Not a directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)```
这是一个无知的错误。我为 python 虚拟环境提供了错误的路径。而且我不需要在我的情况下设置 python-archive 路径,因为我没有在代码中使用它。
此外,我必须将 python.client.executable
属性 指向相同的路径:
t_env.get_config().get_configuration().set_string( "python.client.executable", "/root/Python3.6/venv/bin/python")
t_env.get_config().set_python_executable( "/root/Python3.6/venv/bin/python")