ImportError: No module named options.value_provider
The following pipeline works with DirectRunner but raises the exception below when run with DataflowRunner.
How should I debug this kind of error? It seems quite opaque to me.
import apache_beam as beam

# project, staging_location, temp_location, output_gcs and query are defined elsewhere.
p = beam.Pipeline("DataflowRunner", argv=[
    '--project', project,
    '--staging_location', staging_location,
    '--temp_location', temp_location,
    '--output', output_gcs
])
(p
 | 'read events' >> beam.io.Read(beam.io.BigQuerySource(query=query, use_standard_sql=True))
 | 'write' >> beam.io.WriteToText(output_gcs)
)
p.run().wait_until_finish()
raises:
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 578, in do_work
work_executor.execute()
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 165, in execute
op.start()
File "dataflow_worker/operations.py", line 350, in dataflow_worker.operations.DoOperation.start (dataflow_worker/operations.c:13064)
def start(self):
File "dataflow_worker/operations.py", line 351, in dataflow_worker.operations.DoOperation.start (dataflow_worker/operations.c:12958)
with self.scoped_start_state:
File "dataflow_worker/operations.py", line 356, in dataflow_worker.operations.DoOperation.start (dataflow_worker/operations.c:12159)
pickler.loads(self.spec.serialized_fn))
File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 212, in loads
return dill.loads(s)
File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 277, in loads
return load(file)
File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 266, in load
obj = pik.load()
File "/usr/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 1090, in load_global
klass = self.find_class(module, name)
File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 423, in find_class
return StockUnpickler.find_class(self, module, name)
File "/usr/lib/python2.7/pickle.py", line 1124, in find_class
__import__(module)
ImportError: No module named options.value_provider
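value_provider is a module that was recently introduced in the Python SDK to handle templates. However, I don't see any templates in your snippet, so this is probably a package mismatch. Are you using matching versions for the SDK and the workers? You can check your worker-startup logs to see which versions of the packages are installed.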
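To compare against what the worker-startup logs show, it may help to print the versions installed in the environment that submits the job. The following is only a minimal sketch: the helper names `installed_version` and `report_versions` are made up for illustration, and the distribution names `apache-beam` and `google-cloud-dataflow` are assumed from the packages mentioned in this thread. It only inspects the local machine, not the Dataflow workers.

```python
# Sketch: report the locally installed versions of the packages discussed here.
try:
    from importlib import metadata as importlib_metadata  # Python 3.8+
except ImportError:  # older interpreters, e.g. Python 2.7
    importlib_metadata = None


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    if importlib_metadata is not None:
        try:
            return importlib_metadata.version(package_name)
        except importlib_metadata.PackageNotFoundError:
            return None
    import pkg_resources  # setuptools fallback for old interpreters
    try:
        return pkg_resources.get_distribution(package_name).version
    except pkg_resources.DistributionNotFound:
        return None


def report_versions(package_names=("apache-beam", "google-cloud-dataflow")):
    """Map each (assumed) distribution name to its local version or None."""
    return {name: installed_version(name) for name in package_names}


if __name__ == "__main__":
    for name, version in sorted(report_versions().items()):
        print("%s: %s" % (name, version or "not installed"))
```

If the versions printed locally differ from those in the worker-startup logs, that would be consistent with the mismatch suspected above.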
I had the same problem. As Maria pointed out, it is a mismatch between the apache_beam and google-cloud-dataflow packages.
To be explicit, the following command fixed it:
pip2 install --upgrade apache_beam google-cloud-dataflow
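As a quick local sanity check, one could test whether the module from the traceback can be imported at all. This is a sketch under assumptions: `can_import` is a hypothetical helper, and the full module path `apache_beam.options.value_provider` is inferred from the error message and the `apache_beam` package name. A successful import here only says something about the local environment, not about the packages on the workers.

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported in this environment."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    # Assumed full path of the module named in the ImportError.
    name = "apache_beam.options.value_provider"
    print("%s importable: %s" % (name, can_import(name)))
```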