Pipeline fails when adding ReadAllFromText transform
I am trying to run a very simple program in Apache Beam to see how it works.
import apache_beam as beam

class Split(beam.DoFn):
    def process(self, element):
        return element

with beam.Pipeline() as p:
    rows = (p | beam.io.ReadAllFromText(
        "input.csv") | beam.ParDo(Split()))
When I run this, I get the following error:
.... some more stack....
File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/transforms/util.py", line 565, in expand
windowing_saved = pcoll.windowing
File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/pvalue.py", line 137, in windowing
self.producer.inputs)
File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 464, in get_windowing
return inputs[0].windowing
File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/pvalue.py", line 137, in windowing
self.producer.inputs)
File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 464, in get_windowing
return inputs[0].windowing
AttributeError: 'PBegin' object has no attribute 'windowing'
Any idea what is going wrong here?
Thanks
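ReadAllFromText expects to read file names from an input PCollection rather than take the file name as an argument. In your code the transform is applied directly to the pipeline object (a PBegin), which has no windowing, hence the AttributeError. So, in your case, it should be: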
(p
 | beam.Create(["input.csv"])
 | beam.io.ReadAllFromText())