Running jobs independent of each other's failure/success in a single dataflow pipeline

I am trying to load Avro-format data from GCS into BigQuery using a single pipeline. For example, I am loading 10 tables, which means 10 parallel jobs within that single pipeline. Right now, if the 3rd job fails, all subsequent jobs fail as well. How can I make the other jobs run independently of any one job's failure/success?
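For concreteness, the setup described above presumably looks something like the following Beam sketch (Python SDK assumed; the bucket, project, dataset, and table names are placeholders): one read/write branch per table, all inside the same Dataflow job, so an unhandled failure in any branch fails the entire job.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

TABLES = [f'table_{i}' for i in range(10)]  # placeholder table names

with beam.Pipeline(options=PipelineOptions()) as p:
    # Ten independent branches, but all part of one pipeline/job.
    for table in TABLES:
        _ = (
            p
            | f'Read_{table}' >> beam.io.ReadFromAvro(
                f'gs://my-bucket/{table}/*.avro')       # placeholder path
            | f'Write_{table}' >> beam.io.WriteToBigQuery(
                f'my-project:my_dataset.{table}',       # placeholder table spec
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER))
```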

It is not possible to isolate different steps within a single Dataflow pipeline without implementing custom logic (e.g., custom DoFn/ParDo implementations). Some I/O connectors such as BigQuery offer a way to send failed requests to a dead-letter queue in some write modes, but this might not give you what you want. If you want full isolation, you should run separate jobs and combine them into a workflow using an orchestration framework such as Apache Airflow.
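One common way to realize the "custom DoFn/ParDo" idea is the dead-letter pattern: wrap the per-record work in a try/except and route failures to a tagged side output, so a bad record produces a dead-letter element instead of an exception that fails the whole pipeline. A minimal runnable sketch (the parsing step is a stand-in for whatever per-record work can fail):

```python
import apache_beam as beam
from apache_beam import pvalue

class ParseSafely(beam.DoFn):
    """Catches per-element failures and routes them to a dead-letter
    output instead of letting the exception fail the whole pipeline."""
    DEAD_LETTER = 'dead_letter'

    def process(self, element):
        try:
            yield int(element)  # stand-in for the real per-record work
        except Exception as err:
            yield pvalue.TaggedOutput(self.DEAD_LETTER, (element, repr(err)))

with beam.Pipeline() as p:
    results = (
        p
        | beam.Create(['1', '2', 'not-a-number'])
        | beam.ParDo(ParseSafely()).with_outputs(
            ParseSafely.DEAD_LETTER, main='ok'))
    _ = results.ok | 'PrintGood' >> beam.Map(print)   # 1, 2
    _ = results.dead_letter | 'PrintBad' >> beam.Map(print)
```

The dead-letter PCollection can then be written to GCS or a separate BigQuery table for inspection and replay. Note that the BigQuery dead-letter support mentioned above applies to the streaming-insert write mode (where the Python SDK exposes failed rows on the write result); batch file loads, as typically used for Avro, do not offer the same per-row recovery, which is why it "might not give you what you want" here.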
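For the full-isolation route, here is a sketch of an Airflow DAG that launches one Dataflow job per table (assuming Airflow 2.x with the Google provider installed; the template path and its parameter names are assumptions to verify against the Google-provided template's documentation, and the bucket/project/dataset names are placeholders). Because no dependencies are declared between the tasks, each job runs, succeeds, or fails independently of the others:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowTemplatedJobStartOperator,
)

TABLES = [f'table_{i}' for i in range(10)]  # placeholder table names

with DAG(
    dag_id='avro_to_bq_per_table',
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    for table in TABLES:
        # Each operator launches its own Dataflow job; no >> edges are
        # declared, so one task's failure does not stop the others.
        DataflowTemplatedJobStartOperator(
            task_id=f'load_{table}',
            # Assumed path to the Google-provided Avro-to-BigQuery template.
            template='gs://dataflow-templates/latest/GCS_Avro_to_Bigquery',
            location='us-central1',
            parameters={  # parameter names assumed; check the template docs
                'inputFilePattern': f'gs://my-bucket/{table}/*.avro',
                'outputTableSpec': f'my-project:my_dataset.{table}',
            },
        )
```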