How do I re-run a pipeline for only the failed activities/datasets in Azure Data Factory V2?
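I am running a pipeline that loops through all the tables in INFORMATION.SCHEMA.TABLES and copies them to Azure Data Lake Store. My question is: if any table fails to copy, how do I run this pipeline for only the failed tables?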
The best approach I have found is to code your process as follows:
0. First, find the root cause of the failure and determine whether something is wrong with the pipeline itself or whether it is a "feature" of a dependency that you have to code around.
1. Be idempotent. If your process establishes a clean state as its very first step, similar to the Command design pattern's undo (but more naive), then it can safely be re-executed.
* With #1 in place, you can safely use "retry" on your pipeline activities, with sufficient time between retries.
* This approach works in both ADFv1 and ADFv2.
2. If you are on ADFv2, you have more options and can build more complex error-handling logic:
* Wrap the failing activity in an until-success loop, and be sure to put a bound on the number of executions.
* You can add more activities inside the loop to handle the failure: log, notify, or resolve known failure conditions caused by externalities outside your control.
3. You can also communicate asynchronously with future executions by saving each success to a central store. Later executions then check whether a step has already succeeded and, if so, stop before running the activity again.
* This is powerful for more generalized pipelines, since you can choose where to begin.
4. The last resort I know of (and I would love to learn new ways to handle this) is manual re-execution of the failed activities.
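
To make points 1 to 3 concrete, here is a minimal Python sketch of the control flow only. It is not ADF code and does not call any Azure API. The names `run_copy`, `copy_table`, `succeeded`, `max_attempts` and `delay_seconds` are all hypothetical, `copy_table` is assumed to be idempotent (for example, it overwrites its target), and the in-memory set stands in for whatever central store you would actually use.

```python
import time
from typing import Callable, Iterable, List, Set


def run_copy(
    tables: Iterable[str],
    copy_table: Callable[[str], None],  # hypothetical; assumed idempotent
    succeeded: Set[str],                # stand-in for a central success store
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> List[str]:
    """Copy every table not yet recorded as succeeded.

    Each table is tried at most max_attempts times. Returns the tables
    that still failed after the last attempt.
    """
    failed: List[str] = []
    for table in tables:
        if table in succeeded:
            # Point 3: an earlier execution already recorded success, so skip.
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                # Point 1: safe to repeat only because copy_table is idempotent.
                copy_table(table)
            except Exception:
                if attempt == max_attempts:
                    # Point 2: the loop is bounded, so give up and report.
                    failed.append(table)
                else:
                    # Leave some time between retries.
                    time.sleep(delay_seconds)
            else:
                succeeded.add(table)
                break
    return failed
```

Calling `run_copy` a second time with the same `succeeded` store processes only the tables that failed the first time, which is the behavior the question asks for. In a real pipeline the store would have to be persisted between runs, for instance in a control table, rather than held in memory.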
Hope this helps,
J