Costs related to AWS Lambda retrying a failing function?

I'm researching serverless technology (specifically Python, Django, and Zappa on AWS Lambda), and one thing about error handling caught my attention. The Zappa documentation says:

By default, AWS Lambda will attempt to retry an event based (non-API Gateway, e.g. CloudWatch) invocation if an exception has been thrown.

In the AWS Lambda documentation, I read:

Depending on the event source, AWS Lambda may retry the failed Lambda function. For example, if Kinesis is the event source, AWS Lambda will retry the failed invocation until the Lambda function succeeds or the records in the stream expire.

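Does this mean a function will be invoked an unlimited number of times whenever it raises an unhandled exception? If so, costs would surely skyrocket.

Related to this: what does "until the records in the stream expire" mean? Which records, and which stream?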

According to the AWS docs:

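  • Non-stream-based event sources: such as S3, API Gateway, etc.

    • Synchronous invocation: if you invoke Lambda through the SDK or API Gateway and an exception occurs, you are responsible for deciding if, when, and how the request should be retried.

    • Asynchronous invocation: if Lambda is triggered asynchronously (for example by S3), it automatically retries the invocation twice, with a delay between retries. If you have specified a dead-letter queue (DLQ), the failed event is sent to the configured SQS queue or SNS topic. If no DLQ is specified, the event is discarded.

  • Stream-based event sources: such as DynamoDB and Kinesis.

    • If the Lambda function fails, it keeps retrying until the data expires (up to 7 days for Kinesis). Retries use exponential backoff, capped at 1 minute between attempts. You are billed for every retry, but you can create an alarm that fires when your source goes offline and then stop the stream.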
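To get a feel for what "billed for every retry" could add up to on a stream source, here is a rough back-of-the-envelope sketch. It assumes the backoff starts at 1 second and doubles up to the 1-minute cap (the starting delay is my assumption; only the cap is stated above), and the function names and the default prices are illustrative placeholders, so check the current Lambda pricing page before relying on them.

```python
def count_retries(window_s: float, run_s: float,
                  first_delay_s: float = 1.0, cap_s: float = 60.0) -> int:
    """Count how many failing attempts fit into the retention window.

    window_s: how long the records stay in the stream (e.g. 7 days).
    run_s: how long each failing invocation runs before it errors out.
    first_delay_s: ASSUMED initial backoff; it doubles until cap_s.
    """
    elapsed = 0.0
    delay = first_delay_s
    attempts = 0
    while elapsed + run_s <= window_s:
        attempts += 1
        elapsed += run_s + delay
        delay = min(delay * 2, cap_s)  # exponential backoff with a cap
    return attempts


def estimate_cost(attempts: int, run_s: float, memory_mb: int,
                  price_per_gb_s: float = 0.0000166667,
                  price_per_request: float = 0.0000002) -> float:
    """Estimate the bill in USD; both prices are ASSUMED placeholder values."""
    gb_seconds = attempts * run_s * (memory_mb / 1024)
    return gb_seconds * price_per_gb_s + attempts * price_per_request


if __name__ == "__main__":
    seven_days = 7 * 24 * 3600
    n = count_retries(seven_days, run_s=1.0)
    print(n, "attempts, approx. USD", round(estimate_cost(n, 1.0, 128), 4))
```

Under these assumptions the 1-minute cap keeps the number of attempts per blocked shard to roughly ten thousand over seven days, so the bill is driven mostly by how long each failing invocation runs (a function that hangs until its timeout costs far more than one that fails fast) and by how many shards are stuck.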

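The documentation on stream-based event sources is not very precise, but you can read this thread on the AWS forums, where an AWS employee answered a question on exactly this:

Question: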

Specifically, when my Lambda is getting Kinesis events and writing the data to another service... but the other service goes down for a period of time (e.g., a few hours)... is my Lambda going to keep getting called (and throwing errors) at a constant rate?

Lambda retry is good because I want guaranteed delivery of events, but ideally in this situation, I also don't want to be billed at a high rate when my Lambda becomes consistently UNsuccessful for a time

Answer:

If the function starts executing but fails because of a downstream dependency, then you do get billed for the duration the function ran. Lambda exponentially backs off in case your function fails, up to about one minute. You can also monitor this as the ShardIteratorAge increases, and take action to pause your stream processing if needed till you resolve the downstream dependency
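The "pause your stream processing" step from that answer could be sketched roughly as follows. This assumes you have already read the iterator age in milliseconds from CloudWatch (Lambda publishes it as the `IteratorAge` metric, as far as I know) and that you know the UUID of the event source mapping; `pause_if_stale` and the one-hour threshold are hypothetical choices of mine, not anything AWS prescribes. The client is passed in so the logic can be tried without AWS credentials.

```python
def pause_if_stale(lambda_client, mapping_uuid: str,
                   iterator_age_ms: float,
                   max_age_ms: float = 3_600_000) -> bool:
    """Disable the event source mapping once the iterator age is too high.

    lambda_client: a boto3 Lambda client, i.e. boto3.client("lambda").
    mapping_uuid: UUID of the Kinesis/DynamoDB event source mapping.
    max_age_ms: ASSUMED threshold (one hour); tune it to your retention.
    Returns True if the mapping was disabled, False otherwise.
    """
    if iterator_age_ms <= max_age_ms:
        return False
    # Disabling the mapping stops Lambda from polling the stream, so no
    # further (billed) retries happen until the mapping is enabled again.
    lambda_client.update_event_source_mapping(UUID=mapping_uuid, Enabled=False)
    return True


# Example usage (needs boto3 and AWS credentials; the UUID is a placeholder):
#   import boto3
#   pause_if_stale(boto3.client("lambda"), "your-mapping-uuid", 5_400_000)
```

Keep in mind that records keep ageing in the stream while the mapping is disabled, so you need to re-enable it (`Enabled=True`) before the retention period runs out, or those records expire unprocessed.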