Amazon Kinesis Firehose Buffering to S3

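I'm trying to price out a streaming data / analytics application deployed to AWS, and I'm looking at using Kinesis Firehose to dump the data into S3.

My question is about the S3 cost: to calculate it, I need to work out how many PUTs I will need.

I know that Firehose buffers the data and then flushes it out to S3. What I'm not clear on is whether it writes a single "file" containing all the records accumulated up to that point, or whether it writes each record individually.

So, assuming I set the buffer size / interval to an optimal value based on the record size, does the number of S3 PUTs equal the number of records, or the number of flushes that Firehose performs?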

I don't think you pay anything extra for the write operations from Firehose to S3.

You will be billed separately for charges associated with Amazon S3 and Amazon Redshift usage including storage and read/write requests. However, you will not be billed for data transfer charges for the data that Amazon Kinesis Firehose loads into Amazon S3 and Amazon Redshift. For further details, see Amazon S3 pricing and Amazon Redshift pricing.

https://aws.amazon.com/kinesis/firehose/pricing/

Having read a lot of AWS documentation, I strongly disagree with the claim that you are not charged by S3.

You will be billed separately for charges associated with Amazon S3 and Amazon Redshift usage including storage and read/write requests. However, you will not be billed for data transfer charges for the data that Amazon Kinesis Firehose loads into Amazon S3 and Amazon Redshift. For further details, see Amazon S3 pricing and Amazon Redshift pricing. [emphasis mine]

https://aws.amazon.com/kinesis/firehose/pricing/

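The charge they say you won't be billed for is an additional charge from Kinesis Firehose for the transfer, beyond the $0.035/GB, but you will be billed for the requests made against your bucket. (Data inbound to a bucket is always free of actual per-GB transfer charges.)

Ultimately, though, it appears that you control the rough number of PUT requests against your bucket through a couple of tunable parameters: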

Q: What is buffer size and buffer interval?

Amazon Kinesis Firehose buffers incoming streaming data to a certain size or for a certain period of time before delivering it to destinations. You can configure buffer size and buffer interval while creating your delivery stream. Buffer size is in MBs and ranges from 1MB to 128MB. Buffer interval is in seconds and ranges from 60 seconds to 900 seconds.

https://aws.amazon.com/kinesis/firehose/faqs/#creating-delivery-streams
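As a small illustration of those two knobs, here is a sketch in Python that builds the buffering settings as a dictionary. The keys `SizeInMBs` and `IntervalInSeconds` are the field names of the `BufferingHints` structure in the Firehose API; the helper `buffering_hints` is a hypothetical name of mine, and the allowed ranges are simply copied from the FAQ text quoted above, so they may not match the current service limits.

```python
def buffering_hints(size_mb, interval_sec):
    """Build a BufferingHints-style dict (hypothetical helper).

    The ranges below come from the FAQ quoted above and are an
    assumption about the service limits, not an authoritative check.
    """
    if not 1 <= size_mb <= 128:
        raise ValueError("buffer size must be between 1 and 128 MB")
    if not 60 <= interval_sec <= 900:
        raise ValueError("buffer interval must be between 60 and 900 seconds")
    return {"SizeInMBs": size_mb, "IntervalInSeconds": interval_sec}


# Largest buffer and longest interval: the fewest flushes for a given stream.
hints = buffering_hints(128, 900)
```

A dictionary like this would be passed as the buffering hints of the S3 destination configuration when the delivery stream is created.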

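Unless it is collecting the records and aggregating them into larger files, I don't see why buffer size and buffer interval would make any sense... but without firing up the service and taking it for a spin, I can (unfortunately) only really speculate.

The cost is one S3 PUT per operation performed by Kinesis, not one per individual record. So one flush of the Firehose is one PUT: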

https://docs.aws.amazon.com/whitepapers/latest/building-data-lakes/data-ingestion-methods.html

https://forums.aws.amazon.com/thread.jspa?threadID=219275&tstart=0
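If you take one PUT per flush as the working assumption, a rough estimate of the PUT count is straightforward. The sketch below is mine, not from AWS: the function names are hypothetical, it assumes a steady ingest rate and that a flush happens as soon as either the buffer size or the buffer interval is reached, and it ignores compression, retries, and any splitting across shards or partitions. The price per 1,000 PUTs is a parameter you would fill in from the S3 pricing page; the value used in the example is only a placeholder.

```python
import math


def estimate_puts(ingest_mb_per_sec, buffer_mb, buffer_interval_sec, days=30):
    """Rough number of S3 PUTs over `days`, assuming one PUT per flush."""
    total_seconds = days * 24 * 3600
    # Time needed to fill the buffer at the given steady ingest rate.
    if ingest_mb_per_sec > 0:
        fill_time_sec = buffer_mb / ingest_mb_per_sec
    else:
        fill_time_sec = float("inf")
    # A flush is triggered by whichever threshold is reached first.
    seconds_per_flush = min(fill_time_sec, buffer_interval_sec)
    return math.ceil(total_seconds / seconds_per_flush)


def put_cost_usd(puts, price_per_1000_puts):
    """Request cost for `puts` PUTs at an assumed price per 1,000 requests."""
    return puts / 1000 * price_per_1000_puts


# Slow stream: the 900 s interval expires before 128 MB accumulates.
slow = estimate_puts(0.1, buffer_mb=128, buffer_interval_sec=900)
# Fast stream: a 5 MB buffer fills every 5 s, well before the 300 s interval.
fast = estimate_puts(1.0, buffer_mb=5, buffer_interval_sec=300)

# 0.005 is a placeholder price, not a quoted AWS rate.
print(slow, fast, put_cost_usd(fast, price_per_1000_puts=0.005))
```

Either way, the number of PUTs depends on the flush frequency and not on the number of records, which is why a larger buffer and a longer interval reduce the request charges.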