AWS Map State keeps overwriting the file in S3

Question

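I have a Map state in an AWS Step Functions state machine that invokes a Lambda function, which creates a CSV file and saves it to S3.

I want to append the result of each Map iteration to main.csv, but each newer iteration just keeps overwriting the main.csv stored in S3.

Here is my code: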

import csv
import os
import tempfile

# some code to connect to s3

# describing the files
key = 'some_path/main.csv'
tempdir = tempfile.mkdtemp()
local_file = 'main.csv'
path = os.path.join(tempdir, local_file)

lists = []
# some processing to populate the lists

# writing the file to S3
with open(path, 'a', newline='') as output:
    writer = csv.writer(output)
    for line in lists:
        writer.writerow(line)
bucket.upload_file(path, key)

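It would be very helpful if someone could suggest how main.csv can be created from scratch each time I execute the step function, with the results of the Map iterations appended to it. I do not want to append to a main.csv that was created by an older state machine execution.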

Answer 1

This line creates a brand-new file in the /tmp folder of the Lambda execution environment:

with open(path, 'a', newline='') as output:

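You are opening it in append mode, but tempfile.mkdtemp() returns a fresh, empty directory on every call, so main.csv does not exist there yet and a new file is created.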

Later, when you call bucket.upload_file(path, key), you upload this newly created file, which overwrites the existing object in S3.
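
A small local demonstration of that behavior, using only the standard library. No S3 is involved here, and the row values are made up for illustration:

```python
import csv
import os
import tempfile

# A fresh temp directory, like the one created on each Lambda invocation
path = os.path.join(tempfile.mkdtemp(), 'main.csv')
existed_before = os.path.exists(path)

# Append mode does not fail on a missing file; it creates an empty one
with open(path, 'a', newline='') as output:
    csv.writer(output).writerow(['only', 'this', 'iteration'])

# Read the file back: it holds nothing but the row written above
with open(path, newline='') as f:
    rows = list(csv.reader(f))
```

After this runs, the file contains only the rows written by the current invocation, and that is the file that gets uploaded.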

You need to download the file from S3 first, append to it, and then upload the new version back to S3, like this:

# get the file from s3
bucket.download_file(key, path)

# append to the file
with open(path, 'a', newline='') as output:
    writer = csv.writer(output)
    for line in lists:
        writer.writerow(line)

# write the file to S3
bucket.upload_file(path, key)
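
One gap in the snippet above is the very first iteration, when main.csv does not exist in S3 yet and the download would fail. The following is a minimal sketch of one way to handle that, assuming `bucket` is a boto3 `s3.Bucket` resource. The function names `append_rows_to_s3_csv` and `key_for_execution` and the per-execution key layout are hypothetical, not part of the original code:

```python
import csv
import os
import tempfile


def append_rows_to_s3_csv(bucket, key, rows):
    """Download `key` if it exists, append `rows`, and upload the result.

    `bucket` is assumed to be a boto3 s3.Bucket resource (or any object
    exposing the same objects.filter, download_file and upload_file).
    """
    path = os.path.join(tempfile.mkdtemp(), os.path.basename(key))

    # On the first iteration the object does not exist yet, so download
    # only when a listing under this prefix contains the exact key.
    exists = any(obj.key == key for obj in bucket.objects.filter(Prefix=key))
    if exists:
        bucket.download_file(key, path)

    # Append mode keeps the downloaded rows, or creates the file if absent
    with open(path, 'a', newline='') as output:
        csv.writer(output).writerows(rows)

    bucket.upload_file(path, key)
    return path


def key_for_execution(execution_name):
    # Hypothetical layout: one main.csv per state machine execution
    return f'some_path/{execution_name}/main.csv'
```

To start from an empty main.csv on every run, one option is to pass the execution name (available as `$$.Execution.Name` in the Step Functions context object) into the Lambda input and build the key from it, as `key_for_execution` does. Note that download, append and upload are three separate steps, so Map iterations running in parallel can still overwrite each other's rows. Setting `MaxConcurrency` to 1 on the Map state is one way to serialize them.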

python

amazon-s3

amazon-web-services

aws-lambda

aws-step-functions