Looping through array of objects in json file downloaded from S3
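I am using boto3 get_object to fetch a JSON file from S3. I need to read the file's contents, loop through the array of objects, and get one object at a time. When I loop, I get one character per iteration instead.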
import json
import boto3

s3 = boto3.client('s3')
session = boto3.Session()

def lambda_handler(event, context):
    bucket = event["bucket"]
    key = event["key"]
    data = s3.get_object(Bucket=bucket, Key=key)
    contents = data['Body'].read()
    test = contents.decode("utf-8")
    # convert contents to native python string representing json object
    s3_string = json.dumps(contents.decode("utf-8"))
    # return dict
    s3_dict = json.loads(s3_string)
    # this seems to output valid json
    # print(str(s3_dict))
    for item in s3_dict:
        print(item)
The JSON in the file is formatted as follows:
[{
"location": "123 Road Dr",
"city_state": "MyCity ST",
"phone": "555-555-5555",
"distance": "1"
},
{
"location": "456 Avenue Crt",
"city_state": "MyTown AL",
"phone": "555-867-5309",
"distance": "0"
}
]
This is what I get (one character per iteration)...
- [
- {
- "
...
This is what I need (in JSON format)...
First loop
{
"location": "123 Road Dr",
"city_state": "MyCity ST",
"phone": "555-555-5555",
"distance": "1"
}
Second loop
{
"location": "456 Avenue Crt",
"city_state": "MyTown AL",
"phone": "555-867-5309",
"distance": "0"
}
Can anyone tell me where I'm going wrong?
Thanks in advance.
Here is the working solution.
import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = event["bucket"]
    key = event["key"]
    data = s3.get_object(Bucket=bucket, Key=key)
    contents = data['Body'].read()
    # convert contents to native python string representing json object
    s3_string = contents.decode("utf-8")
    # check the "type" of s3_string - in this case it is <class 'str'>
    print("s3_string is " + str(type(s3_string)))
    # return python list
    s3_list = json.loads(s3_string)
    # check the "type" of s3_list - in this case it is <class 'list'>
    print("s3_list is " + str(type(s3_list)))
    # this returns valid json for every object in the array in original json file.
    for item in s3_list:
        print(json.dumps(item))
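A possible variation on the handler above, offered as a sketch rather than a tested Lambda: since Python 3.6, json.loads accepts bytes as well as str, so the explicit decode step can be skipped. The names parse_records and sample_body are hypothetical, and sample_body only stands in for what data['Body'].read() would return.

```python
import json

def parse_records(body_bytes):
    """Parse a JSON array from raw bytes and return it as a Python list."""
    records = json.loads(body_bytes)  # bytes input is accepted on Python 3.6+
    if not isinstance(records, list):
        raise ValueError("expected a JSON array at the top level")
    return records

# Hypothetical stand-in for data['Body'].read()
sample_body = b'[{"phone": "555-555-5555"}, {"phone": "555-867-5309"}]'

for item in parse_records(sample_body):
    print(json.dumps(item))
```

The type check is optional; it just fails early if the file ever holds a single object instead of an array.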
I had assumed that json.loads would give me a Python dict by default. What I actually got was a list. That explains why...
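The character-by-character behaviour in the original code can be reproduced without S3. In the sketch below, raw_bytes is a made-up stand-in for data['Body'].read(), not real bucket contents. Calling json.dumps on text that is already JSON wraps it in a second layer of JSON string quoting, so json.loads simply returns the same str, and looping over a str yields one character at a time.

```python
import json

# Hypothetical stand-in for data['Body'].read(): the file contents as bytes.
raw_bytes = (
    b'[{"location": "123 Road Dr", "distance": "1"}, '
    b'{"location": "456 Avenue Crt", "distance": "0"}]'
)

text = raw_bytes.decode("utf-8")

# Original approach: dumps() turns the str into a JSON string literal,
# and loads() undoes exactly that, handing back the original str.
round_tripped = json.loads(json.dumps(text))
print(type(round_tripped).__name__)  # str
print(list(round_tripped)[:3])       # the first three characters

# Working approach: parse the decoded text directly.
records = json.loads(text)
print(type(records).__name__)        # list
for record in records:
    print(json.dumps(record))
```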