Generating a dynamic nested JSON object and array - python
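As the title explains, I have been trying to generate a nested JSON object. In this case I have for loops that read the data out of a dictionary dic. Below is the code: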
f = open("test_json.txt", 'w')
flag = False
temp = ""
start = "{\n\t\"filename\"" + " : \"" +initial_filename+"\",\n\t\"data\"" +" : " +" [\n"
end = "\n\t]" +"\n}"
f.write(start)
for i, (key,value) in enumerate(dic.iteritems()):
    f.write("{\n\t\"keyword\":"+"\""+str(key)+"\""+",\n")
    f.write("\"term_freq\":"+str(len(value))+",\n")
    f.write("\"lists\":[\n\t")
    for item in value:
        f.write("{\n")
        f.write("\t\t\"occurance\" :"+str(item)+"\n")
        #Check last object
        if value.index(item)+1 == len(value):
            f.write("}\n")
            f.write("]\n")
        else:
            f.write("},") # close occurrence object
    # Check last item in dic
    if i == len(dic)-1:
        flag = True
    if(flag):
        f.write("}")
    else:
        f.write("},") #close lists object
        flag = False
#check for flag
f.write("]") #close lists array
f.write("}")
The expected output is:
{
    "filename": "abc.pdf",
    "data": [{
        "keyword": "irritation",
        "term_freq": 5,
        "lists": [{
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 2
        }]
    }, {
        "keyword": "bomber",
        "lists": [{
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 2
        }],
        "term_freq": 5
    }]
}
But currently I am getting the following output:
{
    "filename": "abc.pdf",
    "data": [{
        "keyword": "irritation",
        "term_freq": 5,
        "lists": [{
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 2
        },] // Here lies the problem "," before array(last element)
    }, {
        "keyword": "bomber",
        "lists": [{
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 1
        }, {
            "occurance": 2
        },], // Here lies the problem "," before array(last element)
        "term_freq": 5
    }]
}
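Please help. I have tried to solve this but failed. Please do not mark it as a duplicate: I have already checked the other answers and they did not help at all.
Edit 1:
The input is basically taken from a dictionary dic whose mapping type is <String, List>,
for example: "irritation" => [1,3,5,7,8]
where irritation is the key, and it maps to a list of page numbers.
This is read in the outer for loop, where the key is the keyword and the value is the list of pages on which that keyword occurs.
Edit 2: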
import collections

dic = collections.defaultdict(list)  # declaring the variable dictionary
dic[key].append(value)  # inserting the values - useless to tell here
for key in dic:
    # Here dic[key] represents the list - the value for each key
    print key, ":", dic[key], "\n"  # prints the data in dictionary
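To make the shape of the input concrete, here is a minimal sketch of how such a dictionary might be filled. It assumes Python 3, and the list of (keyword, page) pairs named hits is hypothetical; neither the name nor the sample values come from the original post.

```python
import collections

# Hypothetical (keyword, page) pairs; the real data would come from the asker's own parsing step.
hits = [("irritation", 1), ("bomber", 2), ("irritation", 3), ("irritation", 5)]

dic = collections.defaultdict(list)
for keyword, page in hits:
    # A missing key is created as an empty list on first access.
    dic[keyword].append(page)

for keyword in dic:
    print(keyword, ":", dic[keyword])
```

Each keyword ends up mapped to the list of pages it was seen on, which is the <String, List> mapping described in Edit 1.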
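Your current code is not working because the loop runs through the before-last item and adds },
then when the loop runs again it sets the flag to false, but on its last run it has already added a ,
because it thought there would be another element.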
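If the output really has to be assembled by hand, one common way to sidestep the dangling comma is to build every element first and let str.join place the separators. This is a swapped-in technique rather than the asker's approach, written for Python 3, and the helper name write_items_by_hand is made up for the illustration.

```python
import json


def write_items_by_hand(pages):
    """Return a JSON array of {"occurance": n} objects as text (hypothetical helper)."""
    # Build each element first; join() only ever puts a separator
    # *between* two elements, so no "is this the last one?" check is needed.
    parts = ['{"occurance": %d}' % page for page in pages]
    return "[" + ", ".join(parts) + "]"


pages = [1, 1, 1, 1, 2]
text = write_items_by_hand(pages)
parsed = json.loads(text)  # would raise ValueError if a trailing comma slipped in
print(text)
```

Note that this also avoids value.index(item), which returns the position of the first matching element and so cannot reliably detect the last item when the list contains repeated values.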
If this is your dict: a = {"bomber":[1,2,3,4,5]}
then you can do:
import json
file_name = "a_file.json"
file_name_input = "abc.pdf"
new_output = {}
new_output["filename"] = file_name_input
new_data = []
i = 0
for key, val in a.iteritems():
    new_data.append({"keyword":key, "lists":[], "term_freq":len(val)})
    for p in val:
        new_data[i]["lists"].append({"occurrance":p})
    i += 1
new_output['data'] = new_data
Then save the data with:
f = open(file_name, 'w+')
f.write(json.dumps(new_output, indent=4, sort_keys=True, default=unicode))
f.close()
What @andrea-f did looks good to me; here is another solution.
Feel free to pick either of the two :)
import json
dic = {
    "bomber": [1, 2, 3, 4, 5],
    "irritation": [1, 3, 5, 7, 8]
}
filename = "abc.pdf"
json_dict = {}
data = []
for k, v in dic.iteritems():
    tmp_dict = {}
    tmp_dict["keyword"] = k
    tmp_dict["term_freq"] = len(v)
    tmp_dict["lists"] = [{"occurrance": i} for i in v]
    data.append(tmp_dict)
json_dict["filename"] = filename
json_dict["data"] = data
with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)
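Same logic: I first build one big json_dict that can be saved directly as JSON. I use the with statement to write the file, so it is closed properly without having to catch an exception myself.
Also, if you need to further tune the JSON output, you should have a look at the documentation of json.dumps().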
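As a small illustration of what that documentation covers, the sketch below (Python 3) contrasts a pretty-printed dump with a compact one. The record dictionary is sample data made up for the example; indent, sort_keys and separators are standard json.dumps parameters.

```python
import json

# Sample data invented for this example.
record = {"keyword": "bomber", "term_freq": 5, "lists": [{"occurrance": 1}]}

# indent pretty-prints the output; sort_keys orders the keys alphabetically.
pretty = json.dumps(record, indent=4, sort_keys=True)

# separators without spaces gives the most compact form.
compact = json.dumps(record, separators=(",", ":"))

print(pretty)
print(compact)
```

Both strings parse back to the same data; only the layout differs.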
EDIT
Just for fun, if you don't like the tmp var, you can do the whole data for loop in a one-liner :)
json_dict["data"] = [{"keyword": k, "term_freq": len(v), "lists": [{"occurrance": i} for i in v]} for k, v in dic.iteritems()]
For the final solution that can give something not totally readable, like this:
import json
json_dict = {
    "filename": "abc.pdf",
    "data": [{
        "keyword": k,
        "term_freq": len(v),
        "lists": [{"occurrance": i} for i in v]
    } for k, v in dic.iteritems()]
}
with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)
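For readers on Python 3, the same comprehension might be wrapped in a function roughly as follows. This is only a sketch: build_report is a hypothetical name, and the one substantive change is that dict.items() takes the place of Python 2's iteritems().

```python
import json


def build_report(filename, dic):
    """Return the nested structure for one file (hypothetical helper)."""
    return {
        "filename": filename,
        "data": [
            {
                "keyword": k,
                "term_freq": len(v),
                "lists": [{"occurrance": i} for i in v],
            }
            for k, v in dic.items()  # items() replaces Python 2's iteritems()
        ],
    }


report = build_report("abc.pdf", {"bomber": [1, 2, 3, 4, 5]})
text = json.dumps(report, indent=4, sort_keys=True)
print(text)
```

Keeping the construction separate from the file writing makes it easy to check the structure before anything touches the disk.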
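EDIT 2
It seems that you don't want to save your json in the desired layout, but to be able to read it in that layout.
In fact, you can also use json.dumps() to print your json.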
with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle)
print json.dumps(new_json_dict, indent=4, sort_keys=True)
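A self-contained round trip of that idea could look like the sketch below. It assumes Python 3 and writes into a temporary directory so nothing is left behind; the minimal json_dict is sample data, not the asker's.

```python
import json
import os
import tempfile

# Sample data standing in for the real structure.
json_dict = {"filename": "abc.pdf", "data": []}

# Write to a temporary directory so the sketch cleans up after itself.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "abc.json")
    with open(path, "w") as outfile:
        json.dump(json_dict, outfile)  # stored compactly
    with open(path, "r") as handle:
        new_json_dict = json.load(handle)

# The layout is chosen at print time, independently of how the file was stored.
print(json.dumps(new_json_dict, indent=4, sort_keys=True))
```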
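One problem remains here: "filename" is printed at the end of the object, because with sort_keys=True the keys are sorted alphabetically and the d of data comes before the f of filename.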
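The effect of the sorting can be seen in isolation with a short sketch. It assumes Python 3.7 or later, where a plain dict keeps insertion order, so the second dump reflects the order in which the keys were written.

```python
import json

json_dict = {"filename": "abc.pdf", "data": []}

sorted_text = json.dumps(json_dict, sort_keys=True)
unsorted_text = json.dumps(json_dict)

print(sorted_text)    # keys in alphabetical order: "data" first
print(unsorted_text)  # keys in the dict's own order: "filename" first
```

On Python 2, and on Python 3 versions before 3.7, the order of a plain dict is not something to rely on.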
To force the order, you will have to use an OrderedDict when building the dict. Be careful, the syntax is ugly (imo) with python 2.X
Here is the new complete solution ;)
import json
from collections import OrderedDict
dic = {
    'bomber': [1, 2, 3, 4, 5],
    'irritation': [1, 3, 5, 7, 8]
}
json_dict = OrderedDict([
    ('filename', 'abc.pdf'),
    ('data', [ OrderedDict([
        ('keyword', k),
        ('term_freq', len(v)),
        ('lists', [{'occurrance': i} for i in v])
    ]) for k, v in dic.iteritems()])
])
with open('abc.json', 'w') as outfile:
    json.dump(json_dict, outfile)
# Now to read the ordered json file
with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle, object_pairs_hook=OrderedDict)
print json.dumps(new_json_dict, indent=4)
Will output:
{
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": "bomber",
            "term_freq": 5,
            "lists": [
                {
                    "occurrance": 1
                },
                {
                    "occurrance": 2
                },
                {
                    "occurrance": 3
                },
                {
                    "occurrance": 4
                },
                {
                    "occurrance": 5
                }
            ]
        },
        {
            "keyword": "irritation",
            "term_freq": 5,
            "lists": [
                {
                    "occurrance": 1
                },
                {
                    "occurrance": 3
                },
                {
                    "occurrance": 5
                },
                {
                    "occurrance": 7
                },
                {
                    "occurrance": 8
                }
            ]
        }
    ]
}
But be careful: most of the time it is better to save a regular .json file so that it stays usable across languages.
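For completeness, the order-preserving read can be checked without a file. The sketch below assumes Python 3 and parses a hand-written sample string; object_pairs_hook is a standard parameter of json.loads and json.load.

```python
import json
from collections import OrderedDict

# Hand-written sample document for the illustration.
text = '{"filename": "abc.pdf", "data": [{"keyword": "bomber", "term_freq": 5}]}'

# object_pairs_hook receives each object's (key, value) pairs in document order,
# so nested objects become OrderedDict instances as well.
loaded = json.loads(text, object_pairs_hook=OrderedDict)

print(list(loaded.keys()))
print(json.dumps(loaded, indent=4))
```

The key order seen by the reader is then the order in the file, whatever language wrote it.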