Python: parse a nested JSON file and extract specific attributes
I have a large JSON file that looks like this:
data = {
    "Module1": {
        "Description": "",
        "Layer": "1",
        "SourceDir": "pathModule1",
        "Attributes": {
            "some",
        },
        "Vendor": "comp",
        "components": {
            "Component1": {
                "path": "something",
                "includes": [
                    "include1",
                    "include2",
                    "include3",
                    "include4",
                    "include5"
                ],
                "generated": "txt",
                "memory": "txt",
                # etc.
            },
            "Component2": {
                "path": "something",
                "includes": [
                    "include1",
                    "include2",
                    "include3",
                    "include4",
                    "include5"
                ],
                "generated": "txt",
                "memory": "txt",
                # etc.
            }
        }
    },
    "Module2": {
        "Description": "",
        "Layer": "2",
        "SourceDir": "pathModule2",
        "Attributes": {
            "some",
        },
        "Vendor": "comp",
        "components": {
            "Component1": {
                "path": "something",
                "includes": [
                    "include1",
                    "include2",
                    "include3",
                    "include4",
                    "include5"
                ],
                "generated": "txt",
                "memory": "txt",
                # etc.
            },
            "Component2": {
                "path": "something",
                "includes": [
                    "include1",
                    "include2",
                    "include3",
                    "include4",
                    "include5"
                ],
                "generated": "txt",
                "memory": "txt",
                # etc.
            }
        }
    },
    "Module3": {
        "Description": "",
        "Layer": "3",
        "SourceDir": "path",
        "Attributes": {
            "some",
        },
        "Vendor": "",
    },
    "Module4": {
        "Description": "",
        "Layer": "4",
        "SourceDir": "path",
        "Attributes": {
            "some",
        }
    }
}
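I have to go through it and pull a few things out, so that in the end I get this:
Whenever the Vendor field equals "comp", take that module into account, along with its SourceDir, all of its components, and their paths and includes.
So the output would be:
Module1, "pathModule1", components: [Component1, path, [includes: include1, include2, include3, include4, include5]], [Component2, path, includes: [include1, include2, include3, include4, include5]]
Module2, "pathModule2", components: [Component1, path, [includes: include1, include2, include3, include4, include5]], [Component2, path, includes: [include1, include2, include3, include4, include5]]
I'm really struggling to access all the fields I need.
My current code looks like this: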
with open("DB.json", 'r') as f:
    modules = json.load(f)
for k in modules.keys():
    try:
        if swc_list[k]["Vendor"] == "comp":
            list_components.append(k)
            sourceDirList.append(swc_list[k]['SourceDir'])
            for i in swc_list[k]['sw_objects']:
                list_sw_objects.append((swc_list[k]['sw_objects']))
    except KeyError:
        continue
I can only get Module1 and its SourceDir, but not Component1, Component2, or their attributes.
How can I do this?
Thanks!
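I would start by filtering out the items you're not interested in, like this:
data2 = {k: v for k, v in data.items() if v.get("Vendor") == "comp"}
This drops every module you don't need. It is slightly inefficient, because you then walk the dictionary a second time to get the data into the format you want, but as a first step it is easier to reason about, which helps.
At this point you can iterate over the filtered dictionary again as needed. It will look something like this: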
{'Module1': {'Attributes': {'some'},
             'Description': '',
             'Layer': '1',
             'SourceDir': 'pathModule1',
             'Vendor': 'comp',
             'components': {'Component1': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'},
                            'Component2': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'}}},
 'Module2': {'Attributes': {'some'},
             'Description': '',
             'Layer': '2',
             'SourceDir': 'pathModule2',
             'Vendor': 'comp',
             'components': {'Component1': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'},
                            'Component2': {'includes': ['include1',
                                                        'include2',
                                                        'include3',
                                                        'include4',
                                                        'include5'],
                                           'path': 'something'}}}}
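As a side note on why the filter uses `v.get("Vendor")` instead of `v["Vendor"]`: in the sample data Module4 has no Vendor key at all, so direct indexing would raise. The snippet below is a minimal sketch of that difference; `sample` is a trimmed-down, hypothetical stand-in for the real data, not the actual contents of DB.json.

```python
# Hypothetical, trimmed-down stand-in for the data in the question.
sample = {
    "Module1": {"SourceDir": "pathModule1", "Vendor": "comp"},
    "Module3": {"SourceDir": "path", "Vendor": ""},
    "Module4": {"SourceDir": "path"},  # no "Vendor" key at all
}

# dict.get() returns None for a missing key, so Module4 is simply skipped.
kept = {k: v for k, v in sample.items() if v.get("Vendor") == "comp"}
print(list(kept))

# Direct indexing raises KeyError as soon as it reaches Module4.
try:
    [k for k, v in sample.items() if v["Vendor"] == "comp"]
    raised = False
except KeyError:
    raised = True
print(raised)
```

This is also why the try/except KeyError in the question is not needed once `.get()` is used.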
To print only the source directory and the components, you can do this:
for k, v in data2.items():
    print(k, v["SourceDir"], v["components"])
This gives you:
Module1 pathModule1 {'Component1': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}, 'Component2': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}}
Module2 pathModule2 {'Component1': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}, 'Component2': {'path': 'something', 'includes': ['include1', 'include2', 'include3', 'include4', 'include5']}}
Edit:
To refine the output further, you can change the loop above to:
for k, v in data2.items():
    components = [(comp_name, comp_data["path"], comp_data["includes"]) for comp_name, comp_data in v["components"].items()]
    print(k, v["SourceDir"], components)
This gives you:
Module1 pathModule1 [('Component1', 'something', ['include1', 'include2', 'include3', 'include4', 'include5']), ('Component2', 'something', ['include1', 'include2', 'include3', 'include4', 'include5'])]
Module2 pathModule2 [('Component1', 'something', ['include1', 'include2', 'include3', 'include4', 'include5']), ('Component2', 'something', ['include1', 'include2', 'include3', 'include4', 'include5'])]
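If the second pass over the dictionary bothers you, the filter and the reshaping can be folded into one loop. The following is only a sketch under a few assumptions: the function name `extract_vendor_modules` and the `sample` data are made up for illustration, and modules without a "components" key are assumed to simply have no components.

```python
def extract_vendor_modules(data, vendor="comp"):
    """Return a list of (module_name, source_dir, components) tuples.

    components is a list of (component_name, path, includes) tuples.
    Hypothetical helper, not part of any library.
    """
    result = []
    for name, module in data.items():
        # Skip modules whose Vendor is missing or different.
        if module.get("Vendor") != vendor:
            continue
        components = [
            (comp_name, comp.get("path"), comp.get("includes", []))
            # Assumption: a missing "components" key means no components.
            for comp_name, comp in module.get("components", {}).items()
        ]
        result.append((name, module.get("SourceDir"), components))
    return result


# Hypothetical sample shaped like the data in the question.
sample = {
    "Module1": {
        "SourceDir": "pathModule1",
        "Vendor": "comp",
        "components": {
            "Component1": {"path": "something", "includes": ["include1", "include2"]},
        },
    },
    "Module3": {"SourceDir": "path", "Vendor": ""},
    "Module4": {"SourceDir": "path"},
}

rows = extract_vendor_modules(sample)
for row in rows:
    print(*row)
```

With the real file you would pass the result of `json.load(f)` in place of `sample`.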