How can I transform plain-text output into JSON using a few rules?
I am parsing the output of a program that prints two words per line, and lines can repeat. The output is sorted.
a 1
a 2
b 5
c 6
c 6
d 3
e 1
e 1
e 2
f 0
I want to create a list of dictionaries like the following (using the input data I provided):
[
  {"name": "a", "numbers": [{"number": "1", "duplicated": false},
                            {"number": "2", "duplicated": false}]},
  {"name": "b", "numbers": [{"number": "5", "duplicated": false}]},
  {"name": "c", "numbers": [{"number": "6", "duplicated": true}]},
  {"name": "d", "numbers": [{"number": "3", "duplicated": false}]},
  {"name": "e", "numbers": [{"number": "1", "duplicated": true},
                            {"number": "2", "duplicated": false}]},
  {"name": "f", "numbers": [{"number": "0", "duplicated": false}]}
]
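How can I do this, ideally using nothing beyond the standard library?
Everything I have tried looks big and ugly.
I have no code to show because I could not get anything to work.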
You can build a dictionary of the form {name_a: {value1: count1, value2: count2}, ...}
and then generate your output from it:
from collections import defaultdict

# Map each name to a counter of its values: {name: {value: count}}
name_to_values = defaultdict(lambda: defaultdict(int))

with open('data.txt') as f:
    for line in f:
        name, value = line.split()
        name_to_values[name][value] += 1

out = []
for name in name_to_values:
    d = {'name': name}
    # A value counts as duplicated if it appeared more than once for this name
    d['numbers'] = [{'number': value, 'duplicated': count > 1}
                    for value, count in name_to_values[name].items()]
    out.append(d)
Output:
print(out)
# [{'name': 'a', 'numbers': [{'number': '1', 'duplicated': False}, {'number': '2', 'duplicated': False}]},
# {'name': 'b', 'numbers': [{'number': '5', 'duplicated': False}]},
# {'name': 'c', 'numbers': [{'number': '6', 'duplicated': True}]},
# {'name': 'd', 'numbers': [{'number': '3', 'duplicated': False}]},
# {'name': 'e', 'numbers': [{'number': '1', 'duplicated': True},
# {'number': '2', 'duplicated': False}]},
# {'name': 'f', 'numbers': [{'number': '0', 'duplicated': False}]}]
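The list printed above is still a Python object (note `True`/`False` and single quotes), not JSON text. A minimal sketch of the last step, assuming the standard `json` module is acceptable; the short `out` literal here is only an illustrative stand-in for the full list built above:

```python
import json

# Illustrative stand-in for the `out` list built above (two entries only).
out = [
    {'name': 'c', 'numbers': [{'number': '6', 'duplicated': True}]},
    {'name': 'd', 'numbers': [{'number': '3', 'duplicated': False}]},
]

# json.dumps converts Python True/False to JSON true/false
# and emits double-quoted keys and strings.
json_text = json.dumps(out, indent=2)
print(json_text)
```

Dropping `indent=2` gives a compact single-line form, and `json.dump(out, f)` writes to an open file instead of returning a string.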
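First, simplify the problem by storing the parsed lines in a dictionary, then use that dictionary to generate the expected output, as shown below: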
from pprint import pprint

lines = """a 1
a 2
b 5
c 6
c 6
d 3
e 1
e 1
e 2
f 0"""

# Collect every number seen for each name, in input order
d = dict()
for line in lines.split("\n"):
    name, number = line.strip().split()
    if name in d:
        d[name].append(number)
    else:
        d[name] = [number]

results = []
for name, numbers in d.items():
    number_list = []
    # dict.fromkeys deduplicates while keeping first-seen order (set() would not)
    for num in dict.fromkeys(numbers):
        number_list.append({
            'number': num,
            # Keep a real boolean so it serializes to JSON true/false
            'duplicated': numbers.count(num) > 1,
        })
    results.append({
        'name': name,
        'numbers': number_list,
    })

pprint(results)
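Because the question says the input is already sorted, equal names are adjacent, so the grouping can also be done in a single pass. The following is a sketch of that alternative using `itertools.groupby` and `collections.Counter` from the standard library. It is not what either answer above does, the function name `parse_pairs` is made up for illustration, and it assumes every non-blank line holds exactly two whitespace-separated words:

```python
import json
from collections import Counter
from itertools import groupby


def parse_pairs(text):
    """Group sorted 'name number' lines by name and flag repeated numbers."""
    # Skip blank lines; each remaining line is assumed to hold exactly two words.
    pairs = [line.split() for line in text.splitlines() if line.strip()]
    result = []
    # groupby only merges adjacent equal keys, which is enough for sorted input.
    for name, group in groupby(pairs, key=lambda pair: pair[0]):
        counts = Counter(number for _, number in group)
        result.append({
            'name': name,
            # Counter keeps first-seen order (Python 3.7+), so numbers stay in input order.
            'numbers': [{'number': number, 'duplicated': count > 1}
                        for number, count in counts.items()],
        })
    return result


sample = "a 1\na 2\nc 6\nc 6\ne 1\ne 1\ne 2\n"
print(json.dumps(parse_pairs(sample)))
```

If the input might not be sorted, sort `pairs` by name before grouping, or fall back to the dictionary-based approaches above, which do not depend on ordering.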