How to transform some plain text output into JSON using some rules?

I'm parsing the output of a program that prints two words per line. The words can repeat, and the output is sorted.

a  1
a  2
b  5
c  6
c  6
d  3
e  1
e  1
e  2
f  0

I want to create a list of dictionaries like the following (using the input data I provided):

[
  {"name": "a", "numbers": [{"number": "1", "duplicated": false},
                            {"number": "2", "duplicated": false}]},
  {"name": "b", "numbers": [{"number": "5", "duplicated": false}]},
  {"name": "c", "numbers": [{"number": "6", "duplicated": true}]},
  {"name": "d", "numbers": [{"number": "3", "duplicated": false}]},
  {"name": "e", "numbers": [{"number": "1", "duplicated": true},
                            {"number": "2", "duplicated": false}]},
  {"name": "f", "numbers": [{"number": "0", "duplicated": false}]}
]

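How can I do this? If possible, without using anything beyond the standard library.

Everything I've tried looks big and ugly.

There's no code because I couldn't get any result.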

You can build a dictionary like {name_a: {value1: count1, value2: count2}, ...} and then generate your output from it:

from collections import defaultdict

# Maps each name to a {value: occurrence count} dictionary
name_to_values = defaultdict(lambda: defaultdict(int))
with open('data.txt') as f:
    for line in f:
        name, value = line.split()
        name_to_values[name][value] += 1

out = []
for name in name_to_values:
    d = {'name': name}
    # A value counts as duplicated when it was seen more than once for this name
    d['numbers'] = [{'number': value, 'duplicated': count > 1}
                    for value, count in name_to_values[name].items()]
    out.append(d)

Output:

print(out)

# [{'name': 'a', 'numbers': [{'number': '1', 'duplicated': False}, {'number': '2', 'duplicated': False}]},
#  {'name': 'b', 'numbers': [{'number': '5', 'duplicated': False}]},
#  {'name': 'c', 'numbers': [{'number': '6', 'duplicated': True}]},
#  {'name': 'd', 'numbers': [{'number': '3', 'duplicated': False}]},
#  {'name': 'e', 'numbers': [{'number': '1', 'duplicated': True}, {'number': '2', 'duplicated': False}]},
#  {'name': 'f', 'numbers': [{'number': '0', 'duplicated': False}]}]
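
The question asks for JSON rather than a Python list, so the result still has to be serialized. A minimal sketch using the standard-library `json` module, assuming the input is held in a string instead of `data.txt` (the names `sample` and `build` are made up for illustration):

```python
import json
from collections import defaultdict

# Hypothetical sample input; in practice this would be the program's output.
sample = """a  1
a  2
c  6
c  6"""

def build(text):
    """Group values by name and flag values seen more than once."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, value = line.split()
        counts[name][value] += 1
    return [
        {'name': name,
         'numbers': [{'number': v, 'duplicated': c > 1}
                     for v, c in values.items()]}
        for name, values in counts.items()
    ]

# json.dumps converts Python True/False into JSON true/false
json_text = json.dumps(build(sample), indent=2)
print(json_text)
```

Keeping `duplicated` as a real boolean matters here: a string such as `'True'` would be written out as a quoted JSON string instead of `true`.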

First, simplify the output by storing it in a dictionary, then use that dictionary to generate the expected output, as shown below:

from pprint import pprint

lines = """a  1
a  2
b  5
c  6
c  6
d  3
e  1
e  1
e  2
f  0"""

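# Collect every number seen for each name, in input order
d = {}
for line in lines.split("\n"):
    name, number = line.strip().split()
    d.setdefault(name, []).append(number)

results = []
for name, numbers in d.items():
    number_list = []
    # dict.fromkeys() removes duplicates while keeping first-seen order
    # (a plain set() would return the numbers in arbitrary order)
    for num in dict.fromkeys(numbers):
        number_list.append({
            'number': num,
            # keep a real boolean so it serializes as JSON true/false
            'duplicated': numbers.count(num) > 1,
        })
    results.append({
        'name': name,
        'numbers': number_list,
    })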
pprint(results)
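
Because the question states that the output is already sorted, the grouping step could also be done with `itertools.groupby`, which merges adjacent lines that share a name. This is only a sketch under that assumption (it would split a name into several entries if the input were not sorted); `sample_output` and `group_sorted` are illustrative names, not part of the original answer:

```python
from collections import Counter
from itertools import groupby

# Hypothetical sample; assumed to be sorted by name, as in the question.
sample_output = """a  1
a  2
b  5
c  6
c  6
e  1
e  1
e  2"""

def group_sorted(text):
    """Build the list of {'name', 'numbers'} dicts from sorted input."""
    pairs = [line.split() for line in text.splitlines() if line.strip()]
    results = []
    # groupby only merges adjacent items, which is enough for sorted input
    for name, group in groupby(pairs, key=lambda pair: pair[0]):
        # Counter keeps first-seen order of the numbers (Python 3.7+)
        counts = Counter(number for _, number in group)
        results.append({
            'name': name,
            'numbers': [{'number': n, 'duplicated': c > 1}
                        for n, c in counts.items()],
        })
    return results

grouped = group_sorted(sample_output)
print(grouped)
```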