Python: Count the occurrences of a value in a nested dictionary if another value is true
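I have a nested dictionary that looks like this:
{
1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}}
}
I now need the number of occurrences of each country, together with the number of people who answered yes or no. At the moment I only collect the number of occurrences of each country: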
nationalities = ['USA', 'Hong Kong', 'France' ...]
result = []
for countries in nationalities:
    cnt = [item for l in [v2 for v1 in dictionary1.values() for v2 in v1.values()] for item in l].count(countries)
    result.append(countries + ': ' + str(cnt))
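So with my data table I get something like
['Hong Kong: 2', 'France: 2', 'Italy: 3']
However, I would also like to know how many of those people answered yes and how many answered no, so that I get a list of the form ['Hong Kong: 2 1 1'], where the first number is the total and the second and third are the yes and no counts respectively.
Thank you for your help.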
a={
1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}}
}
results = []
nationalities = ['USA', 'Hong Kong', 'France']
for country in nationalities:
    countryyes = 0
    countryno = 0
    for row in a.values():
        # str({'USA'}) is "{'USA'}", so [2:-2] strips the braces and quotes.
        # This only works while each set holds exactly one string.
        if str(row['Country'])[2:-2] == country:
            if str(row['Answer'])[2:-2] == 'yes':
                countryyes += 1
            elif str(row['Answer'])[2:-2] == 'no':
                countryno += 1
    results.append(country + ': ' + str(countryyes + countryno) + ' ' + str(countryyes) + ' ' + str(countryno))
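A couple of notes. First, I changed countries to country (a plural name for the loop variable of a for loop is unusual). Second, your data stores each name, answer and country inside a set; I think you would be better off storing them as plain strings.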
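To illustrate the second note, here is a minimal sketch of converting the records so that every field is a plain string. It assumes each set holds exactly one value, as in the sample data; the helper name `unwrap` and the variables `flat` and `hk_yes` are made up for this example.

```python
# Sample data from the question: each field is wrapped in a one-element set.
a = {
    1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
    2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
    3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}},
}

def unwrap(value):
    """Return the only element of a one-element set (hypothetical helper)."""
    (item,) = value  # raises ValueError if the set does not hold exactly one item
    return item

# Same records, with every field stored as a plain string.
flat = {key: {field: unwrap(value) for field, value in row.items()}
        for key, row in a.items()}

# Comparisons no longer need string slicing.
hk_yes = sum(1 for row in flat.values()
             if row['Country'] == 'Hong Kong' and row['Answer'] == 'yes')
```

With plain strings, the `str(...)[2:-2]` slicing in the loop above could be replaced by direct comparisons such as `row['Country'] == country`.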
Here is a possible solution. It uses a defaultdict to build the result dictionary by counting, for each country, how many answers equal yes or no:
from collections import defaultdict
dictionary1 = {
1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}}
}
nationalities = ['USA', 'Hong Kong', 'France']
result = defaultdict(list)
for countries in nationalities:
    # Summing booleans counts the entries that match both the answer and the country.
    [yes, no] = [sum(list(d['Answer'])[0] == answer and list(d['Country'])[0] == countries for d in dictionary1.values()) for answer in ['yes', 'no']]
    result[countries] = [ yes+no, yes, no ]
print(dict(result))
For your sample data, this gives
{
'USA': [1, 1, 0],
'Hong Kong': [2, 1, 1],
'France': [0, 0, 0]
}
You can then convert this into a list of strings with
result = [ f"{key}: {' '.join(map(str, counts))}" for key, counts in result.items()]
which gives:
['USA: 1 1 0', 'Hong Kong: 2 1 1', 'France: 0 0 0']
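The solution above scans the whole dictionary twice for every country. If the data is large, a single pass may be preferable. The following is only a sketch of that alternative, not part of the original answer: it assumes each 'Country' set holds exactly one value, and the names `counts` and `summary` are chosen for this example.

```python
from collections import Counter, defaultdict

dictionary1 = {
    1: {'Name': {'John'}, 'Answer': {'yes'}, 'Country': {'USA'}},
    2: {'Name': {'Julia'}, 'Answer': {'no'}, 'Country': {'Hong Kong'}},
    3: {'Name': {'Adam'}, 'Answer': {'yes'}, 'Country': {'Hong Kong'}},
}
nationalities = ['USA', 'Hong Kong', 'France']

# counts[country] maps each answer to the number of times it was given.
counts = defaultdict(Counter)
for d in dictionary1.values():
    country = next(iter(d['Country']))  # assumes a one-element set
    counts[country].update(d['Answer'])

# Missing countries and missing answers both count as 0.
summary = [
    f"{c}: {counts[c]['yes'] + counts[c]['no']} {counts[c]['yes']} {counts[c]['no']}"
    for c in nationalities
]
print(summary)
```

As in the answer above, the total is the sum of the yes and no counts, so any other answer value would be ignored.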
I would use Counter to count the answers and groupby() to group the entries by country:
from collections import Counter
from itertools import groupby

dictionary1 = {...} # input data

def group_func(entry):
    # Group by the country name itself. Sets only compare by subset relation,
    # so sorting on the set would not reliably bring equal countries together.
    return next(iter(entry['Country']))

result = []
for country, items in groupby(sorted(dictionary1.values(), key=group_func), group_func):
    answers = Counter(answer.lower() for i in items for answer in i['Answer'])
    result.append(f'{country} {sum(answers.values())} {answers.get("yes", 0)} {answers.get("no", 0)}')
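One detail worth checking when grouping this data: groupby() only merges adjacent entries, so the sort key has to order equal countries next to each other. Python sets compare by subset relation, which makes them a poor sort key. The sketch below illustrates this with a string key instead; the `rows` list and the helper `country_of` are assumptions made for this example.

```python
from itertools import groupby

rows = [
    {'Answer': {'yes'}, 'Country': {'USA'}},
    {'Answer': {'no'}, 'Country': {'Hong Kong'}},
    {'Answer': {'yes'}, 'Country': {'USA'}},
]

# Neither one-element set is a subset of the other, so both comparisons are False
# and sorting by the sets themselves gives no meaningful order.
assert not ({'USA'} < {'Hong Kong'}) and not ({'Hong Kong'} < {'USA'})

def country_of(row):
    """Return the single country name stored in the row's set (hypothetical helper)."""
    return next(iter(row['Country']))

# Sorting by the string puts equal countries side by side, so groupby() merges them.
grouped = {
    country: [next(iter(r['Answer'])) for r in items]
    for country, items in groupby(sorted(rows, key=country_of), key=country_of)
}
```

Here both USA rows end up in the same group even though they are not adjacent in the input.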