How to calculate the mean of list elements inside a dictionary in Python?

Hi, I have a dictionary in Python that looks like this:

{{'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]}},
{'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}}}

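where NN3-X is the ID of a time series, diffe and mas are the names of the models, and the number after the underscore is the run number of the model.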

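I want the i-th element of each list to be averaged with the i-th element of the other list that has the same model name. For example, 1 from diffe_1 and 5 from diffe_2 have a mean of 3. The final result would look like this: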

{{'NN3-001': {'diffe':[3,4,5,6], 'mas':[30,40,50,60]}},
{'NN3-002': {'diffe':[16,17,18,19], 'mas':[300,400,500,600]}}}

Thanks.

First: your example is not a valid dictionary. Some of the {} braces are in the wrong place.

It should be

{
    'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
    'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}

To get a single list, you can use

values = data['NN3-001']['diffe_1'] 

and then you can compute its mean

mean = sum(values)/len(values)
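
As a side note, the standard library's statistics.mean gives the same value as sum/len for a non-empty list. A minimal sketch, assuming a cut-down copy of the data with a single list:

```python
from statistics import mean

# Cut-down copy of the data, just enough to show one list
data = {'NN3-001': {'diffe_1': [1, 2, 3, 4]}}
values = data['NN3-001']['diffe_1']

manual = sum(values) / len(values)   # explicit formula
library = mean(values)               # standard-library equivalent

print(manual, library)
```

For an empty list the two differ only in the error raised: sum/len raises ZeroDivisionError, while statistics.mean raises StatisticsError.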

To process all the lists, you have to use for-loops with dict.items()

dictionary = {
    'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
    'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}

for name, values in dictionary.items():
    print('=== time serie:', name, '===')
    for key, data in values.items():
        print('  key:', key)
        print(' data:', data)
        print(' mean:', sum(data)/len(data))
        print('---')

Result:

=== time serie: NN3-001 ===
  key: diffe_1
 data: [1, 2, 3, 4]
 mean: 2.5
---
  key: mas_1
 data: [10, 20, 30, 40]
 mean: 25.0
---
  key: diffe_2
 data: [5, 6, 7, 8]
 mean: 6.5
---
  key: mas_2
 data: [50, 60, 70, 80]
 mean: 65.0
---
=== time serie: NN3-002 ===
  key: diffe_1
 data: [14, 15, 16, 17]
 mean: 15.5
---
  key: mas_1
 data: [100, 200, 300, 400]
 mean: 250.0
---
  key: diffe_2
 data: [18, 19, 20, 21]
 mean: 19.5
---
  key: mas_2
 data: [500, 600, 700, 800]
 mean: 650.0
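
If the means should be stored rather than printed, the same two loops can be written as a nested dict comprehension. This is only a sketch; the name means is chosen here for illustration:

```python
dictionary = {
    'NN3-001': {'diffe_1': [1, 2, 3, 4], 'mas_1': [10, 20, 30, 40],
                'diffe_2': [5, 6, 7, 8], 'mas_2': [50, 60, 70, 80]},
    'NN3-002': {'diffe_1': [14, 15, 16, 17], 'mas_1': [100, 200, 300, 400],
                'diffe_2': [18, 19, 20, 21], 'mas_2': [500, 600, 700, 800]},
}

# Outer level: time series id; inner level: one mean per list
means = {
    name: {key: sum(data) / len(data) for key, data in values.items()}
    for name, values in dictionary.items()
}

print(means['NN3-001'])
```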

EDIT:

After the question was changed, I see that you need zip(diffe_1, diffe_2) to create the pairs.

dictionary = {
    'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
    'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}

result = {}

for name, values in dictionary.items():
    print('=== time serie:', name, '===')
    
    result[name] = {'diff':[], 'mas':[]}
    
    print('--- diffe_1, diffe_2 ---')
    for a, b in zip(values['diffe_1'],values['diffe_2']):
        mean = int( (a+b)/2 )
        print(a, '&', b, '=>', mean)
        result[name]['diff'].append(mean)
        
    print('--- mas_1, mas_2 ---')
    for a, b in zip(values['mas_1'],values['mas_2']):
        mean = int( (a+b)/2 )
        print(a, '&', b, '=>', mean)
        result[name]['mas'].append(mean)

print(result)      

This gives

=== time serie: NN3-001 ===
--- diffe_1, diffe_2 ---
1 & 5 => 3
2 & 6 => 4
3 & 7 => 5
4 & 8 => 6
--- mas_1, mas_2 ---
10 & 50 => 30
20 & 60 => 40
30 & 70 => 50
40 & 80 => 60
=== time serie: NN3-002 ===
--- diffe_1, diffe_2 ---
14 & 18 => 16
15 & 19 => 17
16 & 20 => 18
17 & 21 => 19
--- mas_1, mas_2 ---
100 & 500 => 300
200 & 600 => 400
300 & 700 => 500
400 & 800 => 600


{
'NN3-001': {'diff': [3, 4, 5, 6], 'mas': [30, 40, 50, 60]},  
'NN3-002': {'diff': [16, 17, 18, 19], 'mas': [300, 400, 500, 600]}
}
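
One caveat: int((a+b)/2) drops the fractional part, so it only equals the true mean when a+b is even, which happens to hold for every pair in this data. A small sketch of the alternatives, using the made-up values 2 and 5 (they are not from the question):

```python
a, b = 2, 5  # illustrative values, not from the question's data

exact = (a + b) / 2       # true mean as a float
truncated = int(exact)    # drops the fractional part
rounded = round(exact)    # nearest integer, ties go to the even integer
floored = (a + b) // 2    # integer floor division, no float involved

print(exact, truncated, rounded, floored)
```

Which one is appropriate depends on whether the averaged series should stay integer-valued or not.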

You can also use a loop, for prefix in ['diffe', 'mas']:, to shorten the code.

dictionary = {
    'NN3-001': {'diffe_1':[1,2,3,4],'mas_1':[10,20,30,40],'diffe_2':[5,6,7,8],'mas_2':[50,60,70,80]},
    'NN3-002': {'diffe_1':[14,15,16,17],'mas_1':[100,200,300,400],'diffe_2':[18,19,20,21],'mas_2':[500,600,700,800]}
}

result = {}

for name, values in dictionary.items():
    print('=== time serie:', name, '===')
    
    
    result[name] = {}
    
    for prefix in ['diffe', 'mas']:

        print('--- prefix:', prefix, '---')
        
        result[name][prefix] = []

        for a, b in zip(values[prefix+'_1'],values[prefix+'_2']):
            mean = int( (a+b)/2 )
            print(a, '&', b, '=>', mean)
            result[name][prefix].append(mean)
        
print(result)
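
The loop above assumes exactly two runs per model (_1 and _2) and a fixed list of prefixes. If there may be more runs, one possible sketch groups the lists by the part of the key before the last underscore and averages them column by column with zip(*lists). The function name mean_by_prefix is made up for this illustration, and the means are left as floats instead of being converted with int().

```python
from collections import defaultdict

def mean_by_prefix(dictionary):
    """Average same-prefix lists element-wise for every time series."""
    result = {}
    for name, values in dictionary.items():
        # Collect all lists that share a prefix, e.g. 'diffe_1' and 'diffe_2'
        groups = defaultdict(list)
        for key, data in values.items():
            prefix = key.rsplit('_', 1)[0]
            groups[prefix].append(data)
        # zip(*lists) yields one tuple per position across all runs
        result[name] = {
            prefix: [sum(column) / len(column) for column in zip(*lists)]
            for prefix, lists in groups.items()
        }
    return result

dictionary = {
    'NN3-001': {'diffe_1': [1, 2, 3, 4], 'mas_1': [10, 20, 30, 40],
                'diffe_2': [5, 6, 7, 8], 'mas_2': [50, 60, 70, 80]},
    'NN3-002': {'diffe_1': [14, 15, 16, 17], 'mas_1': [100, 200, 300, 400],
                'diffe_2': [18, 19, 20, 21], 'mas_2': [500, 600, 700, 800]},
}

averaged = mean_by_prefix(dictionary)
print(averaged)
```

Note that zip stops at the shortest list, so runs of unequal length would be silently truncated.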

First, yes, your dictionary is invalid, but that is probably because you squeezed it onto two lines. You probably meant to write:

dictionary = {
    'NN3-001': {
        'diffe_1':[1,2,3,4],
        'mas_1':[10,20,30,40],
        'diffe_2':[5,6,7,8],
        'mas_2':[50,60,70,80],
        },
    'NN3-002': {
        'diffe_1':[14,15,16,17],
        'mas_1':[100,200,300,400],
        'diffe_2':[18,19,20,21],
        'mas_2':[500,600,700,800],
        },
    }

As for a function that computes the mean:

import numpy as np

def compute_mean(dictionary):
    new_dictionary = {}
    # Loop on 'NN3-' level
    for key, sub_dictionary in dictionary.items():
        # Running sums per model name, and how many lists went into each sum
        new_sub_dictionary, accumulated_arrays = {}, {}
        # Loop on 'diffe_' level ('values' avoids shadowing the built-in list)
        for sub_key, values in sub_dictionary.items():
            # Extract the sub_key without the _n
            sub_key = sub_key.split('_')[0]
            # If we already encountered this sub_key
            if sub_key in new_sub_dictionary:
                new_sub_dictionary[sub_key] += np.array(values)
                accumulated_arrays[sub_key] += 1
            # If we haven't encountered this sub_key
            else:
                # Float dtype so the in-place sum also accepts float input
                new_sub_dictionary[sub_key] = np.array(values, dtype=float)
                # The first list counts as one, otherwise the mean is wrong
                accumulated_arrays[sub_key] = 1
        # Compute mean and convert back to list
        for sub_key, array in new_sub_dictionary.items():
            new_sub_dictionary[sub_key] = (array / accumulated_arrays[sub_key]).tolist()
        # Add to the main dictionary
        new_dictionary[key] = new_sub_dictionary
    return new_dictionary