How do I change array values with dict keys?

I have a 3D array like this, with shape (20001, 128, 128):

array([[[48, 48, 48, ..., 48, 48, 48],
        [48, 48, 48, ..., 48, 48, 48],
        [48, 48, 48, ..., 48, 48, 48],
        ...,
       [[12, 12, 12, ..., 12, 12, 12],
        [12, 12, 12, ..., 12, 12, 12],
        [12, 12, 12, ..., 12, 12, 12],
        ...,
        [19, 19, 19, ..., 12, 12, 12],
        [19, 19, 19, ..., 19, 12, 12],
        [19, 19, 19, ..., 19, 19, 19]],

I have a dictionary that looks like this:

{1: [1, 39],
 2: [2, 5, 9, 20, 32, 42, 47, 72, 88, 91, 95],
 3: [3, 49, 55],
 4: [4, 24, 34, 40, 53, 76, 81, 90, 96],
 5: [6, 17, 30, 48, 83],
 6: [7, 13, 15, 16, 27, 44, 51, 54, 56, 75],
 7: [8, 50],
 8: [10, 19, 22, 35, 61, 63, 65],
 9: [11, 12, 21, 46, 52, 69, 78, 84, 89],
 10: [14, 36, 74],
 11: [18],
 12: [23, 38, 66, 97],
 13: [25],
 14: [26, 28, 29, 62, 64, 86, 94],
 15: [31, 59, 85],
 16: [33, 80],
 17: [37, 45, 60],
 18: [41, 92, 93],
 19: [43, 77, 79, 82],
 20: [57, 67],
 21: [58],
 22: [68],
 23: [70],
 24: [71],
 25: [73, 87],
 0: [0]}

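What I want is this: if an array value appears in one of the dict's value lists, replace that array value with the corresponding key, like this ->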

array([[[5, 5, 5, ..., 5, 5, 5],
        [5, 5, 5, ..., 5, 5, 5],
        [5, 5, 5, ..., 5, 5, 5],
        ...,
        [9, 9, 9, ..., 9, 9, 9],
        [9, 9, 9, ..., 9, 9, 9],
        [9, 9, 9, ..., 9, 9, 9]],
        ...,
        [8, 8, 8, ..., 9, 9, 9],
        [8, 8, 8, ..., 8, 9, 9],
        [8, 8, 8, ..., 8, 8, 8]],

because 48 is in the list for key 5, 12 is in the list for key 9, and so on.

arr = [  # Example list of lists - arbitrary values
    [11, 11, 12, 13],
    [24, 24, 24, 35],
    [16, 27, 27, 8]
]

dictionary = {
    1: [1, 39],
    2: [2, 5, 9, 20, 32, 42, 47, 72, 88, 91, 95],
    3: [3, 49, 55],
    4: [4, 24, 34, 40, 53, 76, 81, 90, 96],
    5: [6, 17, 30, 48, 83],
    6: [7, 13, 15, 16, 27, 44, 51, 54, 56, 75],
    7: [8, 50],
    8: [10, 19, 22, 35, 61, 63, 65],
    9: [11, 12, 21, 46, 52, 69, 78, 84, 89],
    10: [14, 36, 74],
    11: [18],
    12: [23, 38, 66, 97],
    13: [25],
    14: [26, 28, 29, 62, 64, 86, 94],
    15: [31, 59, 85],
    16: [33, 80],
    17: [37, 45, 60],
    18: [41, 92, 93],
    19: [43, 77, 79, 82],
    20: [57, 67],
    21: [58],
    22: [68],
    23: [70],
    24: [71],
    25: [73, 87],
    0: [0]
}

def get_key(search_value):
    for key, num_list in dictionary.items():
        if search_value in num_list:
            return key

for sub_list in arr:
    for index, value in enumerate(sub_list):
        new_val = get_key(value)  # get the key from 'dict'
        sub_list[index] = new_val  # replace old subarray value

print(arr)  # QED - see new array below
# [
#     [9, 9, 9, 6],
#     [4, 4, 4, 8],
#     [6, 6, 6, 7]
# ]

You should invert your original dictionary:

lookup_dict = {1: [1, 39],
 2: [2, 5, 9, 20, 32, 42, 47, 72, 88, 91, 95],
 3: [3, 49, 55],
 4: [4, 24, 34, 40, 53, 76, 81, 90, 96],
 5: [6, 17, 30, 48, 83],
 6: [7, 13, 15, 16, 27, 44, 51, 54, 56, 75],
 7: [8, 50],
 8: [10, 19, 22, 35, 61, 63, 65],
 9: [11, 12, 21, 46, 52, 69, 78, 84, 89],
 10: [14, 36, 74],
 11: [18],
 12: [23, 38, 66, 97],
 13: [25],
 14: [26, 28, 29, 62, 64, 86, 94],
 15: [31, 59, 85],
 16: [33, 80],
 17: [37, 45, 60],
 18: [41, 92, 93],
 19: [43, 77, 79, 82],
 20: [57, 67],
 21: [58],
 22: [68],
 23: [70],
 24: [71],
 25: [73, 87],
 0: [0]}

reversed_dict = {val: key for key, lst in lookup_dict.items() for val in lst}

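Now you can iterate over the input array and set each item in a new array after looking it up in reversed_dict. This is already more efficient than JRiggles's approach, because you don't need to scan through all the lists to find the new value.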
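As a minimal sketch of that dictionary-lookup step, the snippet below inverts a shortened mapping and fills a new array element by element. The names small_lookup, small_reversed, small_input and small_output are hypothetical and exist only for this illustration; the mapping is a cut-down version of the one in the question.

```python
import numpy as np

# Hypothetical, shortened mapping used only for illustration.
small_lookup = {5: [6, 17, 48], 8: [10, 19], 9: [11, 12]}

# Invert it: every list element becomes a key pointing at its original key.
small_reversed = {val: key for key, lst in small_lookup.items() for val in lst}

small_input = np.array([[48, 12], [19, 11]])
small_output = np.empty_like(small_input)

# np.ndenumerate yields (index tuple, value) pairs for an array of any shape.
for idx, value in np.ndenumerate(small_input):
    small_output[idx] = small_reversed[int(value)]

print(small_output.tolist())  # [[5, 9], [8, 9]]
```

Each lookup is a single hash access instead of a scan over every list, but the Python-level loop still visits every element one at a time.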

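However, if you put the values of reversed_dict into an array, so that each dictionary key is an index into that array, you can simply index that array with your input array. NumPy's integer array indexing then returns a result of the correct shape. I prefer this approach because it is faster: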

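import numpy as np

max_index = max(reversed_dict.keys())

# np.zeros defaults to float64, so the looked-up results are floats;
# pass dtype=int here if integer output is wanted.
lookup_array = np.zeros((max_index+1,))
for k, v in reversed_dict.items():
    lookup_array[k] = v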
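The loop above can probably also be written without a Python-level loop. The sketch below is one possible variant, not part of the original answer: it assumes all values are non-negative integers, uses a shortened hypothetical mapping (demo_reversed), and builds an integer table (demo_lookup) instead of a float one.

```python
import numpy as np

# Hypothetical, shortened inverted mapping (array value -> dict key).
demo_reversed = {6: 5, 48: 5, 10: 8, 19: 8, 11: 9, 12: 9}

# Positions in the table are the array values; entries are the dict keys.
indices = np.fromiter(demo_reversed.keys(), dtype=np.intp)
targets = np.fromiter(demo_reversed.values(), dtype=np.intp)

demo_lookup = np.zeros(indices.max() + 1, dtype=np.intp)
demo_lookup[indices] = targets

print(demo_lookup[[48, 12, 19]])  # [5 9 8]
```

Slots that never appear in the mapping stay at 0, which is indistinguishable from the real key 0. If that matters, a sentinel such as -1 could be used as the fill value instead.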

Finally:

input_array = np.array([[[48, 48, 48, 48, 48, 48],
        [48, 48, 48, 48, 48, 48],
        [48, 48, 48, 48, 48, 48]],
        
        [[12, 12, 12,  12, 12, 12],
        [12, 12, 12,  12, 12, 12],
        [12, 12, 12,  12, 12, 12]],
        
        [[19, 19, 19,  12, 12, 12],
        [19, 19, 19,  19, 12, 12],
        [19, 19, 19,  19, 19, 19]]])

output_array = lookup_array[input_array]

which gives:

array([[[5., 5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5., 5.]],

       [[9., 9., 9., 9., 9., 9.],
        [9., 9., 9., 9., 9., 9.],
        [9., 9., 9., 9., 9., 9.]],

       [[8., 8., 8., 9., 9., 9.],
        [8., 8., 8., 8., 9., 9.],
        [8., 8., 8., 8., 8., 8.]]])

The advantage of this approach is that it works as-is for an input_array of any shape, and it is very fast!
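To illustrate the "any shape" claim, here is a small sketch with a hypothetical table (demo_lookup) that maps only the three values from the example. It also shows what happens with a value outside the table, which is an edge case the approach above does not handle by itself.

```python
import numpy as np

# Hypothetical lookup table covering only 12, 19 and 48.
demo_lookup = np.zeros(49, dtype=np.intp)
demo_lookup[[12, 19, 48]] = [9, 8, 5]

flat = demo_lookup[np.array([48, 12, 19])]   # 1-D input -> shape (3,)
cube = demo_lookup[np.full((2, 3, 4), 12)]   # 3-D input -> shape (2, 3, 4)

print(flat.shape, cube.shape)  # (3,) (2, 3, 4)

# A value beyond the end of the table raises IndexError.
try:
    demo_lookup[np.array([100])]
except IndexError as exc:
    print("out of range:", exc)
```

Note that negative values would not raise: NumPy treats them as indices counted from the end of the table, so input data that can contain negatives needs a separate check.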

Timing the three approaches:

  1. JRiggles's approach, func1
  2. Looking up values in the reversed dictionary, func2
  3. Indexing into the new numpy array, func3

import timeit

input_array = np.random.randint(0, max_index, (100, 100, 100))

def get_key(search_value):
    for key, num_list in lookup_dict.items():
        if search_value in num_list:
            return key

def func1(arr):
    arr = np.copy(arr)
    for outer_lst in arr:
        for sub_list in outer_lst:
            for index, value in enumerate(sub_list):
                new_val = get_key(value)  # get the key from 'dict'
                sub_list[index] = new_val  # replace old subarray value
    return arr

def func2(arr):
    arr = np.copy(arr)
    for outer_lst in arr:
        for sub_list in outer_lst:
            for index, value in enumerate(sub_list):
                new_val = reversed_dict[value]
                sub_list[index] = new_val
    return arr

def func3(arr):
    return lookup_array[arr]

t1 = timeit.timeit("func1(input_array)", globals=globals(), number=2)
print("t1 =", t1)
t2 = timeit.timeit("func2(input_array)", globals=globals(), number=2)
print("t2 =", t2)
t3 = timeit.timeit("func3(input_array)", globals=globals(), number=2)
print("t3 =", t3)

On my machine, this gives:

t1 = 25.02508409996517
t2 = 1.2259434000588953
t3 = 0.01203500002156943

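In other words,

  • JRiggles's approach is about 20x slower than inverting the dictionary and looking up values in the inverted dictionary
  • JRiggles's approach is about 2000x slower than creating an array and indexing into it with numpy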

This is with a test array that contains ~300x fewer elements than your input array. With your array, the time savings will be significant.
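At the full size of (20001, 128, 128), which is about 3.3e8 elements, memory may matter as much as time: a float64 result takes roughly 2.6 GB, while a uint8 result takes roughly 0.33 GB. Since the keys in the question only run from 0 to 25, a uint8 lookup table should be enough. The sketch below assumes this and uses a small stand-in array; compact_lookup and big_input are hypothetical names.

```python
import numpy as np

# Hypothetical compact table; keys 0..25 fit comfortably in uint8.
compact_lookup = np.zeros(98, dtype=np.uint8)
compact_lookup[[12, 19, 48]] = [9, 8, 5]

# Small stand-in for the real (20001, 128, 128) array.
rng = np.random.default_rng(0)
big_input = rng.choice([12, 19, 48], size=(10, 128, 128))

# The result takes its dtype from the lookup table, not from the indices.
output = compact_lookup[big_input]

print(output.dtype, output.nbytes)  # uint8 163840
```

The result uses one byte per element regardless of the integer type of the input array.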