How to compare two lists and extract position, index and neighbors?

Suppose we have two lists:

list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8, 9, 10]

This is the basic structure:

     Columns
Rows    0      1      2      3      4
0       1      2      3      4      5
1       6      7      8      9      10

Each element should print one line containing some metadata (index, position and neighbors):

row/col   Print statement
0/0       "Row index=0, Column Index=0, Value=1, Value below=6, Value to the right = 2"
0/1       "Row index=0, Column Index=1, Value=2, Value below=7, Value to the right = 3"
0/2       "Row index=0, Column Index=2, Value=3, Value below=8, Value to the right = 4"
0/3       "Row index=0, Column Index=3, Value=4, Value below=9, Value to the right = 5"
0/4       "Row index=0, Column Index=4, Value=5, Value below=10, Value to the right = NaN"
1/0       "Row index=1, Column Index=0, Value=6, Value below=NaN, Value to the right = 7"
1/1       "Row index=1, Column Index=1, Value=7, Value below=NaN, Value to the right = 8"
1/2       "Row index=1, Column Index=2, Value=8, Value below=NaN, Value to the right = 9"
1/3       "Row index=1, Column Index=3, Value=9, Value below=NaN, Value to the right = 10"
1/4       "Row index=1, Column Index=4, Value=10, Value below=NaN, Value to the right = NaN"

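Is there something like a list comprehension, or any other method, that compares these two lists as fast as possible?

I would rather not use for/while loops, since they are considered very slow.

Edit: This solution will be the core of a big-data function that has to handle millions of comparisons. Using a conventional for loop would slow my function down considerably. That is why I am looking for a faster way to do this.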

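I think the best way to solve your problem is with a for loop. The code below accepts any number of rows and stores them in a nested dictionary.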

row1 = [1, 2, 3, 4, 5]
row2 = [6, 7, 8, 9, 10]
row3 = [11, 12, 13, 14, 15]
row4 = [16, 17, 18, 19, 20]
row5 = [21, 22, 23, 24, 25]

def create_array_dict(*args):
    # Map row index -> {column index -> value} for every row passed in.
    return {
        row_index: dict(enumerate(row_list))
        for row_index, row_list in enumerate(args)
    }

arr = create_array_dict(row1, row2, row3, row4, row5)

Once the dictionary is built, we can print the result:

def print_output(arr):
    for row, columns in arr.items():
        for column, value in columns.items():
            # A missing neighbor (last row or last column) is reported as None.
            below = arr.get(row + 1, {}).get(column)
            right = columns.get(column + 1)
            print(f"Row index={row}, Column Index={column}, Value={value},Value below={below}, Value to the right={right}")

print_output(arr)

Output:

Row index=0, Column Index=0, Value=1,Value below=6, Value to the right=2
Row index=0, Column Index=1, Value=2,Value below=7, Value to the right=3
Row index=0, Column Index=2, Value=3,Value below=8, Value to the right=4
Row index=0, Column Index=3, Value=4,Value below=9, Value to the right=5
Row index=0, Column Index=4, Value=5,Value below=10, Value to the right=None
Row index=1, Column Index=0, Value=6,Value below=11, Value to the right=7
Row index=1, Column Index=1, Value=7,Value below=12, Value to the right=8
Row index=1, Column Index=2, Value=8,Value below=13, Value to the right=9
Row index=1, Column Index=3, Value=9,Value below=14, Value to the right=10
Row index=1, Column Index=4, Value=10,Value below=15, Value to the right=None
Row index=2, Column Index=0, Value=11,Value below=16, Value to the right=12
Row index=2, Column Index=1, Value=12,Value below=17, Value to the right=13
Row index=2, Column Index=2, Value=13,Value below=18, Value to the right=14
Row index=2, Column Index=3, Value=14,Value below=19, Value to the right=15
Row index=2, Column Index=4, Value=15,Value below=20, Value to the right=None
Row index=3, Column Index=0, Value=16,Value below=21, Value to the right=17
Row index=3, Column Index=1, Value=17,Value below=22, Value to the right=18
Row index=3, Column Index=2, Value=18,Value below=23, Value to the right=19
Row index=3, Column Index=3, Value=19,Value below=24, Value to the right=20
Row index=3, Column Index=4, Value=20,Value below=25, Value to the right=None
Row index=4, Column Index=0, Value=21,Value below=None, Value to the right=22
Row index=4, Column Index=1, Value=22,Value below=None, Value to the right=23
Row index=4, Column Index=2, Value=23,Value below=None, Value to the right=24
Row index=4, Column Index=3, Value=24,Value below=None, Value to the right=25
Row index=4, Column Index=4, Value=25,Value below=None, Value to the right=None
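
The intermediate dictionary is not strictly needed: the same lines can be built from the nested lists directly. Below is a minimal sketch, assuming every row has the same length. The name `describe_grid` is a hypothetical helper, not part of the answer above, and it is still a plain Python loop, so it is not a speed-up.

```python
def describe_grid(rows):
    """Yield one description line per element of a rectangular list of lists."""
    n_rows = len(rows)
    for r, row in enumerate(rows):
        n_cols = len(row)
        for c, value in enumerate(row):
            # Neighbors that fall outside the grid are reported as None.
            below = rows[r + 1][c] if r + 1 < n_rows else None
            right = row[c + 1] if c + 1 < n_cols else None
            yield (f"Row index={r}, Column Index={c}, Value={value},"
                   f"Value below={below}, Value to the right={right}")


lines = list(describe_grid([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]))
print("\n".join(lines))
```

Because it is a generator, the lines can be consumed one at a time instead of being held in memory all at once.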

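If you want a list of strings, then you have to iterate over every element. String formatting like this works on scalar elements, not on whole arrays.

The values themselves can be generated with whole-array operations. But the result will be some form of array, not your list of strings.

For example: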

In [197]: import numpy as np
In [198]: list1 = [1, 2, 3, 4, 5]
     ...: list2 = [6, 7, 8, 9, 10]
In [199]: arr = np.array([list1,list2])
In [200]: arr                                                                                  
Out[200]: 
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

A simple way to iterate over the array while also getting the indices is:

In [201]: list(np.ndenumerate(arr))                                                            
Out[201]: 
[((0, 0), 1),
 ((0, 1), 2),
 ((0, 2), 3),
 ((0, 3), 4),
 ((0, 4), 5),
 ((1, 0), 6),
 ((1, 1), 7),
 ((1, 2), 8),
 ((1, 3), 9),
 ((1, 4), 10)]
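
As a rough sketch of how `np.ndenumerate` could drive the neighbor lookup directly, the loop below collects one tuple per element. The `records` list and the bounds checks are my own illustration, and this is still element-by-element iteration.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
n_rows, n_cols = arr.shape

records = []
for (r, c), value in np.ndenumerate(arr):
    # Neighbors outside the array are recorded as None.
    below = int(arr[r + 1, c]) if r + 1 < n_rows else None
    right = int(arr[r, c + 1]) if c + 1 < n_cols else None
    records.append((r, c, int(value), below, right))

print(records[0])   # first element: row 0, column 0
print(records[-1])  # last element has no neighbors
```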

That is still an iteration. We can get the indices in array form with:

In [215]: np.indices(arr.shape)                                                                
Out[215]: 
array([[[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1]],

       [[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]])
In [216]: I,J = np.indices(arr.shape)

and stack these indices with the values:

In [218]: np.stack((I.ravel(),J.ravel(),arr.ravel()),axis=1)                                   
Out[218]: 
array([[ 0,  0,  1],
       [ 0,  1,  2],
       [ 0,  2,  3],
       [ 0,  3,  4],
       [ 0,  4,  5],
       [ 1,  0,  6],
       [ 1,  1,  7],
       [ 1,  2,  8],
       [ 1,  3,  9],
       [ 1,  4, 10]])

To get the values below and to the right, we can build a padded array:

In [223]: arr1 = np.pad(arr.astype(float),[(0,1),(0,1)],mode='constant',constant_values=np.nan)
     ...:                                                                                      
In [224]: arr1                                                                                 
Out[224]: 
array([[ 1.,  2.,  3.,  4.,  5., nan],
       [ 6.,  7.,  8.,  9., 10., nan],
       [nan, nan, nan, nan, nan, nan]])

Note that I had to convert the array to float so that it can hold the float np.nan value.
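
If the float conversion is unwanted, one possible alternative is to pad with an integer sentinel instead of `np.nan`. This is only a sketch: the choice of `-1` as `SENTINEL` is an assumption that this value never occurs in the real data.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
SENTINEL = -1  # assumption: -1 never appears as a real value

# Pad one extra row at the bottom and one extra column on the right.
padded = np.pad(arr, [(0, 1), (0, 1)], mode="constant", constant_values=SENTINEL)

below = padded[1:, :-1]   # value one row down, same column
right = padded[:-1, 1:]   # value one column to the right, same row

print(below)
print(right)
```

The arrays keep the integer dtype of `arr`, at the cost of having to treat the sentinel specially later on.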

Combining everything:

In [225]: np.stack((I.ravel(),J.ravel(),arr.ravel(),arr1[1:,:-1].ravel(),arr1[:-1,1:].ravel()),
     ...: axis=1)                                                                              
Out[225]: 
array([[ 0.,  0.,  1.,  6.,  2.],
       [ 0.,  1.,  2.,  7.,  3.],
       [ 0.,  2.,  3.,  8.,  4.],
       [ 0.,  3.,  4.,  9.,  5.],
       [ 0.,  4.,  5., 10., nan],
       [ 1.,  0.,  6., nan,  7.],
       [ 1.,  1.,  7., nan,  8.],
       [ 1.,  2.,  8., nan,  9.],
       [ 1.,  3.,  9., nan, 10.],
       [ 1.,  4., 10., nan, nan]])
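
For reuse outside an interactive session, the steps above could be wrapped in one function. A minimal sketch follows; `neighbor_table` is a hypothetical name, and it assumes a rectangular 2-D input of numbers.

```python
import numpy as np


def neighbor_table(rows):
    """Return an (n, 5) float array: row, column, value, below, right."""
    arr = np.asarray(rows)
    I, J = np.indices(arr.shape)
    # Pad with NaN so the last row and column have a "missing" neighbor.
    padded = np.pad(arr.astype(float), [(0, 1), (0, 1)],
                    mode="constant", constant_values=np.nan)
    return np.stack(
        (I.ravel(), J.ravel(), arr.ravel(),
         padded[1:, :-1].ravel(),    # value below
         padded[:-1, 1:].ravel()),   # value to the right
        axis=1,
    )


table = neighbor_table([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(table)
```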

To get your list of strings, we can define a format string:

In [230]: astr = "Row index={}, Column index={}, Value={}, Value below={}, Value to the right={}"  

and apply it to each row of the array (yes, this does iterate):

In [233]: for row in Out[225]: 
     ...:     print(astr.format(*row)) 
     ...:                                                                                      
Row index=0.0, Column index=0.0, Value=1.0, Value below=6.0, Value to the right=2.0
Row index=0.0, Column index=1.0, Value=2.0, Value below=7.0, Value to the right=3.0
Row index=0.0, Column index=2.0, Value=3.0, Value below=8.0, Value to the right=4.0
Row index=0.0, Column index=3.0, Value=4.0, Value below=9.0, Value to the right=5.0
Row index=0.0, Column index=4.0, Value=5.0, Value below=10.0, Value to the right=nan
Row index=1.0, Column index=0.0, Value=6.0, Value below=nan, Value to the right=7.0
Row index=1.0, Column index=1.0, Value=7.0, Value below=nan, Value to the right=8.0
Row index=1.0, Column index=2.0, Value=8.0, Value below=nan, Value to the right=9.0
Row index=1.0, Column index=3.0, Value=9.0, Value below=nan, Value to the right=10.0
Row index=1.0, Column index=4.0, Value=10.0, Value below=nan, Value to the right=nan
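
The float conversion makes every number print as `0.0`, `1.0` and so on. If the original integer look is wanted, a small formatter can convert back while printing. This is a sketch under the assumption that all real values are whole numbers; `format_row` is a hypothetical helper.

```python
import math


def format_row(row):
    """Format one (row, col, value, below, right) record without float noise."""
    r, c, value, below, right = row

    def show(x):
        # NaN marks a missing neighbor; everything else is assumed to be whole.
        return "NaN" if math.isnan(x) else str(int(x))

    return (f"Row index={int(r)}, Column index={int(c)}, Value={show(value)}, "
            f"Value below={show(below)}, Value to the right={show(right)}")


print(format_row([0.0, 4.0, 5.0, 10.0, float("nan")]))
```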

If we omit all of those ravel calls, we get a 3d array of values:

In [234]: np.stack((I,J,arr,arr1[1:,:-1],arr1[:-1,1:]),axis=2)                                 
Out[234]: 
array([[[ 0.,  0.,  1.,  6.,  2.],
        [ 0.,  1.,  2.,  7.,  3.],
        [ 0.,  2.,  3.,  8.,  4.],
        [ 0.,  3.,  4.,  9.,  5.],
        [ 0.,  4.,  5., 10., nan]],

       [[ 1.,  0.,  6., nan,  7.],
        [ 1.,  1.,  7., nan,  8.],
        [ 1.,  2.,  8., nan,  9.],
        [ 1.,  3.,  9., nan, 10.],
        [ 1.,  4., 10., nan, nan]]])
In [235]: _.reshape(-1,5)           # to get the 2d array
 ...