How to compare two lists and extract position, index and neighbors?
Suppose we have two lists:
list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8, 9, 10]
This is the basic structure:
        Columns
Rows    0   1   2   3   4
0       1   2   3   4   5
1       6   7   8   9   10
Each element should print one line containing some metadata (index, position and neighbors):
row/col  Print statement
0/0 "Row index=0, Column Index=0, Value=1, Value below=6, Value to the right = 2"
0/1 "Row index=0, Column Index=1, Value=2, Value below=7, Value to the right = 3"
0/2 "Row index=0, Column Index=2, Value=3, Value below=8, Value to the right = 4"
0/3 "Row index=0, Column Index=3, Value=4, Value below=9, Value to the right = 5"
0/4 "Row index=0, Column Index=4, Value=5, Value below=10, Value to the right = NaN"
1/0 "Row index=1, Column Index=0, Value=6, Value below=NaN, Value to the right = 7"
1/1 "Row index=1, Column Index=1, Value=7, Value below=NaN, Value to the right = 8"
1/2 "Row index=1, Column Index=2, Value=8, Value below=NaN, Value to the right = 9"
1/3 "Row index=1, Column Index=3, Value=9, Value below=NaN, Value to the right = 10"
1/4 "Row index=1, Column Index=4, Value=10, Value below=NaN, Value to the right = NaN"
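Is there something like a list comprehension, or any other method, to compare these two lists as quickly as possible?
I don't want to use for/while loops because they are considered very slow.
Edit: This solution will be the core of a big-data function that has to handle millions of comparisons. Using a traditional for loop would slow my function down dramatically. That is why I am looking for a faster way to do this.
I think the best way to solve your problem is to use a for loop. The code below accepts multiple rows and stores them in a dictionary.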
row1 = [1, 2, 3, 4, 5]
row2 = [6, 7, 8, 9, 10]
row3 = [11, 12, 13, 14, 15]
row4 = [16, 17, 18, 19, 20]
row5 = [21, 22, 23, 24, 25]
def create_array_dict(*args):
    # Map row index -> {column index -> value} for every row passed in
    return {row_index: {column_index: value for column_index, value in enumerate(row_list)} for row_index, row_list in enumerate(args)}
arr = create_array_dict(row1, row2, row3, row4, row5)
Once the dictionary is created, we can print the results:
def print_output(arr):
    for row, columns in arr.items():
        for column, value in columns.items():
            # dict.get returns None when the neighbour does not exist,
            # so no nested try/except is needed for the edges
            below = arr.get(row + 1, {}).get(column)
            right = columns.get(column + 1)
            print(f"Row index={row}, Column Index={column}, Value={value},Value below={below}, Value to the right={right}")

print_output(arr)
Output:
Row index=0, Column Index=0, Value=1,Value below=6, Value to the right=2
Row index=0, Column Index=1, Value=2,Value below=7, Value to the right=3
Row index=0, Column Index=2, Value=3,Value below=8, Value to the right=4
Row index=0, Column Index=3, Value=4,Value below=9, Value to the right=5
Row index=0, Column Index=4, Value=5,Value below=10, Value to the right=None
Row index=1, Column Index=0, Value=6,Value below=11, Value to the right=7
Row index=1, Column Index=1, Value=7,Value below=12, Value to the right=8
Row index=1, Column Index=2, Value=8,Value below=13, Value to the right=9
Row index=1, Column Index=3, Value=9,Value below=14, Value to the right=10
Row index=1, Column Index=4, Value=10,Value below=15, Value to the right=None
Row index=2, Column Index=0, Value=11,Value below=16, Value to the right=12
Row index=2, Column Index=1, Value=12,Value below=17, Value to the right=13
Row index=2, Column Index=2, Value=13,Value below=18, Value to the right=14
Row index=2, Column Index=3, Value=14,Value below=19, Value to the right=15
Row index=2, Column Index=4, Value=15,Value below=20, Value to the right=None
Row index=3, Column Index=0, Value=16,Value below=21, Value to the right=17
Row index=3, Column Index=1, Value=17,Value below=22, Value to the right=18
Row index=3, Column Index=2, Value=18,Value below=23, Value to the right=19
Row index=3, Column Index=3, Value=19,Value below=24, Value to the right=20
Row index=3, Column Index=4, Value=20,Value below=25, Value to the right=None
Row index=4, Column Index=0, Value=21,Value below=None, Value to the right=22
Row index=4, Column Index=1, Value=22,Value below=None, Value to the right=23
Row index=4, Column Index=2, Value=23,Value below=None, Value to the right=24
Row index=4, Column Index=3, Value=24,Value below=None, Value to the right=25
Row index=4, Column Index=4, Value=25,Value below=None, Value to the right=None
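Since the question asks about list comprehensions, the same neighbour lookup can also be sketched over plain nested lists without the intermediate dictionary. This is only an illustration: `describe_grid` is a hypothetical name, it assumes every row has the same length, and it still iterates in Python, so it is not offered as a faster alternative.

```python
def describe_grid(rows):
    """Return one description string per cell of a rectangular list of lists."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    template = "Row index={}, Column Index={}, Value={}, Value below={}, Value to the right={}"
    return [
        template.format(
            r, c, rows[r][c],
            rows[r + 1][c] if r + 1 < n_rows else None,  # no row below the last row
            rows[r][c + 1] if c + 1 < n_cols else None,  # nothing right of the last column
        )
        for r in range(n_rows)
        for c in range(n_cols)
    ]

lines = describe_grid([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(lines[0])
print(lines[-1])
```

The bounds checks replace the exception handling: an index past the edge is never evaluated, so the missing neighbour is simply reported as None.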
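If you want a list of strings, then you have to iterate over every element. String formatting like this works on scalar elements, not on whole arrays.
The values themselves can be generated with whole-array operations. But the result will be an array of some form, not your list of strings.
For example (with numpy imported as np):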
In [198]: list1 = [1, 2, 3, 4, 5]
     ...: list2 = [6, 7, 8, 9, 10]
In [199]: arr = np.array([list1,list2])
In [200]: arr
Out[200]:
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])
A simple way to iterate over the array while also showing the indices is:
In [201]: list(np.ndenumerate(arr))
Out[201]:
[((0, 0), 1),
 ((0, 1), 2),
 ((0, 2), 3),
 ((0, 3), 4),
 ((0, 4), 5),
 ((1, 0), 6),
 ((1, 1), 7),
 ((1, 2), 8),
 ((1, 3), 9),
 ((1, 4), 10)]
This is still an iteration. We can get the indices in array form with:
In [215]: np.indices(arr.shape)
Out[215]:
array([[[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1]],

       [[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]])
In [216]: I,J = np.indices(arr.shape)
and stack those indices with the values:
In [218]: np.stack((I.ravel(),J.ravel(),arr.ravel()),axis=1)
Out[218]:
array([[ 0,  0,  1],
       [ 0,  1,  2],
       [ 0,  2,  3],
       [ 0,  3,  4],
       [ 0,  4,  5],
       [ 1,  0,  6],
       [ 1,  1,  7],
       [ 1,  2,  8],
       [ 1,  3,  9],
       [ 1,  4, 10]])
To get the values below and to the right, we can build a padded array:
In [223]: arr1 = np.pad(arr.astype(float),[(0,1),(0,1)],mode='constant',constant_values=np.nan)
In [224]: arr1
Out[224]:
array([[ 1.,  2.,  3.,  4.,  5., nan],
       [ 6.,  7.,  8.,  9., 10., nan],
       [nan, nan, nan, nan, nan, nan]])
Note that I had to convert the array to float so that it can hold the float np.nan values.
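A small sketch of why the conversion is needed: an integer array has no way to represent NaN, so the assignment is rejected, while a float copy accepts it. The variable names here are illustrative only.

```python
import numpy as np

int_arr = np.array([1, 2, 3])
try:
    int_arr[0] = np.nan            # integers cannot represent NaN
    stored_nan = True
except ValueError:
    stored_nan = False             # NumPy refuses the assignment

float_arr = int_arr.astype(float)  # astype returns a converted copy
float_arr[0] = np.nan              # fine: NaN is a valid float value

print(stored_nan)
print(float_arr)
```

The price is that every value, including the row and column indices stacked later, ends up as a float.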
Combining everything:
In [225]: np.stack((I.ravel(),J.ravel(),arr.ravel(),arr1[1:,:-1].ravel(),arr1[:-1,1:].ravel()),
     ...:          axis=1)
Out[225]:
array([[ 0.,  0.,  1.,  6.,  2.],
       [ 0.,  1.,  2.,  7.,  3.],
       [ 0.,  2.,  3.,  8.,  4.],
       [ 0.,  3.,  4.,  9.,  5.],
       [ 0.,  4.,  5., 10., nan],
       [ 1.,  0.,  6., nan,  7.],
       [ 1.,  1.,  7., nan,  8.],
       [ 1.,  2.,  8., nan,  9.],
       [ 1.,  3.,  9., nan, 10.],
       [ 1.,  4., 10., nan, nan]])
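The steps so far can be collected into one function. This is a sketch under the assumption that the input is a rectangular 2D collection of numbers; `neighbor_table` is a hypothetical name, not something NumPy provides.

```python
import numpy as np

def neighbor_table(rows):
    """Return an (n_cells, 5) float array: row, column, value, value below, value right."""
    arr = np.asarray(rows)
    I, J = np.indices(arr.shape)
    # Pad one row of NaN at the bottom and one column of NaN on the right
    padded = np.pad(arr.astype(float), [(0, 1), (0, 1)],
                    mode="constant", constant_values=np.nan)
    below = padded[1:, :-1]   # every cell shifted up by one row
    right = padded[:-1, 1:]   # every cell shifted left by one column
    return np.stack((I.ravel(), J.ravel(), arr.ravel(),
                     below.ravel(), right.ravel()), axis=1)

table = neighbor_table([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(table.shape)   # (10, 5)
```

No Python-level loop runs over the cells here; the shifting is done entirely by slicing the padded array.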
To get your list of strings, we can define a format string:
In [230]: astr = "Row index={}, Column index={}, Value={}, Value below={}, Value to the right={}"
and apply it to each row (yes, this does iterate):
In [233]: for row in Out[225]:
     ...:     print(astr.format(*row))
     ...:
Row index=0.0, Column index=0.0, Value=1.0, Value below=6.0, Value to the right=2.0
Row index=0.0, Column index=1.0, Value=2.0, Value below=7.0, Value to the right=3.0
Row index=0.0, Column index=2.0, Value=3.0, Value below=8.0, Value to the right=4.0
Row index=0.0, Column index=3.0, Value=4.0, Value below=9.0, Value to the right=5.0
Row index=0.0, Column index=4.0, Value=5.0, Value below=10.0, Value to the right=nan
Row index=1.0, Column index=0.0, Value=6.0, Value below=nan, Value to the right=7.0
Row index=1.0, Column index=1.0, Value=7.0, Value below=nan, Value to the right=8.0
Row index=1.0, Column index=2.0, Value=8.0, Value below=nan, Value to the right=9.0
Row index=1.0, Column index=3.0, Value=9.0, Value below=nan, Value to the right=10.0
Row index=1.0, Column index=4.0, Value=10.0, Value below=nan, Value to the right=nan
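Because the stacked array is float, the indices print as 0.0 and 1.0. If the data is known to be integer-valued, a small helper can restore the format asked for in the question. This is a sketch; `fmt` is a hypothetical helper, and the two sample rows are copied from Out[225].

```python
import numpy as np

def fmt(x):
    """Render NaN as 'NaN' and whole numbers without the trailing '.0'."""
    return "NaN" if np.isnan(x) else str(int(x))

astr = "Row index={}, Column index={}, Value={}, Value below={}, Value to the right={}"

# Two rows taken from the combined array above
table = np.array([[0., 4., 5., 10., np.nan],
                  [1., 0., 6., np.nan, 7.]])

lines = [astr.format(*(fmt(x) for x in row)) for row in table]
for line in lines:
    print(line)
```

The string building is still a per-row loop; only the numeric work before it is vectorized.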
If we omit all those ravel calls, we get a 3d array of values:
In [234]: np.stack((I,J,arr,arr1[1:,:-1],arr1[:-1,1:]),axis=2)
Out[234]:
array([[[ 0.,  0.,  1.,  6.,  2.],
        [ 0.,  1.,  2.,  7.,  3.],
        [ 0.,  2.,  3.,  8.,  4.],
        [ 0.,  3.,  4.,  9.,  5.],
        [ 0.,  4.,  5., 10., nan]],

       [[ 1.,  0.,  6., nan,  7.],
        [ 1.,  1.,  7., nan,  8.],
        [ 1.,  2.,  8., nan,  9.],
        [ 1.,  3.,  9., nan, 10.],
        [ 1.,  4., 10., nan, nan]]])
In [235]: _.reshape(-1,5) # to get the 2d array
...
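As a rough illustration of how the two layouts relate, the 3d form can be indexed directly by (row, column), and reshaping it yields the same records as the raveled version. The names `cube` and `flat` are introduced here only for the example.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
I, J = np.indices(arr.shape)
arr1 = np.pad(arr.astype(float), [(0, 1), (0, 1)],
              mode="constant", constant_values=np.nan)

# One 5-element record per cell, addressed as cube[row, col]
cube = np.stack((I, J, arr, arr1[1:, :-1], arr1[:-1, 1:]), axis=2)
flat = cube.reshape(-1, 5)   # one record per line, row-major order

print(cube.shape, flat.shape)
print(cube[0, 4])            # record for row 0, column 4
print(flat[4])               # the same record in the flattened table
```

Keeping the 3d form is convenient when a record has to be looked up by position; the 2d form is the one to iterate over for printing.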