Is it possible to vectorize this numpy array comparison?

I have these two numpy arrays in Python:

import numpy as np

a = np.array(sorted(np.random.rand(6)*6)) # It is sorted.
b = np.array(np.random.rand(3)*6)

Suppose the arrays are

a = array([0.27148588, 0.42828064, 2.48130785, 4.01811243, 4.79403723, 5.46398145])
b = array([0.06231266, 1.64276013, 5.22786201])

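I would like to produce an array containing, for each element of b, the index of the last element of a that is smaller than it, i.e. what I want is exactly: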

np.argmin(np.array([a<b_i for b_i in b]),1)-1

which produces array([-1, 1, 4]), meaning that b[0]<a[0], a[1]<b[1]<a[2] and a[4]<b[2]<a[5].

Is there any fast, native numpy vectorized way to do this that avoids the for loop?

To answer your specific question, i.e. how to get a vectorized equivalent of np.array([a<b_i for b_i in b]), you can take advantage of broadcasting. Here, you could use:

a[None, ...] < b[..., None]

So:

>>> a[None, ...] < b[..., None]
array([[False, False, False, False, False, False],
       [ True,  True, False, False, False, False],
       [ True,  True,  True,  True,  True, False]])

Importantly, for broadcasting:

>>> a[None, ...].shape,  b[..., None].shape
((1, 6), (3, 1))
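
Combining the broadcast comparison with the argmin step from the question gives a loop-free version of the original expression. This is a minimal sketch using the sample values from the question; the names mask and idx are only illustrative. It keeps the behavior of the original expression, including its quirk: if some b[i] is larger than every element of a, that row is all True and argmin returns 0, so the result for it is -1.

```python
import numpy as np

a = np.array([0.27148588, 0.42828064, 2.48130785,
              4.01811243, 4.79403723, 5.46398145])  # sorted
b = np.array([0.06231266, 1.64276013, 5.22786201])

# Shapes (1, 6) and (3, 1) broadcast to (3, 6); row i holds a < b[i].
mask = a[None, ...] < b[..., None]

# argmin over a boolean row finds the first False; subtracting 1 gives
# the index of the last element of a that is smaller than b[i].
idx = np.argmin(mask, axis=1) - 1
print(idx)  # values: -1, 1, 4
```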

See the official numpy docs to learn about broadcasting. Some relevant tidbits:

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when

  1. they are equal, or

  2. one of them is 1

...

When either of the dimensions compared is one, the other is used. In other words, dimensions with size 1 are stretched or “copied” to match the other.
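
To check the quoted rule against the shapes used above, here is a small sketch. The zero-filled arrays are placeholders, and the (6,) versus (3,) case is an assumed example of incompatible shapes, not something taken from the question.

```python
import numpy as np

x = np.zeros((1, 6))
y = np.zeros((3, 1))

# Trailing dimensions: 6 vs 1 -> 6. Leading dimensions: 1 vs 3 -> 3.
result_shape = np.broadcast(x, y).shape
print(result_shape)  # (3, 6)

# Shapes (6,) and (3,): the dimensions are neither equal nor 1,
# so broadcasting fails with a ValueError.
try:
    np.broadcast(np.zeros(6), np.zeros(3))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```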

Edit

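As noted in the comments under your question, a completely different approach is algorithmically much better than your own brute-force solution: take advantage of binary search, using np.searchsorted.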
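
A minimal sketch of the np.searchsorted approach, assuming a is sorted as in the question. The exact call is not shown above, so the default side='left' and the - 1 offset are my reading of what is wanted. With side='left', searchsorted returns the number of elements of a strictly smaller than each b[i], so subtracting 1 gives the index of the last such element.

```python
import numpy as np

a = np.array([0.27148588, 0.42828064, 2.48130785,
              4.01811243, 4.79403723, 5.46398145])  # must be sorted
b = np.array([0.06231266, 1.64276013, 5.22786201])

# Binary search: O(len(b) * log(len(a))) instead of building a
# len(b) x len(a) boolean matrix.
idx = np.searchsorted(a, b) - 1
print(idx)  # values: -1, 1, 4
```

One difference to be aware of: when some b[i] is larger than every element of a, this gives len(a) - 1, whereas the argmin expression from the question gives -1 in that case.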