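Merging two numpy arrays with sequential rows
I have two numpy arrays and want to merge them according to the following rules, without using any for loops.
- Take the first n rows from the first array.
- Add the first m rows of the second array.
- Add the rows between n and 2n of the first array.
- Add the rows between m and 2m of the second array.
.....
- Add the last m rows of the second array.
For example, suppose I have two arrays and n=2, m=3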
x = np.random.randint(10, size=(10, 6))
y = np.random.randint(20, size=(12, 6))
[[5 0 2 2 6 1]
[4 8 9 2 7 2]
[5 5 0 5 3 0]
[2 1 4 7 9 4]
[8 1 1 9 2 8]
[4 1 1 0 1 1]
[2 9 3 5 7 9]
[3 6 6 6 0 4]
[4 4 7 3 7 9]
[7 3 7 1 5 2]]
[[ 3 15 3 8 12 12]
[19 12 13 0 19 16]
[11 2 18 16 9 19]
[19 15 15 11 13 2]
[19 14 1 6 13 17]
[19 14 19 14 13 3]
[ 0 1 13 0 19 10]
[19 13 19 5 16 13]
[12 4 15 11 12 17]
[ 4 19 17 2 11 12]
[ 9 12 10 9 15 3]
[13 7 2 5 13 10]]
The expected output is
[[5 0 2 2 6 1]
[4 8 9 2 7 2]
[ 3 15 3 8 12 12]
[19 12 13 0 19 16]
[11 2 18 16 9 19]
[5 5 0 5 3 0]
[2 1 4 7 9 4]
[19 15 15 11 13 2]
[19 14 1 6 13 17]
[19 14 19 14 13 3]
[8 1 1 9 2 8]
[4 1 1 0 1 1]
[ 0 1 13 0 19 10]
[19 13 19 5 16 13]
[12 4 15 11 12 17]
[2 9 3 5 7 9]
[3 6 6 6 0 4]
[ 4 19 17 2 11 12]
[ 9 12 10 9 15 3]
[13 7 2 5 13 10]
[4 4 7 3 7 9]
[7 3 7 1 5 2]]
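You can create a function that splits an array into consecutive slices (chunks). Then chunk both arrays and interleave the chunks using the itertools.zip_longest function. Finally, wrap the output in np.vstack to get the new array.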
import numpy as np
from itertools import zip_longest
from math import ceil

def chunk(arr, n):
    """Split an array `arr` into n-sized chunks along its first axis"""
    for i in range(ceil(len(arr) / n)):
        ix = slice(i * n, (i + 1) * n)
        yield arr[ix]

def chunk_stack(a, b, n, m):
    """Splits the arrays `a` and `b` into `n` and `m` sized chunks.
    Returns an array of the interleaved chunks.
    """
    chunker_a = chunk(a, n)
    chunker_b = chunk(b, m)
    arr = []
    # zip_longest pads the shorter generator with None, which is skipped below
    for cha, chb in zip_longest(chunker_a, chunker_b):
        if cha is not None:
            arr.append(cha)
        if chb is not None:
            arr.append(chb)
    return np.vstack(arr)
Testing it on your example arrays:
x = np.array(
[[5, 0, 2, 2, 6, 1],
[4, 8, 9, 2, 7, 2],
[5, 5, 0, 5, 3, 0],
[2, 1, 4, 7, 9, 4],
[8, 1, 1, 9, 2, 8],
[4, 1, 1, 0, 1, 1],
[2, 9, 3, 5, 7, 9],
[3, 6, 6, 6, 0, 4],
[4, 4, 7, 3, 7, 9],
[7, 3, 7, 1, 5, 2]])
y = np.array(
[[3, 15, 3, 8, 12, 12],
[19, 12, 13, 0, 19, 16],
[11, 2, 18, 16, 9, 19],
[19, 15, 15, 11, 13, 2],
[19, 14, 1, 6, 13, 17],
[19, 14, 19, 14, 13, 3],
[0, 1, 13, 0, 19, 10],
[19, 13, 19, 5, 16, 13],
[12, 4, 15, 11, 12, 17],
[4, 19, 17, 2, 11, 12],
[9, 12, 10, 9, 15, 3],
[13, 7, 2, 5, 13, 10]])
chunk_stack(x, y, 2, 3)
# returns:
array([[ 5, 0, 2, 2, 6, 1],
[ 4, 8, 9, 2, 7, 2],
[ 3, 15, 3, 8, 12, 12],
[19, 12, 13, 0, 19, 16],
[11, 2, 18, 16, 9, 19],
[ 5, 5, 0, 5, 3, 0],
[ 2, 1, 4, 7, 9, 4],
[19, 15, 15, 11, 13, 2],
[19, 14, 1, 6, 13, 17],
[19, 14, 19, 14, 13, 3],
[ 8, 1, 1, 9, 2, 8],
[ 4, 1, 1, 0, 1, 1],
[ 0, 1, 13, 0, 19, 10],
[19, 13, 19, 5, 16, 13],
[12, 4, 15, 11, 12, 17],
[ 2, 9, 3, 5, 7, 9],
[ 3, 6, 6, 6, 0, 4],
[ 4, 19, 17, 2, 11, 12],
[ 9, 12, 10, 9, 15, 3],
[13, 7, 2, 5, 13, 10],
[ 4, 4, 7, 3, 7, 9],
[ 7, 3, 7, 1, 5, 2]])
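When the two arrays yield different numbers of chunks, zip_longest keeps going until the longer one is exhausted. A minimal sketch of that behaviour follows; the tiny single-column arrays `a` and `b` are made up for illustration, and the chunking is done with list comprehensions instead of the generator above.

```python
import numpy as np
from itertools import zip_longest

# Hypothetical inputs: `a` gives three chunks of n=2 rows, `b` gives one chunk of m=3 rows
n, m = 2, 3
a = np.arange(6).reshape(6, 1)          # rows 0..5
b = np.arange(100, 103).reshape(3, 1)   # rows 100..102

chunks_a = [a[i:i + n] for i in range(0, len(a), n)]
chunks_b = [b[i:i + m] for i in range(0, len(b), m)]

# zip_longest fills the missing chunks of `b` with None; those are dropped
pieces = [c for pair in zip_longest(chunks_a, chunks_b) for c in pair if c is not None]
merged = np.vstack(pieces)

# Row order: 0, 1, 100, 101, 102, 2, 3, 4, 5
print(merged.ravel())
```

Once `b` runs out, the leftover chunks of `a` are simply appended in order.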
You can create an output array and place the inputs into it by index. The output always has the following shape:
output = np.empty((x.shape[0] + y.shape[0], x.shape[1]), dtype=x.dtype)
You can generate the output indices like this:
idx = (np.arange(0, output.shape[0] - n + 1, m + n)[:, None] + np.arange(n)).ravel()
idy = (np.arange(n, output.shape[0] - m + 1, m + n)[:, None] + np.arange(m)).ravel()
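This creates a column vector of start indices and adds the offsets 0 to n-1 (or 0 to m-1) to each one, marking every output row that an input row goes into. You can then assign the inputs directly: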
output[idx, :] = x
output[idy, :] = y
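Put together, the index approach might look like the sketch below. The function name `interleave_by_index` and the test arrays are made up for illustration. The sketch assumes that both arrays split into whole chunks and that x has either the same number of chunks as y or exactly one more; other cases are not handled here.

```python
import numpy as np

def interleave_by_index(x, y, n, m):
    """Sketch only: assumes len(x) % n == 0, len(y) % m == 0, and that x has
    as many chunks as y or exactly one more (hypothetical helper)."""
    total = x.shape[0] + y.shape[0]
    output = np.empty((total, x.shape[1]), dtype=x.dtype)
    # Start row of every x-chunk / y-chunk, broadcast against offsets in the chunk
    idx = (np.arange(0, total - n + 1, m + n)[:, None] + np.arange(n)).ravel()
    idy = (np.arange(n, total - m + 1, m + n)[:, None] + np.arange(m)).ravel()
    output[idx] = x
    output[idy] = y
    return output

# Made-up data with the same shapes as in the question
x = np.arange(10 * 6).reshape(10, 6)
y = 1000 + np.arange(12 * 6).reshape(12, 6)
out = interleave_by_index(x, y, 2, 3)
print(out[:7, 0])  # first column: two rows of x, three of y, two of x
```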
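We reshape x and y so that their rows are grouped into blocks of n and m.
Then we stack horizontally, so that the n-blocks and m-blocks form an alternating sequence.
Then, whatever is left over from x and y, we append.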
x = np.random.randint(10, size=(10, 6))
y = np.random.randint(20, size=(12, 6))
n, m = 2, 3
output = np.empty((x.shape[0] + y.shape[0], x.shape[1]), dtype=x.dtype)
x_dim_1 = x.shape[0] // n # 5
y_dim_1 = y.shape[0] // m # 4
common_dim = min(x_dim_1, y_dim_1) # 4
x_1 = x[:common_dim * n].reshape(common_dim, n, -1) # (4, 2, 6)
y_1 = y[:common_dim * m].reshape(common_dim, m, -1) # (4, 3, 6)
# We stack horizontally x_1, y_1 to (4, 5, 6) then convert 4, 5 -> 4*5
# make n's and m's alternate
assign_til = common_dim * (n + m)
output[:assign_til] = np.hstack([x_1, y_1]).reshape(assign_til, x.shape[1])
# Remaining x's and y's
r_x = x[common_dim * n:]
r_y = y[common_dim * m:]
# Next entry in output will be of r_x, since alternate
# Choose n entries or whatever remaining and append those
rem = min(r_x.shape[0], n)
output[assign_til:assign_til + rem] = r_x[:rem]
assign_til += rem
# Next append all remaining y's
output[assign_til:] = r_y
assign_til += r_y.shape[0]
# If x_dim_1 > y_dim_1, r_x may still hold rows beyond the first n; append them
output[assign_til:] = r_x[rem:]
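The reshape-and-hstack step is easier to follow on a tiny case. The sketch below uses made-up single-column arrays that split evenly into two blocks each, so there is no remainder to handle.

```python
import numpy as np

n, m = 2, 3
x = np.arange(4).reshape(4, 1)        # rows 0..3   -> two blocks of n=2
y = np.arange(10, 16).reshape(6, 1)   # rows 10..15 -> two blocks of m=3

x_blocks = x.reshape(2, n, -1)        # shape (2, 2, 1)
y_blocks = y.reshape(2, m, -1)        # shape (2, 3, 1)

# For arrays with two or more dimensions, np.hstack concatenates along axis 1,
# so each x-block is followed by the matching y-block
paired = np.hstack([x_blocks, y_blocks])   # shape (2, 5, 1)
merged = paired.reshape(-1, x.shape[1])    # shape (10, 1)

# Row order: 0, 1, 10, 11, 12, 2, 3, 13, 14, 15
print(merged.ravel())
```

Flattening the first two axes is what turns the per-block pairing into the alternating row order.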