Slice an array into segments
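Suppose I have an array [1,2,3,4,5,6,7,8] that is made up of two samples, [1,2,3,4] and [5,6,7,8]. For each sample, I want to take a sliding window of size n. If there are not enough elements left, the window is padded with the sample's last element. Each row of the return value should be the window that starts at the corresponding element.
For example, if n=3, the result should be: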
[[1,2,3],
[2,3,4],
[3,4,4],
[4,4,4],
[5,6,7],
[6,7,8],
[7,8,8],
[8,8,8]]
How can I do this with efficient slicing instead of a for loop? Thanks.
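To pin down the behaviour being asked for, here is a plain-loop reference version. It is only a sketch for checking other solutions against, not the efficient approach the question wants, and `padded_windows` is a hypothetical name that does not appear in the thread.

```python
def padded_windows(samples, n):
    """Reference implementation with explicit loops (hypothetical helper)."""
    result = []
    for row in samples:
        for start in range(len(row)):
            window = row[start:start + n]
            # pad short windows with the last element of the sample
            window = window + [row[-1]] * (n - len(window))
            result.append(window)
    return result


expected = padded_windows([[1, 2, 3, 4], [5, 6, 7, 8]], 3)
print(expected)
```

Any vectorized answer should produce the same eight rows for n=3.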
A Python list approach (with n = 3, as in the question):
In [200]: n = 3
In [201]: order = [1,3,2,3,5,8]
In [202]: samples = [[1,2,3,4],[5,6,7,8]]
Extend the samples to take care of the padding:
In [203]: samples = [row+([row[-1]]*n) for row in samples]
In [204]: samples
Out[204]: [[1, 2, 3, 4, 4, 4, 4], [5, 6, 7, 8, 8, 8, 8]]
Define a function:
def foo(i, samples):
    for row in samples:
        try:
            j = row.index(i)
        except ValueError:
            continue
        return row[j:j+n]
In [207]: foo(3,samples)
Out[207]: [3, 4, 4]
In [208]: foo(9,samples) # non-found case isn't handled well
For all elements of order:
In [209]: [foo(i,samples) for i in order]
Out[209]: [[1, 2, 3], [3, 4, 4], [2, 3, 4], [3, 4, 4], [5, 6, 7], [8, 8, 8]]
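The not-found case flagged above could be handled along these lines. This is a minimal sketch rather than part of the original answer: `window_from` is a hypothetical name, it takes the window size as an argument instead of relying on a global `n`, it pads on the fly so the samples need no pre-padding, and it returns None explicitly when the value is not in any sample.

```python
def window_from(value, samples, n):
    """Return the n-long window starting at the first occurrence of value.

    Hypothetical helper: pads with the row's last element and returns
    None when value is not found in any row.
    """
    for row in samples:
        if value in row:
            j = row.index(value)
            window = row[j:j + n]
            # pad with the last element of the row if the window is short
            return window + [row[-1]] * (n - len(window))
    return None


samples = [[1, 2, 3, 4], [5, 6, 7, 8]]
order = [1, 3, 2, 3, 5, 8]
windows = [window_from(i, samples, 3) for i in order]
print(windows)
print(window_from(9, samples, 3))  # value not present in any sample
```

Like `foo`, this looks values up with `list.index`, so it assumes the values are unique within the samples.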
A similar approach to @hpaulj's, using some NumPy built-ins:
import numpy as np
samples = [[1,2,3,4],[5,6,7,8]]
ws = 3  # window size
# pad each sample with ws-1 copies of its last element
samples = [s + [s[-1]]*(ws-1) for s in samples]
# rolling window function for arrays
def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1]-window+1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
# concatenate the per-sample window lists into one flat list
result = sum([rolling_window(np.array(s), ws).tolist() for s in samples], [])
result
[[1, 2, 3],
[2, 3, 4],
[3, 4, 4],
[4, 4, 4],
[5, 6, 7],
[6, 7, 8],
[7, 8, 8],
[8, 8, 8]]
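On NumPy 1.20 or later, the same idea can probably be written without hand-computing strides. The sketch below swaps in `np.pad` with `mode="edge"` and `numpy.lib.stride_tricks.sliding_window_view` in place of the list padding and `as_strided` used in the answer above; it assumes all samples have the same length so they fit in one 2-D array.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

samples = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
ws = 3  # window size

# repeat the last element of each row ws-1 times (edge padding on the right)
padded = np.pad(samples, ((0, 0), (0, ws - 1)), mode="edge")

# windows along the last axis: shape (n_samples, row_length, ws)
windows = sliding_window_view(padded, ws, axis=1)

# merge the sample axis and the position axis: one window per row
result = windows.reshape(-1, ws)
print(result.tolist())
```

`sliding_window_view` returns a read-only view, and the final `reshape` makes a copy, so `result` is safe to modify.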
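I have a simple one-liner:
import numpy as np
samples = np.array([[1,2,3,4],[5,6,7,8]])
n,d = samples.shape
ws = 3
result = samples[:,np.minimum(np.arange(d)[:,None]+np.arange(ws)[None,:],d-1)]
No loops, only broadcasting, which probably makes this the most efficient of the approaches here. The output has shape (n, d, ws) rather than the flat (n*d, ws) layout you asked for, but that is easy to fix with a simple np.reshape.
The output is: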
[[[1 2 3]
[2 3 4]
[3 4 4]
[4 4 4]]
[[5 6 7]
[6 7 8]
[7 8 8]
[8 8 8]]]
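For completeness, here is a sketch of the reshape step together with the index array spelled out. The intermediate names `idx` and `flat` are introduced here for illustration and are not in the original one-liner.

```python
import numpy as np

samples = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
n, d = samples.shape
ws = 3

# idx[i, k] = min(i + k, d - 1): start at column i, clamp at the last column
idx = np.minimum(np.arange(d)[:, None] + np.arange(ws)[None, :], d - 1)

result = samples[:, idx]          # shape (n, d, ws)
flat = result.reshape(n * d, ws)  # one window per row, shape (n*d, ws)
print(flat)
```

Clamping the column index at `d - 1` is what produces the padding with the last element, so no explicit padding step is needed.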