How to produce chunks of sliding windows while iterating over a time series?
Edited:
I have a time series, say ts = [[0 0][1 1][2 2][3 3][4 4][5 5][6 6][7 7][8 8]],
which I want to split into the following two sequences:
X = [[[[0][1]][[1][2]][[2][3]]] [[[1][2]][[2][3]][[3][4]]] [[[2][3]][[3][4]][[4][5]]] [[[3][4]][[4][5]][[5][6]]] [[[4][5]][[5][6]][[6][7]]] [[[5][6]][[6][7]][[7][8]]]]
y = [[3][4][5][6][7][8]]
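X is a sequence of chunks, each made of three two-step sliding windows, and y holds the value that goes with each chunk (the last step of its last window).
My strategy is to start with the following method: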
import numpy as np

def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        prev_end_ix = end_ix - 1
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[prev_end_ix:end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return np.array(X), np.array(y)
which returns:
X =[[[0][1]] [[1][2]] [[2][3]] [[3][4]] [[4][5]] [[5][6]] [[6][7]] [[7][8]]]
y = [[1][2][3][4][5][6][7][8]]
Then I apply the following two methods to get the desired output:
def separar_uni_X(sequencia, n_passos):
    X = list()
    for i in range(len(sequencia)):
        # find the end of this pattern
        end_ix = i + n_passos
        # check if we are beyond the sequence
        if end_ix > len(sequencia):
            break
        # gather the window that starts at row i
        seq_x = sequencia[i:end_ix, :]
        X.append(seq_x)
    return np.array(X)

def separar_uni_y(sequencia, n_passos):
    y = list()
    for i in range(len(sequencia)):
        # find the end of this pattern
        end_ix = i + n_passos
        # check if we are beyond the sequence
        if end_ix > len(sequencia):
            break
        # keep only the last element of the window that starts at row i
        seq_y = sequencia[i:end_ix, :]
        y.append(seq_y[-1])
    return np.array(y)
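Problem: to get the desired output, the result of the first method has to be stored and then passed to the second one, and when the series is too long this exceeds the available memory. To get around this drawback of splitting the process into sub-steps, I used this method: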
def split_sequence_3D(sequences, n_steps, batch_size):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        prev_end_ix = end_ix - 1
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[prev_end_ix:end_ix, -1]
        sub_X, sub_y = [], []
        for j in range(batch_size):
            sub_X.append(seq_x)
            sub_y.append(seq_y)
        X.append(sub_X)
        y.append(sub_y[-1])
    return np.array(X), np.array(y)
This gives me the wrong output, for an obvious reason (the inner loop appends the same window batch_size times):
X = [[[[0][1]][[0][1]][[0][1]]] [[[1][2]][[1][2]][[1][2]]] [[[2][3]][[2][3]][[2][3]]] [[[3][4]][[3][4]][[3][4]]] [[[4][5]][[4][5]][[4][5]]] [[[5][6]][[5][6]][[5][6]]] [[[6][7]][[6][7]][[6][7]]] [[[7][8]][[7][8]][[7][8]]]]
y = [[1][2][3][4][5][6][7][8]]
I have searched extensively for alternatives but found none.
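On the memory concern, one possible sketch is a generator that builds each chunk only when it is requested, so only one chunk lives in memory at a time. The name iter_window_batches and the parameters batch and window are hypothetical and do not come from the code above. The sketch assumes y is the last-column value at the final row of the chunk's last window, which is what the desired y suggests.

```python
import numpy as np

def iter_window_batches(seq, batch, window):
    """Yield (X, y) one chunk at a time instead of storing every chunk.

    X has shape (batch, window, n_features); y is the last-column value
    at the final row of the chunk's last window.
    """
    if batch + window - 1 > len(seq):
        raise ValueError("batch and window do not fit in the sequence")
    # Number of chunks that fit completely inside the sequence
    n_batches = len(seq) - batch - window + 2
    for j in range(n_batches):
        # `batch` consecutive windows, each shifted by one row
        X = np.stack([seq[i:i + window, :-1] for i in range(j, j + batch)])
        y = seq[j + batch + window - 2, -1]
        yield X, y

# The 9-row sample series from the question
ts = np.stack([np.arange(9), np.arange(9)], axis=1)
for X, y in iter_window_batches(ts, batch=3, window=2):
    print(X[:, :, 0].tolist(), y)
```

With the 9-row ts this yields six chunks, the first being [[0, 1], [1, 2], [2, 3]] with y = 3 and the last [[5, 6], [6, 7], [7, 8]] with y = 8.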
Well, I worked really hard on your problem, which was also mine. In the end the solution turned out to be simple: make the sliding-window iterator itself slide.
# lote = batch size (windows per chunk), janela = window length
def input_3D(sequencia, lote, janela):
    if lote > len(sequencia):
        raise ValueError('Tamanho do lote maior que o conjunto dos dados')
    if janela > len(sequencia):
        raise ValueError('Tamanho da janela maior que o conjunto dos dados')
    X_, y_ = [], []
    # outer loop: j is the start of each chunk
    for j in range(len(sequencia)):
        if j + lote + janela > len(sequencia):
            break
        X, y = [], []
        # inner loop: the windows of this chunk start at j, j+1, ..., j+lote-1
        for i in range(j, j + lote):
            end_ix = i + janela
            prev_end_ix = end_ix - 1
            seq_x, seq_y = sequencia[i:end_ix, :-1], sequencia[prev_end_ix:end_ix, -1]
            X.append(np.array(seq_x))
            y.append(np.array(seq_y[-1]))
        X_.append(np.array(X))
        # keep only the target of the chunk's last window
        y_.append(np.array(y[-1]))
    return np.array(X_), np.array(y_)
Suppose your input is:
arr_x = list(range(0,100))
arr_y = list(range(0,100))
arr = np.stack([arr_x,arr_y])
arr = arr.T
Then your output will be (the listing matches lote=4 and janela=10):
[[[[ 0]
[ 1]
[ 2]
...
[ 7]
[ 8]
[ 9]]
[[ 1]
[ 2]
[ 3]
...
[ 8]
[ 9]
[10]]
[[ 2]
[ 3]
[ 4]
...
[ 9]
[10]
[11]]
[[ 3]
[ 4]
[ 5]
...
[10]
[11]
[12]]]
...
[[[86]
[87]
[88]
...
[93]
[94]
[95]]
[[87]
[88]
[89]
...
[94]
[95]
[96]]
[[88]
[89]
[90]
...
[95]
[96]
[97]]
[[89]
[90]
[91]
...
[96]
[97]
[98]]]] [12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
84 85 86 87 88 89 90 91 92 93 94 95 96 97 98]
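As a possible loop-free alternative, the same two-level sliding can be sketched with NumPy's sliding_window_view, applied once over the rows and once more over the resulting windows. This assumes NumPy 1.20 or later, where that function is available. The name batched_windows and its parameters batch and window are hypothetical. The result is a read-only view onto the input rather than a copy, so X is not materialized until something copies it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def batched_windows(seq, batch, window):
    """Return X of shape (n, batch, window, n_features) and y of shape (n,)."""
    feats = seq[:, :-1]
    # First level: windows of `window` rows. The window axis comes out
    # last, so swap it back -> (n_windows, window, n_features)
    wins = sliding_window_view(feats, window, axis=0).swapaxes(1, 2)
    # Second level: slide over the windows themselves, then move the new
    # axis to position 1 -> (n_batches, batch, window, n_features)
    X = np.moveaxis(sliding_window_view(wins, batch, axis=0), -1, 1)
    # Target: last column at the final row of each chunk's last window
    y = seq[batch + window - 2:, -1]
    return X, y

arr = np.stack([np.arange(100), np.arange(100)], axis=1)
X, y = batched_windows(arr, batch=4, window=10)
print(X.shape, y[0], y[-1])  # (88, 4, 10, 1) 12 99
```

Note that this sketch returns one more chunk than the listing above (y ends at 99 rather than 98), because input_3D breaks as soon as j+lote+janela > len(sequencia), which skips the last chunk that would still fit. On the question's 9-row ts with batch=3 and window=2 the sketch gives six chunks with y = [3 4 5 6 7 8].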