Why does a 50-50 train/test split work best for a data-set of 178 observations with this neural network?
From what I have read, a split of roughly 80% training / 20% validation data is supposed to be close to optimal. As the test set gets larger, the variance of the validation results should fall, at the cost of less effective training (lower validation accuracy).
I am therefore confused by the results below, which seem to show the best accuracy and low variance for TEST_SIZE=0.5 (each test size was run multiple times and one trial was chosen as representative; a sketch of how these repeated trials could be scripted appears after the code snippets below).
With TEST_SIZE=0.1, this should work well due to the larger training set but with larger variance (accuracy varied between 16% and 50% across 5 trials).
Epoch 0, Loss 0.021541, Targets [ 1. 0. 0.], Outputs [ 0.979 0.011 0.01 ], Inputs [ 0.086 0.052 0.08 0.062 0.101 0.093 0.107 0.058 0.108 0.08 0.084 0.115 0.104]
Epoch 100, Loss 0.001154, Targets [ 0. 0. 1.], Outputs [ 0. 0.001 0.999], Inputs [ 0.083 0.099 0.084 0.079 0.085 0.061 0.02 0.103 0.038 0.083 0.078 0.053 0.067]
Epoch 200, Loss 0.000015, Targets [ 0. 0. 1.], Outputs [ 0. 0. 1.], Inputs [ 0.076 0.092 0.087 0.107 0.077 0.063 0.02 0.13 0.054 0.106 0.054 0.051 0.086]
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
50.0% overall accuracy for validation set.
With TEST_SIZE=0.5, this should be merely OK (accuracy somewhere between the other two cases), yet for some reason accuracy only varied between 92% and 97% across 5 trials.
Epoch 0, Loss 0.547218, Targets [ 1. 0. 0.], Outputs [ 0.579 0.087 0.334], Inputs [ 0.106 0.08 0.142 0.133 0.129 0.115 0.127 0.13 0.12 0.068 0.123 0.126 0.11 ]
Epoch 100, Loss 0.002716, Targets [ 0. 1. 0.], Outputs [ 0.003 0.997 0. ], Inputs [ 0.09 0.059 0.097 0.114 0.088 0.108 0.102 0.144 0.125 0.036 0.186 0.113 0.054]
Epoch 200, Loss 0.002874, Targets [ 0. 1. 0.], Outputs [ 0.003 0.997 0. ], Inputs [ 0.102 0.067 0.088 0.109 0.088 0.097 0.091 0.088 0.092 0.056 0.113 0.141 0.089]
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 0
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 0
Target Class 1, Predicted Class 1
97.75280898876404% overall accuracy for validation set.
With TEST_SIZE=0.9, this should generalize poorly because of the small training sample (accuracy varied between 38% and 54% across 5 trials).
Epoch 0, Loss 2.448474, Targets [ 0. 0. 1.], Outputs [ 0.707 0.206 0.086], Inputs [ 0.229 0.421 0.266 0.267 0.223 0.15 0.057 0.33 0.134 0.148 0.191 0.12 0.24 ]
Epoch 100, Loss 0.017506, Targets [ 1. 0. 0.], Outputs [ 0.983 0.017 0. ], Inputs [ 0.252 0.162 0.274 0.255 0.241 0.275 0.314 0.175 0.278 0.135 0.286 0.36 0.281]
Epoch 200, Loss 0.001819, Targets [ 0. 0. 1.], Outputs [ 0.002 0. 0.998], Inputs [ 0.245 0.348 0.248 0.274 0.284 0.153 0.167 0.212 0.191 0.362 0.145 0.125 0.183]
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 2, Predicted Class 2
Target Class 0, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 0, Predicted Class 1
Target Class 1, Predicted Class 1
Target Class 2, Predicted Class 2
64.59627329192547% overall accuracy for validation set.
Snippets of the key functions are below.
Importing and splitting the dataset
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.model_selection import train_test_split

def readInput(filename, delimiter, inputlen, outputlen, categories, test_size):
    def onehot(num, categories):
        # class labels in the file are 1-based, so shift down to a 0-based index
        arr = np.zeros(categories)
        arr[int(num[0]) - 1] = 1
        return arr

    with open(filename) as file:
        inputs = list()
        outputs = list()
        for line in file:
            assert len(line.split(delimiter)) == inputlen + outputlen
            values = list(map(float, line.split(delimiter)))
            outputs.append(onehot(values[:outputlen], categories))
            inputs.append(values[outputlen:outputlen + inputlen])
    inputs = np.array(inputs)
    outputs = np.array(outputs)
    inputs_train, inputs_val, outputs_train, outputs_val = train_test_split(
        inputs, outputs, test_size=test_size)
    assert len(inputs_train) > 0
    assert len(inputs_val) > 0
    # note: training and validation inputs are normalized independently, column-wise
    return (normalize(inputs_train, axis=0), outputs_train,
            normalize(inputs_val, axis=0), outputs_val)
Some parameters
import numpy as np
import helper
FILE_NAME = 'data2.csv'
DATA_DELIM = ','
ACTIVATION_FUNC = 'tanh'
TESTING_FREQ = 100
EPOCHS = 200
LEARNING_RATE = 0.2
TEST_SIZE = 0.9
INPUT_SIZE = 13
HIDDEN_LAYERS = [5]
OUTPUT_SIZE = 3
Main program flow
# methods of the Classifier class (snippet)
def step(self, x, targets, lrate):
    # one stochastic gradient step on a single sample
    self.forward_propagate(x)
    self.backpropagate_errors(targets)
    self.adjust_weights(x, lrate)

def test(self, epoch, x, target):
    predictions = self.forward_propagate(x)
    print('Epoch %5i, Loss %2f, Targets %s, Outputs %s, Inputs %s'
          % (epoch, helper.crossentropy(target, predictions), target, predictions, x))

def train(self, inputs, targets, epochs, testfreq, lrate):
    xindices = [i for i in range(len(inputs))]
    for epoch in range(epochs):
        np.random.shuffle(xindices)
        if epoch % testfreq == 0:
            self.test(epoch, inputs[xindices[0]], targets[xindices[0]])
        for i in xindices:
            self.step(inputs[i], targets[i], lrate)
    self.test(epochs, inputs[xindices[0]], targets[xindices[0]])

def validate(self, inputs, targets):
    correct = 0
    targets = np.argmax(targets, axis=1)
    for i in range(len(inputs)):
        prediction = np.argmax(self.forward_propagate(inputs[i]))
        if prediction == targets[i]: correct += 1
        print('Target Class %s, Predicted Class %s' % (targets[i], prediction))
    print('%s%% overall accuracy for validation set.' % (correct / len(inputs) * 100))

np.random.seed()
inputs_train, outputs_train, inputs_val, outputs_val = helper.readInput(
    FILE_NAME, DATA_DELIM, inputlen=INPUT_SIZE, outputlen=1,
    categories=OUTPUT_SIZE, test_size=TEST_SIZE)
nn = Classifier([INPUT_SIZE] + HIDDEN_LAYERS + [OUTPUT_SIZE], ACTIVATION_FUNC)
nn.train(inputs_train, outputs_train, EPOCHS, TESTING_FREQ, LEARNING_RATE)
nn.validate(inputs_val, outputs_val)
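For reference, here is a minimal sketch of how the repeated trials described above could be scripted rather than run by hand. It assumes the Classifier class, the helper module and the constants shown above, and additionally assumes validate() is changed to return the accuracy as a fraction instead of only printing it:

def accuracy_over_trials(test_size, trials=5):
    # Re-split, re-train and re-validate several times for one test_size,
    # then summarize the spread of validation accuracy across trials.
    accuracies = []
    for _ in range(trials):
        tr_x, tr_y, va_x, va_y = helper.readInput(FILE_NAME, DATA_DELIM,
                                                  inputlen=INPUT_SIZE, outputlen=1,
                                                  categories=OUTPUT_SIZE,
                                                  test_size=test_size)
        net = Classifier([INPUT_SIZE] + HIDDEN_LAYERS + [OUTPUT_SIZE], ACTIVATION_FUNC)
        net.train(tr_x, tr_y, EPOCHS, TESTING_FREQ, LEARNING_RATE)
        accuracies.append(net.validate(va_x, va_y))  # assumed to return a fraction
    return np.mean(accuracies), np.std(accuracies)

for ts in (0.1, 0.5, 0.9):
    mean_acc, std_acc = accuracy_over_trials(ts)
    print('TEST_SIZE=%.1f: mean %.1f%%, std %.1f%%' % (ts, mean_acc * 100, std_acc * 100))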
1) The sample is very small. You have 13 dimensions and only 178 samples. Since you have to fit all the parameters of the network (a 13-5-3 net with biases already has 13·5+5 + 5·3+3 = 88 parameters, roughly one per sample in the 89-sample training half of a 50-50 split), there is not enough data no matter how you split it. Your model is too complex for the amount of data you have, which leads to overfitting. That means the model does not generalize well and, in general, will give you neither good nor stable results.
2) There will always be some discrepancy between the training set and the test set. In your case, because the sample is so small, whether the statistics of the test data agree with those of the training data is mostly a matter of chance.
3) With a 90-10 split your test set contains only 18 samples. You cannot conclude much from 18 trials; it hardly qualifies as "statistics". Try a different split and your results will change as well (you have already observed this, as noted above regarding robustness).
4) Always compare your classifier against the performance of a random classifier. With your 3 classes you should at least do better than 33% (a baseline of this kind is included in the sketch after this list).
5) Look into cross-validation and leave-one-out (see the sketch after this list).
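A minimal sketch of points 4) and 5) using scikit-learn. It assumes data2.csv stores the class label in the first column followed by the 13 features (the layout readInput above expects), and uses a plain logistic-regression model as a stand-in for the network:

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = np.loadtxt('data2.csv', delimiter=',')      # assumed layout: label, 13 features
y, X = data[:, 0].astype(int), data[:, 1:]

# 4) random-classifier baseline: predicts one of the 3 classes uniformly at random,
#    so its expected accuracy is about 33%
baseline = DummyClassifier(strategy='uniform', random_state=0)
print('random baseline accuracy: %.3f' % cross_val_score(baseline, X, y, cv=5).mean())

# 5) stratified 5-fold cross-validation, with scaling fitted inside each training fold
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y,
                         cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0))
print('5-fold CV accuracy: %.3f +/- %.3f' % (scores.mean(), scores.std()))

# leave-one-out: 178 folds, each holding out a single sample
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print('leave-one-out accuracy: %.3f' % loo_scores.mean())

Leave-one-out is attractive for a dataset this small because each model still trains on 177 of the 178 samples, at the price of fitting 178 models.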