每次我在 R 上重新训练神经网络时都会得到不同的时间序列预测值
Getting different predicted values of time series every time I re-train a neural network on R
我正在尝试用 R 上的包 nnet 拟合一些数据。
训练神经网络后,我想预测一些值,但如果我重新训练网络并再次预测,我会得到明显不同的值。
这是 copy/paste 的可重现代码,看看我在说什么。
# loading required package nnet
if(!require(nnet)){
install.packages("nnet")
library(nnet)
}
# reading data
data <- "year GDP n.households GDP.norm n.households.norm
1950 300.2 48902 -0.959913402290733 -1.64747365536208
1951 347.3 49673 -0.950771933085093 -1.61347968613569
1952 367.7 50474 -0.946812570626599 -1.57816299437132
1953 389.7 51435 -0.942542669936066 -1.53579178240432
1954 391.1 52799 -0.942270948983032 -1.47565199767698
1955 426.2 53557 -0.935458516517682 -1.44223120821706
1956 450.1 54764 -0.930819851676604 -1.38901367143853
1957 474.9 55270 -0.926006509080003 -1.36670375129774
1958 482 56149 -0.924628495675331 -1.32794798093459
1959 522.5 57436 -0.91676799667685 -1.27120318405475
1960 543.3 58406 -0.912730999660347 -1.22843515532636
1961 563.3 59236 -0.908849271759863 -1.19183983177526
1962 605.1 60813 -0.90073646044785 -1.12230871702817
1963 638.6 62214 -0.894234566214539 -1.06053757450397
1964 685.8 63401 -0.885073688369396 -1.00820185275077
1965 743.7 64778 -0.873836086097494 -0.947488888256956
1966 815 66676 -0.859997726132268 -0.863804642353359
1967 861.7 68251 -0.850933891484637 -0.794361709108803
1968 942.5 69859 -0.835251710766681 -0.723463781072457
1969 1019.9 71120 -0.820229423791807 -0.667865343725547
1970 1075.9 72867 -0.80936058567045 -0.590838801263173
1971 1167.8 74142 -0.791524045967725 -0.534623093398533
1972 1282.4 76030 -0.76928174509795 -0.451379755007599
1973 1428.5 77330 -0.740925722784913 -0.394061778361299
1974 1548.8 79108 -0.7175771294635 -0.315668422609668
1975 1688.9 80776 -0.690385625520608 -0.242125049497339
1976 1877.6 82368 -0.653761522779538 -0.171932573481255
1977 2086 83527 -0.613313918056492 -0.120831392763515
1978 2356.6 83918 -0.56079413956294 -0.103591909018359
1979 2632.1 85407 -0.507323337733769 -0.0379407803827123
1980 2862.5 85290 -0.46260583232019 -0.0430993982808793
1981 3210.9 86789 -0.394986132293754 0.0229926378674309
1982 3345 88458 -0.368959146721007 0.0965801017310265
1983 3638.1 89479 -0.31207242433941 0.141596758774005
1984 4040.7 91066 -0.233933241702662 0.211568781033757
1985 4346.7 91124 -0.174542804825252 0.214126044607207
1986 4590.1 92830 -0.127302176276358 0.289344866267659
1987 4870.2 93347 -0.0729385770300762 0.31213978467238
1988 5252.6 94312 0.00128006042718324 0.354687359644441
1989 5657.7 95669 0.0799044590514921 0.414518509112925
1990 5979.6 96391 0.142380869609787 0.446352031527254
1991 6174 96426 0.180111264802494 0.447895207821578
1992 6539.3 97107 0.251011024904839 0.477921009433985
1993 6878.7 98990 0.316883947376057 0.560943894068587
1994 7308.8 99627 0.400360505875972 0.589029702625274
1995 7664.1 101018 0.469319402028075 0.650359937636815
1996 7664.1 102528 0.469319402028075 0.716936972049055
1997 8608.5 103874 0.652614593488942 0.776283123253609
1998 9089.2 104705 0.745911923577082 0.812922537555974
1999 9660.6 108209 0.856812889693918 0.967416529993385
2000 10284.8 NA 0.977961617468032 NA
2001 10621.8 NA 1.04336873259119 NA
2002 10977.5 NA 1.1124052633013 NA
2003 11510.7 NA 1.21589212912822 NA
2004 12274.9 NA 1.36421295220572 NA
2005 13093.7 NA 1.52313089245155 NA
2006 13855.9 NA 1.671063542739 NA
2007 14477.6 NA 1.79172705452556 NA
2008 14718.6 NA 1.83850187572639 NA
2009 14418.6 NA 1.78027595721913 NA
2010 14964.4 NA 1.88620831162334 NA
2011 15517.9 NA 1.99363513126925 NA
2012 16163.2 NA 2.11887908197837 NA
2013 16768.1 NA 2.23628194232852 NA"
df <- read.table(text=data, header=TRUE)
# data for training the net
input <- data.frame(df[1:50, 4])
output <- data.frame(df[1:50, 5])
# data for predicting new values
new.data <- data.frame(df[, 4])
*************************************************************
# training the neural network
net <- nnet(x=input, y=output, size=3, linout=T)
# predicting
fitted <- predict(net, new.data)
# reconverting to have number of households
house.fitted <- sd(df$n.households, na.rm=T) * fitted + mean(df$n.households, na.rm=T)
# plot of real values against predicted values
plot(df$n.households)
lines(house.fitted, col="blue")
如果您重新运行 星号线下方的代码,您可以看到预测值在每个 运行 上有何显着差异。这里有两个图,你可以看到我指的是什么:
我尝试更改隐藏神经元的数量和最大迭代次数,但我得到了相同的行为。
我是 R 神经网络和一般神经网络的新手,所以我不知道我是否遗漏了代码或问题的一般方法。我知道 ANN 可能会陷入局部最小值,但我认为他们不应该每次都预测如此不同的值。
请让我明白我做错了什么,因为这只是我想做的许多模型中的一个,我真的很想理解人工神经网络。
正如您正确指出的那样,网络可能会陷入局部最小值。由于权重的随机初始化,最终结果可能会有很大差异。最小化泛化误差的一种方法是提前停止(即 maxit
、abstol
或 reltol
的不同参数值)。 nnet
支持的另一种方式是权重衰减。例如 decay = 0.001
和 maxit = 1000
,收敛前几乎没有停止,模型已经给出了更稳定的结果。
要获得更稳定的结果,您可以考虑使用 caret 包中的 avNNet 模型。它训练一定数量 (repeats
) 的神经网络,然后对结果进行平均。示例:
input <- data.frame(df[1:50, 4])
colnames(input) <- "input"
output <- data.frame(df[1:50, 5])
new.data <- data.frame(df[, 4])
colnames(new.data) <- "input"
library(caret)
myTrainControl <- trainControl(method = "none")
avNNet <- train(y = output$df.1.50..5.,
x = input,
tuneGrid = expand.grid(.size = 3,
.decay = 0.001,
.bag = F),
method = "avNNet", repeats = 15,
maxit = 1000, linout = T,
trControl = myTrainControl)
fitted <- predict(avNNet, new.data)
house.fitted <- sd(df$n.households, na.rm=T) * fitted + mean(df$n.households, na.rm=T)
plot(df$n.households)
lines(house.fitted, col="blue")
我正在尝试用 R 上的包 nnet 拟合一些数据。 训练神经网络后,我想预测一些值,但如果我重新训练网络并再次预测,我会得到明显不同的值。
这是 copy/paste 的可重现代码,看看我在说什么。
# loading required package nnet
if(!require(nnet)){
install.packages("nnet")
library(nnet)
}
# reading data
data <- "year GDP n.households GDP.norm n.households.norm
1950 300.2 48902 -0.959913402290733 -1.64747365536208
1951 347.3 49673 -0.950771933085093 -1.61347968613569
1952 367.7 50474 -0.946812570626599 -1.57816299437132
1953 389.7 51435 -0.942542669936066 -1.53579178240432
1954 391.1 52799 -0.942270948983032 -1.47565199767698
1955 426.2 53557 -0.935458516517682 -1.44223120821706
1956 450.1 54764 -0.930819851676604 -1.38901367143853
1957 474.9 55270 -0.926006509080003 -1.36670375129774
1958 482 56149 -0.924628495675331 -1.32794798093459
1959 522.5 57436 -0.91676799667685 -1.27120318405475
1960 543.3 58406 -0.912730999660347 -1.22843515532636
1961 563.3 59236 -0.908849271759863 -1.19183983177526
1962 605.1 60813 -0.90073646044785 -1.12230871702817
1963 638.6 62214 -0.894234566214539 -1.06053757450397
1964 685.8 63401 -0.885073688369396 -1.00820185275077
1965 743.7 64778 -0.873836086097494 -0.947488888256956
1966 815 66676 -0.859997726132268 -0.863804642353359
1967 861.7 68251 -0.850933891484637 -0.794361709108803
1968 942.5 69859 -0.835251710766681 -0.723463781072457
1969 1019.9 71120 -0.820229423791807 -0.667865343725547
1970 1075.9 72867 -0.80936058567045 -0.590838801263173
1971 1167.8 74142 -0.791524045967725 -0.534623093398533
1972 1282.4 76030 -0.76928174509795 -0.451379755007599
1973 1428.5 77330 -0.740925722784913 -0.394061778361299
1974 1548.8 79108 -0.7175771294635 -0.315668422609668
1975 1688.9 80776 -0.690385625520608 -0.242125049497339
1976 1877.6 82368 -0.653761522779538 -0.171932573481255
1977 2086 83527 -0.613313918056492 -0.120831392763515
1978 2356.6 83918 -0.56079413956294 -0.103591909018359
1979 2632.1 85407 -0.507323337733769 -0.0379407803827123
1980 2862.5 85290 -0.46260583232019 -0.0430993982808793
1981 3210.9 86789 -0.394986132293754 0.0229926378674309
1982 3345 88458 -0.368959146721007 0.0965801017310265
1983 3638.1 89479 -0.31207242433941 0.141596758774005
1984 4040.7 91066 -0.233933241702662 0.211568781033757
1985 4346.7 91124 -0.174542804825252 0.214126044607207
1986 4590.1 92830 -0.127302176276358 0.289344866267659
1987 4870.2 93347 -0.0729385770300762 0.31213978467238
1988 5252.6 94312 0.00128006042718324 0.354687359644441
1989 5657.7 95669 0.0799044590514921 0.414518509112925
1990 5979.6 96391 0.142380869609787 0.446352031527254
1991 6174 96426 0.180111264802494 0.447895207821578
1992 6539.3 97107 0.251011024904839 0.477921009433985
1993 6878.7 98990 0.316883947376057 0.560943894068587
1994 7308.8 99627 0.400360505875972 0.589029702625274
1995 7664.1 101018 0.469319402028075 0.650359937636815
1996 7664.1 102528 0.469319402028075 0.716936972049055
1997 8608.5 103874 0.652614593488942 0.776283123253609
1998 9089.2 104705 0.745911923577082 0.812922537555974
1999 9660.6 108209 0.856812889693918 0.967416529993385
2000 10284.8 NA 0.977961617468032 NA
2001 10621.8 NA 1.04336873259119 NA
2002 10977.5 NA 1.1124052633013 NA
2003 11510.7 NA 1.21589212912822 NA
2004 12274.9 NA 1.36421295220572 NA
2005 13093.7 NA 1.52313089245155 NA
2006 13855.9 NA 1.671063542739 NA
2007 14477.6 NA 1.79172705452556 NA
2008 14718.6 NA 1.83850187572639 NA
2009 14418.6 NA 1.78027595721913 NA
2010 14964.4 NA 1.88620831162334 NA
2011 15517.9 NA 1.99363513126925 NA
2012 16163.2 NA 2.11887908197837 NA
2013 16768.1 NA 2.23628194232852 NA"
df <- read.table(text=data, header=TRUE)
# data for training the net
input <- data.frame(df[1:50, 4])
output <- data.frame(df[1:50, 5])
# data for predicting new values
new.data <- data.frame(df[, 4])
*************************************************************
# training the neural network
net <- nnet(x=input, y=output, size=3, linout=T)
# predicting
fitted <- predict(net, new.data)
# reconverting to have number of households
house.fitted <- sd(df$n.households, na.rm=T) * fitted + mean(df$n.households, na.rm=T)
# plot of real values against predicted values
plot(df$n.households)
lines(house.fitted, col="blue")
如果您重新运行 星号线下方的代码,您可以看到预测值在每个 运行 上有何显着差异。这里有两个图,你可以看到我指的是什么:
我尝试更改隐藏神经元的数量和最大迭代次数,但我得到了相同的行为。
我是 R 神经网络和一般神经网络的新手,所以我不知道我是否遗漏了代码或问题的一般方法。我知道 ANN 可能会陷入局部最小值,但我认为他们不应该每次都预测如此不同的值。
请让我明白我做错了什么,因为这只是我想做的许多模型中的一个,我真的很想理解人工神经网络。
正如您正确指出的那样,网络可能会陷入局部最小值。由于权重的随机初始化,最终结果可能会有很大差异。最小化泛化误差的一种方法是提前停止(即 maxit
、abstol
或 reltol
的不同参数值)。 nnet
支持的另一种方式是权重衰减。例如 decay = 0.001
和 maxit = 1000
,收敛前几乎没有停止,模型已经给出了更稳定的结果。
要获得更稳定的结果,您可以考虑使用 caret 包中的 avNNet 模型。它训练一定数量 (repeats
) 的神经网络,然后对结果进行平均。示例:
input <- data.frame(df[1:50, 4])
colnames(input) <- "input"
output <- data.frame(df[1:50, 5])
new.data <- data.frame(df[, 4])
colnames(new.data) <- "input"
library(caret)
myTrainControl <- trainControl(method = "none")
avNNet <- train(y = output$df.1.50..5.,
x = input,
tuneGrid = expand.grid(.size = 3,
.decay = 0.001,
.bag = F),
method = "avNNet", repeats = 15,
maxit = 1000, linout = T,
trControl = myTrainControl)
fitted <- predict(avNNet, new.data)
house.fitted <- sd(df$n.households, na.rm=T) * fitted + mean(df$n.households, na.rm=T)
plot(df$n.households)
lines(house.fitted, col="blue")