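Storing each iteration in a gradient descent function in order to visualise the parameter updating process and the cost convergence process in R
I am trying to write a batch gradient descent function in R for use with training and test datasets. So far I have the code below. However, when I run it, it only prints the final parameters and the number of iterations it ran. I would like to store each iteration and the test error, and to be able to visualise the cost convergence process. I am not sure where to put the code, or how to incorporate it into the function below.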
GradD <- function(x, y, alpha = 0.006, epsilon = 10^-10){
  iter <- 0
  i <- 0
  x <- cbind(rep(1, nrow(x)), x)
  theta <- matrix(c(1,1), ncol(x), 1)
  cost <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
  delta <- 1
  while (delta > epsilon){
    i <- i + 1
    theta <- theta - (alpha / nrow(x)) * (t(x) %*% (x %*% theta - y))
    cval <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
    cost <- append(cost, cval)
    delta <- abs(cost[i+1] - cost[i])
    if((cost[i+1] - cost[i]) > 0){
      print("The cost is increasing. Try reducing alpha.")
      return()
    }
    iter <- append(iter, i)
  }
  print(sprintf("Completed in %i iterations.", i))
  return(theta)
}
TPredict <- function(theta, x){
  x <- cbind(rep(1,nrow(x)), x)
  return(x %*% theta)
}
Edit
I tried to create a list containing each iteration, but now I get an error when I run the code:
error.cost <- function(x, y, theta){
  sum( (X %*% theta - y)^2 ) / (2*length(y))
}
num_iters <- 2000
cost_history <- double(num_iters)
theta_history <- list(num_iters)
GradD <- function(x, y, alpha = 0.006, epsilon = 10^-10){
  iter <- 2000
  i <- 0
  x <- cbind(rep(1,nrow(x)), x)
  theta <- matrix(c(1,1),ncol(x),1)
  cost <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
  delta <- 1
  while(delta > epsilon){
    i <- i + 1
    theta <- theta - (alpha / nrow(x)) * (t(x) %*% (x %*% theta - y))
    cval <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
    cost <- append(cost, cval)
    delta <- abs(cost[i+1] - cost[i])
    cost_history[i] <- error.cost(x, y, theta)
    theta_history[[i]] <- theta
    if((cost[i+1] - cost[i]) > 0){
      print("The cost is increasing. Try reducing alpha.")
      return()
    }
    iter <- append(iter, i)
  }
  print(sprintf("Completed in %i iterations.", i))
  return(theta)
}
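I get the error "non-conformable arguments" in nrow(x) %*% theta. If I remove nrow() from this function:
error.cost <- function(x, y, theta){
  sum( (x %*% theta - y)^2 ) / (2*length(y))
}
then it prints results, but they are the wrong final results, and I am not storing the iterations at all.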
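The following mostly uses your code and is only a proof of concept for capturing the history. The alpha value was found by fishing around for one that gets past the "if cost increasing" check, so that more than one or two records are created, which appears to have been the original problem: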
GradD2 <- function(x, y, alpha = 0.0000056, epsilon = 10^-10) {
  cost <- vector(mode = 'numeric')
  iter <- vector(mode = 'integer')
  delta_hist <- vector(mode = 'numeric')
  i <- 0
  iter <- 0
  x <- cbind(rep(1, nrow(x)), x)
  theta <- matrix(c(1,1), ncol(x), 1)
  cost <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
  delta <- 1.0
  while(length(iter) < 1000 ) { #todo - change back to while(delta>epsilon)
    i <- i + 1
    theta <- theta - (alpha / nrow(x)) * (t(x) %*% (x %*% theta - y))
    cval <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
    cost <- append(cost, cval, after = length(cost))
    delta <- abs(cost[i+1] - cost[i])
    delta_hist <- append(delta_hist, delta, after = length(delta_hist))
    if((cost[i+1] - cost[i]) > 0) {
      print('The cost is increasing. Try reducing alpha.')
      return(list(theta = theta, cost = cost, delta_hist = delta_hist, iter = iter))
    }
    iter <- append(iter, i, after = length(iter))
  }
  print(sprintf('Completed in %i iterations.', i))
  return(list(theta = theta, cost = cost, delta_hist = delta_hist, iter = iter))
}
Results:
> str(GradD2_tst)
List of 4
$ theta : num [1:2, 1] 1.693 0.707
$ cost : num [1:1000] 88564 87541 86587 85698 84870 ...
$ delta_hist: num [1:999] 1024 954 889 828 771 ...
$ iter : num [1:1000] 0 1 2 3 4 5 6 7 8 9 ...
> GradD2_tst$delta_hist[994:999]
[1] 0.08580117 0.08580094 0.08580071 0.08580048 0.08580025 0.08580002
> GradD2_tst$cost[994:999]
[1] 73493.72 73493.63 73493.54 73493.46 73493.37 73493.29
>
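Since the original goal was to visualise convergence, here is a minimal sketch of how a list shaped like the one returned by GradD2 could be plotted with base R graphics. The helper name plot_gd_history is hypothetical, and the stand-in list res holds only the first five cost values printed by str() above; with the real object you would pass GradD2_tst instead.

```r
# Sketch: plot cost and per-step change from a list with a `cost` element,
# shaped like the return value of GradD2(). plot_gd_history is a hypothetical name.
plot_gd_history <- function(res) {
  # Iteration 0 is the cost at the starting theta.
  hist_df <- data.frame(iter = seq_along(res$cost) - 1, cost = res$cost)
  # Absolute change in cost between consecutive iterations.
  delta <- abs(diff(res$cost))

  op <- par(mfrow = c(1, 2))
  on.exit(par(op))                      # restore the plotting layout afterwards
  plot(hist_df$iter, hist_df$cost, type = "l",
       xlab = "Iteration", ylab = "Cost", main = "Cost convergence")
  ok <- delta > 0                       # a log axis cannot show zero changes
  plot(which(ok), delta[ok], type = "l", log = "y",
       xlab = "Iteration", ylab = "|change in cost|", main = "Step-to-step change")
  invisible(hist_df)
}

# Stand-in for GradD2_tst: the first five cost values shown by str() above.
res <- list(cost = c(88564, 87541, 86587, 85698, 84870))
hist_df <- plot_gd_history(res)
```

The log-scaled second panel makes it easier to see how slowly the change in cost shrinks, which is what the stopping rule delta > epsilon depends on.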
In case it is not clear, my x and y were:
> x <- as.matrix(sample(1:1000, 400), ncol =1)
> y <- sample(1:1000, 400)
This selection of x, when used with the proper while(delta > epsilon) {, presented a mismatch on cval that threw a warning, and now it does not. Darn.
Fishing for alpha again, with epsilon 10^-1:
> GradD2_tst2 <- GradD2(x,y, alpha=0.006, epsilon = 10^-1)
[1] "The cost is increasing. Try reducing alpha."
> GradD2_tst2 <- GradD2(x,y, alpha=0.001, epsilon = 10^-1)
[1] "The cost is increasing. Try reducing alpha."
> GradD2_tst2 <- GradD2(x,y, alpha=0.0006, epsilon = 10^-1)
[1] "The cost is increasing. Try reducing alpha."
> GradD2_tst2 <- GradD2(x,y, alpha=0.00006, epsilon = 10^-1)
[1] "The cost is increasing. Try reducing alpha."
> GradD2_tst2 <- GradD2(x,y, alpha=0.000056, epsilon = 10^-1)
[1] "The cost is increasing. Try reducing alpha."
> GradD2_tst2 <- GradD2(x,y, alpha=0.000009, epsilon = 10^-1)
[1] "The cost is increasing. Try reducing alpha."
> GradD2_tst2 <- GradD2(x,y, alpha=0.0000056, epsilon = 10^-1)
[1] "Completed in 160 iterations."
>
> str(GradD2_tst2)
List of 4
$ theta : num [1:2, 1] 1.111 0.709
$ cost : num [1:161] 88564 87541 86587 85698 84870 ...
$ delta_hist: num [1:160] 1024 954 889 828 771 ...
$ iter : num [1:161] 0 1 2 3 4 5 6 7 8 9 ...
> GradD2_tst2$cost[155:161]
[1] 73566.06 73565.96 73565.85 73565.75 73565.65 73565.55 73565.45
> GradD2_tst2$delta_hist[155:161]
[1] 0.10493825 0.10364385 0.10243786 0.10131425 0.10026738 0.09929201 NA
>
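The trial-and-error search for alpha in the transcript above can probably be shortened. For this least-squares cost the Hessian is t(X) %*% X / n, and a gradient step is guaranteed to lower the cost only when alpha is below 2 divided by the Hessian's largest eigenvalue. The sketch below computes that bound; max_stable_alpha is a hypothetical helper name and the seed is arbitrary, so the x drawn here is not the same sample used in the answer.

```r
# Sketch: upper bound on the learning rate for least-squares gradient descent.
# max_stable_alpha is a hypothetical helper, not part of the original code.
max_stable_alpha <- function(x) {
  X <- cbind(1, as.matrix(x))          # same intercept column that GradD2 adds
  H <- crossprod(X) / nrow(X)          # Hessian of the cost: t(X) %*% X / n
  lambda_max <- max(eigen(H, symmetric = TRUE, only.values = TRUE)$values)
  2 / lambda_max                       # every step lowers the cost below this alpha
}

set.seed(1)                            # arbitrary seed, for reproducibility only
x <- as.matrix(sample(1:1000, 400))    # same kind of data as in the answer
alpha_max <- max_stable_alpha(x)
alpha_max
```

For x values drawn from 1:1000 the bound comes out in the region of a few millionths, which is consistent with 0.0000056 passing and 0.000009 failing above. The large spread between the Hessian's eigenvalues is also why convergence is so slow; standardising x first (for example with scale()) would likely allow a much larger alpha and far fewer iterations.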
With epsilon = 10^-10 this takes a long time to run on one core. Mine never finished overnight, so I hit Ctrl-C. HTH regarding the history.
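To also cover the parameter history and the test error asked about in the question, here is a hedged sketch of a variant that preallocates its storage, caps the number of iterations, and returns everything in one list. It is not the code from the answer: gd_with_history, its max_iter argument, and the toy data (standard normal x, so that a large alpha such as 0.1 is usable) are all assumptions made for illustration.

```r
# Sketch only: gd_with_history and its arguments are hypothetical names.
gd_with_history <- function(x, y, x_test = NULL, y_test = NULL,
                            alpha = 0.1, epsilon = 1e-10, max_iter = 10000) {
  cost_fn <- function(X, y, theta) sum((X %*% theta - y)^2) / (2 * length(y))
  X <- cbind(1, as.matrix(x))                       # add intercept column
  has_test <- !is.null(x_test) && !is.null(y_test)
  if (has_test) X_test <- cbind(1, as.matrix(x_test))

  theta <- matrix(1, ncol(X), 1)
  # Row / element 1 holds the starting point (iteration 0).
  theta_hist <- matrix(NA_real_, max_iter + 1, ncol(X))
  cost_hist  <- rep(NA_real_, max_iter + 1)
  test_hist  <- rep(NA_real_, max_iter + 1)
  theta_hist[1, ] <- theta
  cost_hist[1] <- cost_fn(X, y, theta)
  if (has_test) test_hist[1] <- cost_fn(X_test, y_test, theta)

  i <- 0
  delta <- Inf
  while (delta > epsilon && i < max_iter) {
    i <- i + 1
    theta <- theta - (alpha / nrow(X)) * (t(X) %*% (X %*% theta - y))
    theta_hist[i + 1, ] <- theta
    cost_hist[i + 1] <- cost_fn(X, y, theta)
    if (has_test) test_hist[i + 1] <- cost_fn(X_test, y_test, theta)
    if (cost_hist[i + 1] > cost_hist[i]) {
      warning("The cost is increasing. Try reducing alpha.")
      break                                         # still return the history so far
    }
    delta <- abs(cost_hist[i + 1] - cost_hist[i])
  }
  keep <- seq_len(i + 1)                            # drop the unused preallocated slots
  list(theta = theta, iter = keep - 1,
       theta_hist = theta_hist[keep, , drop = FALSE],
       cost = cost_hist[keep], test_cost = test_hist[keep])
}

# Toy data (an assumption): y is roughly 3 + 2 * x plus noise.
set.seed(42)
x <- matrix(rnorm(200), ncol = 1)
y <- 3 + 2 * x[, 1] + rnorm(200, sd = 0.5)
train <- 1:150
fit <- gd_with_history(x[train, , drop = FALSE], y[train],
                       x[-train, , drop = FALSE], y[-train])

# Training and test cost on one plot, then the path of each parameter.
matplot(fit$iter, cbind(fit$cost, fit$test_cost), type = "l", lty = 1:2,
        xlab = "Iteration", ylab = "Cost")
matplot(fit$iter, fit$theta_hist, type = "l", lty = 1,
        xlab = "Iteration", ylab = "theta")
```

Returning the histories from the function, rather than assigning to cost_history and theta_history inside it, matters because an assignment made with <- inside an R function changes only a local copy. That is the likely reason the edited code in the question stored nothing.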