Gradient Descent algorithm taking long time to complete - Efficiency - Python
I'm trying to implement the gradient descent algorithm in Python. My code is below:
import numpy as np
import time

def grad_des(xvalues, yvalues, R=0.01, epsilon=0.0001, MaxIterations=1000):
    xvalues = np.array(xvalues)
    yvalues = np.array(yvalues)
    length = len(xvalues)
    alpha = 1
    beta = 1
    converged = False
    i = 0
    cost = sum([(alpha + beta*xvalues[i] - yvalues[i])**2 for i in range(length)]) / (2 * length)
    start_time = time.time()
    while not converged:
        alpha_deriv = sum([(alpha + beta*xvalues[i] - yvalues[i]) for i in range(length)]) / length
        beta_deriv = sum([(alpha + beta*xvalues[i] - yvalues[i])*xvalues[i] for i in range(length)]) / length
        alpha = alpha - R * alpha_deriv
        beta = beta - R * beta_deriv
        new_cost = sum([(alpha + beta*xvalues[i] - yvalues[i])**2 for i in range(length)]) / (2*length)
        if abs(cost - new_cost) <= epsilon:
            print 'Converged'
            print 'Number of Iterations:', i
            converged = True
        cost = new_cost
        i = i + 1
        if i == MaxIterations:
            print 'Maximum Iterations Exceeded'
            converged = True
    print "Time taken: " + str(round(time.time() - start_time, 2)) + " seconds"
    return alpha, beta
This code works fine. The problem is that it takes more than 25 seconds for around 600 iterations. That doesn't feel efficient enough to me. I tried converting the inputs to arrays before doing the calculations, which did cut the time from 300 seconds down to 25 seconds, but I still think it can be reduced further. Can anyone help me improve this algorithm?
Thanks
As I commented, I can't reproduce the slowness, but here are some potential problems:
It looks like length doesn't change, yet you call range(length) repeatedly. In Python 2.x, range creates a list, and doing that over and over slows things down (object creation isn't cheap). Use xrange (or import a Py3-compatible iterator range from six or future), and create the range once up front instead of on every call.
i is reused here in a way that can cause problems. You're trying to use it as the overall iteration count, but every list comprehension that uses i overwrites i in the function's scope, which means the "iteration" count will always end up at length - 1.
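To illustrate that second point: Python 3 comprehensions no longer leak their loop variable, but in Python 2 a list comprehension rebinds i in the enclosing scope exactly like the explicit for loop in this minimal sketch does (in both versions):

```python
i = 0       # intended as the overall iteration counter
length = 5

# In Python 2, `[... for i in range(length)]` rebinds i in the
# enclosing scope just like this explicit loop:
for i in range(length):
    pass

print(i)  # i is now length - 1 (here 4), not the iteration count
```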
The lowest-hanging fruit I can see is vectorization. You have lots of list comprehensions; they're faster than for loops, but they take no advantage of properly used numpy arrays.
def grad_des_vec(xvalues, yvalues, R=0.01, epsilon=0.0001, MaxIterations=1000):
    xvalues = np.array(xvalues)
    yvalues = np.array(yvalues)
    length = len(xvalues)
    alpha = 1
    beta = 1
    converged = False
    i = 0
    cost = np.sum((alpha + beta * xvalues - yvalues)**2) / (2 * length)
    start_time = time.time()
    while not converged:
        alpha_deriv = np.sum(alpha + beta * xvalues - yvalues) / length
        beta_deriv = np.sum((alpha + beta * xvalues - yvalues) * xvalues) / length
        alpha = alpha - R * alpha_deriv
        beta = beta - R * beta_deriv
        new_cost = np.sum((alpha + beta * xvalues - yvalues)**2) / (2 * length)
        if abs(cost - new_cost) <= epsilon:
            print('Converged')
            print('Number of Iterations:', i)
            converged = True
        cost = new_cost
        i = i + 1
        if i == MaxIterations:
            print('Maximum Iterations Exceeded')
            converged = True
    print("Time taken: " + str(round(time.time() - start_time, 2)) + " seconds")
    return alpha, beta
For comparison:
In[47]: grad_des(xval, yval)
Converged
Number of Iterations: 198
Time taken: 0.66 seconds
Out[47]:
(0.28264882215511067, 0.53289263416071131)
In [48]: grad_des_vec(xval, yval)
Converged
Number of Iterations: 198
Time taken: 0.03 seconds
Out[48]:
(0.28264882215511078, 0.5328926341607112)
That's about a 20x speedup (xval and yval are both 1024-element arrays).
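One further tweak worth trying (my own suggestion, not part of the timing above): the residual alpha + beta * xvalues - yvalues is evaluated three times per iteration in grad_des_vec. Computing it once and reusing it saves two full passes over the data per iteration. A sketch, with timing and prints stripped out:

```python
import numpy as np

def grad_des_vec2(xvalues, yvalues, R=0.01, epsilon=0.0001, MaxIterations=1000):
    x = np.asarray(xvalues, dtype=float)
    y = np.asarray(yvalues, dtype=float)
    n = len(x)
    alpha = beta = 1.0
    residual = alpha + beta * x - y            # computed once per iteration
    cost = np.sum(residual**2) / (2 * n)
    for _ in range(MaxIterations):
        alpha -= R * np.sum(residual) / n      # both gradients reuse residual
        beta -= R * np.sum(residual * x) / n
        residual = alpha + beta * x - y
        new_cost = np.sum(residual**2) / (2 * n)
        if abs(cost - new_cost) <= epsilon:
            break
        cost = new_cost
    return alpha, beta
```

The update order is the same as in grad_des_vec: gradients use the residual from the previous parameters, then the parameters are updated and the cost is recomputed.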