Find the speed of download for a progressbar
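I'm writing a script to download videos from a website. I've added a report hook to get the download progress. So far, it shows the percentage and the size of the downloaded data. I thought it would be interesting to add the download speed and ETA as well.
The problem is, if I use a simple speed = chunk_size/time, the speeds shown are accurate enough but jump around like crazy. So I've used a history of the time taken to download individual chunks, something like speed = chunk_size*n/sum(n_time_history).
Now it shows a stable download speed, but it is certainly wrong, because its value is a few bits/s while the downloaded file visibly grows much faster.
Can somebody tell me where I'm going wrong?
Here's my code.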
import sys
import time

def dlProgress(count, blockSize, totalSize):
    global init_count
    global time_history
    try:
        time_history.append(time.monotonic())
    except NameError:
        time_history = [time.monotonic()]
    try:
        init_count
    except NameError:
        init_count = count
    percent = count*blockSize*100/totalSize
    dl, dlu = unitsize(count*blockSize) #returns size in kB, MB, GB, etc.
    tdl, tdlu = unitsize(totalSize)
    count -= init_count #because continuation of partial downloads is supported
    if count > 0:
        n = 5 #length of time history to consider
        _count = n if count > n else count
        time_history = time_history[-(_count+1):] #_count intervals need _count+1 timestamps
        time_diff = [i-j for i,j in zip(time_history[1:],time_history[:-1])]
        speed = blockSize*_count / sum(time_diff)
    else: speed = 0
    n = int(percent//4)
    try:
        eta = format_time((totalSize-blockSize*(count+1))//speed)
    except:
        eta = '>1 day'
    speed, speedu = unitsize(speed, True) #returns speed in B/s, kB/s, MB/s, etc.
    sys.stdout.write("\r" + str(percent) + "% |" + "#"*n + " "*(25-n) + "| " + dl + dlu + "/" + tdl + tdlu + speed + speedu + eta)
    sys.stdout.flush()
Edit:
Corrected the logic. Download speed shown is now much better.
As I increase the length of history used to calculate the speed, the stability increases but sudden changes in speed (if download stops, etc.) aren't shown.
How do I make it stable, yet sensitive to large changes?
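I realize the question is now more oriented towards math, but it would be great if someone could help me out or point me in the right direction.
Also, please tell me if there's a more efficient way to accomplish this.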
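To make the trade-off described in the edit concrete, here is a minimal sketch of a plain moving-window speed estimate. The function name `window_speed` and the sample timeline are made up for illustration and are not part of the script above; the only assumption is that one monotonic timestamp is recorded per received block.

```python
def window_speed(timestamps, block_size, n):
    """Average speed in bytes/s over the last n block intervals.

    timestamps: monotonic arrival times, one per received block.
    """
    recent = timestamps[-(n + 1):]  # n intervals need n + 1 timestamps
    intervals = len(recent) - 1
    elapsed = recent[-1] - recent[0] if intervals > 0 else 0.0
    if elapsed <= 0:
        return 0.0  # not enough data yet
    return block_size * intervals / elapsed


# Hypothetical timeline: ten blocks arrive 0.1 s apart,
# then one more block arrives after a 2 s stall.
stamps = [i * 0.1 for i in range(10)] + [2.9]
short_window = window_speed(stamps, 8192, 2)
long_window = window_speed(stamps, 8192, 8)
```

With this timeline the short window drops much further after the stall than the long one, which is the behaviour the edit describes: a longer history is steadier but slower to show a sudden change.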
_count = n if count > n else count
time_history = time_history[-(_count+1):] #_count intervals need _count+1 timestamps
time_weights = list(range(1,len(time_history))) #just a simple linear weights
time_diff = [(i-j)*k for i,j,k in zip(time_history[1:], time_history[:-1],time_weights)] #newest interval gets the largest weight
speed = blockSize*(sum(time_weights)) / sum(time_diff)
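The snippet above relies on the globals of dlProgress, so here is the same linear weighting as a standalone function that can be tried in isolation. This is only a sketch: the name `weighted_speed` and its parameters are assumptions made for the example.

```python
def weighted_speed(timestamps, block_size, n=5):
    """Speed in bytes/s from the last n timestamps, newest intervals weighted most."""
    recent = timestamps[-n:]
    # Time taken by each block, oldest first.
    intervals = [b - a for a, b in zip(recent, recent[1:])]
    # Linear weights: 1 for the oldest interval, len(intervals) for the newest.
    weights = range(1, len(intervals) + 1)
    weighted_time = sum(dt * w for dt, w in zip(intervals, weights))
    if weighted_time <= 0:
        return 0.0  # fewer than two timestamps, or no time elapsed
    return block_size * sum(weights) / weighted_time


steady = weighted_speed([0, 1, 2, 3], 100)    # every block takes 1 s
slowing = weighted_speed([0, 1, 2, 4], 100)   # the last block takes 2 s
```

When every block takes the same time the weights cancel out and the result equals the plain average. When the most recent block is slower, the weighted figure falls below the plain average of 75 bytes/s, because the newest interval counts three times as much as the oldest.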
To make it more stable, so that it does not react when the download spikes up or down, you could add this as well:
_count = n if count > n else count
time_history = time_history[-(_count+1):] #_count intervals need _count+1 timestamps
time_diff = [i-j for i,j in zip(time_history[1:], time_history[:-1])]
if len(time_diff) > 2: #timestamps only ever increase, so the extremes are dropped from the intervals
    time_diff.remove(min(time_diff))
    time_diff.remove(max(time_diff))
time_weights = list(range(1, len(time_diff)+1)) #just a simple linear weights
time_diff = [d*k for d,k in zip(time_diff, time_weights)]
speed = blockSize*(sum(time_weights)) / sum(time_diff)
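This removes the highest and lowest peaks in time_history, which makes the number shown more stable. If you want to be picky, you could probably generate the weights before the removal and then filter the mapped values using time_diff.index(min(time_diff)).
Also, using a non-linear function (like sqrt()) to generate the weights will give you better results. Oh, and as I said in the comments: adding statistical methods to filter the times should be marginally better, but I suspect it's not worth the overhead it would add.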
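Putting the two refinements together, a rough sketch could look like the following. It assigns sqrt() weights first and only then drops the fastest and slowest interval, so each surviving interval keeps the weight of its original position. The function name `trimmed_weighted_speed` and the default window of 7 timestamps are assumptions for illustration, not something taken from the script above.

```python
import math


def trimmed_weighted_speed(timestamps, block_size, n=7):
    """Speed in bytes/s with sqrt weights, ignoring the extreme intervals."""
    recent = timestamps[-n:]
    intervals = [b - a for a, b in zip(recent, recent[1:])]
    # Weights are generated before filtering: sqrt(1) for the oldest interval,
    # sqrt(len(intervals)) for the newest.
    weights = [math.sqrt(k) for k in range(1, len(intervals) + 1)]
    pairs = list(zip(intervals, weights))
    if len(pairs) > 2:
        fastest = intervals.index(min(intervals))
        slowest = intervals.index(max(intervals))
        pairs = [p for idx, p in enumerate(pairs) if idx not in (fastest, slowest)]
    weighted_time = sum(dt * w for dt, w in pairs)
    if weighted_time <= 0:
        return 0.0  # not enough data yet
    return block_size * sum(w for _, w in pairs) / weighted_time


# Hypothetical timeline with one 10 s stall and one near-instant block.
speed = trimmed_weighted_speed([0, 1, 2, 3, 13, 14, 14.01], 100)
```

In this timeline the stall and the near-instant block are both discarded, so the estimate comes out at the steady rate of one block per second. The flip side is that a real, lasting slowdown is also ignored until it fills more than one interval of the window.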