Dask: how is the memory limit calculated in "auto" mode?
The documentation gives the following formula for "auto" mode:
$ dask-worker .. --memory-limit=auto # TOTAL_MEMORY * min(1, nthreads / total_nthreads)
My CPU specs:
Architecture: x86_64
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
My memory specs:
MemTotal: 16282416 kB
MemFree: 1142108 kB
MemAvailable: 9397036 kB
When I run the dask-worker command, it prints the following output:
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 1
distributed.worker - INFO - Memory: 3.88 GiB
distributed.worker - INFO - -------------------------------------------------
Can someone explain how the 3.88 GiB memory value was arrived at? It does not seem to match the formula above.
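I suspect that nthreads refers to how many threads this particular worker has available for scheduling tasks, while total_nthreads refers to the total number of threads available on your system.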
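The dask-worker CLI command has the same defaults as LocalCluster (see the GitHub issue). Assuming the defaults, LocalCluster starts n workers, where n is the number of cores available on your system, and assigns m threads to each worker, where m is the number of threads per core: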
n = 4 # number of cores
m = 1 # number of threads per core
TOTAL_MEMORY = 16282416 kB
TOTAL_MEMORY * min(1, 1 / 4)
> 4070604
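4070604 kB is about 3.88 GiB (the kB reported in /proc/meminfo are units of 1024 bytes, so 4070604 / 1024 / 1024 ≈ 3.88), which matches the value in the worker log.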
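The same arithmetic can be checked with a short Python sketch. It assumes the kB figures in /proc/meminfo are units of 1024 bytes, and `auto_memory_limit` is a hypothetical helper that only mirrors the documented formula; it is not Dask's own function.

```python
def auto_memory_limit(total_memory_bytes, nthreads, total_nthreads):
    """Hypothetical helper mirroring the documented formula:
    TOTAL_MEMORY * min(1, nthreads / total_nthreads)."""
    return int(total_memory_bytes * min(1, nthreads / total_nthreads))


# MemTotal from /proc/meminfo; assumed to be in units of 1024 bytes.
mem_total_kib = 16282416
total_memory = mem_total_kib * 1024

# One thread for this worker, four threads available on the machine.
limit = auto_memory_limit(total_memory, nthreads=1, total_nthreads=4)
limit_gib = limit / 2**30

print(f"{limit_gib:.2f} GiB")  # 3.88 GiB
```

The memory total that Dask reads at runtime may differ slightly from MemTotal, so treat this as a consistency check rather than an exact reproduction.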
See the documentation here:
https://docs.dask.org/en/latest/deploying-cli.html#dask-worker
--nthreads
Number of threads per process.
--nprocs
Deprecated. Use ‘--nworkers’ instead. Number of worker processes to
launch. If negative, then (CPU_COUNT + 1 + nprocs) is used. Set to
‘auto’ to set nprocs and nthreads dynamically based on CPU_COUNT
--nworkers <n_workers>
Number of worker processes to launch. If negative, then (CPU_COUNT +
1 + nworkers) is used. Set to ‘auto’ to set nworkers and nthreads
dynamically based on CPU_COUNT
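The rule for negative values quoted above can be sketched in Python as follows. `resolve_nworkers` is a hypothetical helper written only to illustrate the documented expression (CPU_COUNT + 1 + nworkers); it does not cover the ‘auto’ setting and is not taken from the Dask source.

```python
def resolve_nworkers(nworkers, cpu_count):
    """Hypothetical illustration of the documented rule:
    a negative value means CPU_COUNT + 1 + nworkers."""
    if nworkers < 0:
        return cpu_count + 1 + nworkers
    return nworkers


# On the 4-CPU machine from the question:
print(resolve_nworkers(-1, cpu_count=4))  # 4
print(resolve_nworkers(-2, cpu_count=4))  # 3
print(resolve_nworkers(2, cpu_count=4))   # 2
```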
See also the LocalCluster source to understand how the defaults are set.