CPU affinity issue using Python API for MOSEK
I am running into a CPU affinity problem when solving linear integer programs with MOSEK. My program parallelizes using the multiprocessing module in Python, so MOSEK runs concurrently in each process. The machine has 48 cores, so I run 48 concurrent processes using the Pool class. Their documentation states that the API is thread safe.
Below is the output from top after starting the program. It shows that roughly 50% of the CPU is idle. Only the first 20 lines of the top output are shown.
top - 22:04:42 up 5 days, 14:38, 3 users, load average: 10.67, 13.65, 6.29
Tasks: 613 total, 47 running, 566 sleeping, 0 stopped, 0 zombie
%Cpu(s): 46.3 us, 3.8 sy, 0.0 ni, 49.2 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
GiB Mem: 503.863 total, 101.613 used, 402.250 free, 0.482 buffers
GiB Swap: 61.035 total, 0.000 used, 61.035 free. 96.250 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
115517 njmeyer 20 0 171752 27912 11632 R 98.7 0.0 0:02.52 python
115522 njmeyer 20 0 171088 27472 11632 R 98.7 0.0 0:02.79 python
115547 njmeyer 20 0 171140 27460 11568 R 98.7 0.0 0:01.82 python
115550 njmeyer 20 0 171784 27880 11568 R 98.7 0.0 0:01.64 python
115540 njmeyer 20 0 171136 27456 11568 R 92.5 0.0 0:01.91 python
115551 njmeyer 20 0 371636 31100 11632 R 92.5 0.0 0:02.93 python
115539 njmeyer 20 0 171132 27452 11568 R 80.2 0.0 0:01.97 python
115515 njmeyer 20 0 171748 27908 11632 R 74.0 0.0 0:03.02 python
115538 njmeyer 20 0 171128 27512 11632 R 74.0 0.0 0:02.51 python
115558 njmeyer 20 0 171144 27528 11632 R 74.0 0.0 0:02.28 python
115554 njmeyer 20 0 527980 28728 11632 R 67.8 0.0 0:02.15 python
115524 njmeyer 20 0 527956 28676 11632 R 61.7 0.0 0:02.34 python
115526 njmeyer 20 0 527956 28704 11632 R 61.7 0.0 0:02.80 python
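I checked the MOSEK parameters section of the documentation, but did not see anything related to CPU affinity. There are some flags related to multithreading within the optimizer. These flags are off by default, and redundantly setting them to off changes nothing.
I checked the CPU affinity of the running python jobs, and many of them are bound to the same CPU. Strangely, though, I cannot set the CPU affinity, or at least it seems to be changed again soon after I change it.
I picked one of the jobs and set its CPU affinity by running taskset -p 0xFFFFFFFFFFFF 115526. I did this 10 times with 1 second in between. Here is the CPU affinity mask after each taskset call.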
pid 115526's current affinity mask: 10
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 7
pid 115526's current affinity mask: 800000000000
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
pid 115526's current affinity mask: 800000000000
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
pid 115526's current affinity mask: ffffffffffff
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
pid 115526's current affinity mask: ffffffffffff
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
pid 115526's current affinity mask: ffffffffffff
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
pid 115526's current affinity mask: 200000000000
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 47
pid 115526's current affinity mask: ffffffffffff
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
pid 115526's current affinity mask: 800000000000
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
pid 115526's current affinity mask: 800000000000
pid 115526's new affinity mask: ffffffffffff
pid 115526's current affinity list: 0-47
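It seems that something keeps changing the CPU affinity at run time.
I also tried setting the CPU affinity of the parent process, with the same result.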
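For readers following the taskset output above, the hexadecimal masks can be decoded into CPU indices: bit n of the mask corresponds to CPU n. The helper below is only a sketch for reading the listing (the name cpus_from_mask is made up for illustration and is not part of taskset or MOSEK).

```python
def cpus_from_mask(mask_hex):
    """Return the CPU indices whose bits are set in a hex affinity mask."""
    mask = int(mask_hex, 16)
    # Bit n set means the process is allowed to run on CPU n.
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


if __name__ == "__main__":
    # Masks taken from the taskset output in the question.
    for mask_hex in ("10", "800000000000", "200000000000", "ffffffffffff"):
        cpus = cpus_from_mask(mask_hex)
        label = cpus if len(cpus) <= 4 else "%d CPUs" % len(cpus)
        print(mask_hex, "->", label)
```

Read this way, a mask such as 800000000000 pins the process to a single CPU, while ffffffffffff allows all 48.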
Here is the code I am running.
import mosek
import sys
import cPickle as pickle
import multiprocessing
import time


def mosekOptim(aCols,aVals,b,c,nCon,nVar,numTrt):
    """Solve the linear integer program.

    Solve the program
        max c' x
        s.t. Ax <= b
    """
    ## setup mosek
    with mosek.Env() as env, env.Task() as task:
        task.appendcons(nCon)
        task.appendvars(nVar)
        inf = float("inf")

        ## c
        for j,cj in enumerate(c):
            task.putcj(j,cj)

        ## bounds on A
        bkc = [mosek.boundkey.fx] + [mosek.boundkey.up
                                     for i in range(nCon-1)]
        blc = [float(numTrt)] + [-inf for i in range(nCon-1)]
        buc = b

        ## bounds on x
        bkx = [mosek.boundkey.ra for i in range(nVar)]
        blx = [0.0]*nVar
        bux = [1.0]*nVar

        ## rows of A, then constraint and variable bounds
        for j,a in enumerate(zip(aCols,aVals)):
            task.putarow(j,a[0],a[1])

        for j,bc in enumerate(zip(bkc,blc,buc)):
            task.putconbound(j,bc[0],bc[1],bc[2])

        for j,bx in enumerate(zip(bkx,blx,bux)):
            task.putvarbound(j,bx[0],bx[1],bx[2])

        task.putobjsense(mosek.objsense.maximize)

        ## integer type
        task.putvartypelist(range(nVar),
                            [mosek.variabletype.type_int
                             for i in range(nVar)])

        task.optimize()

        task.solutionsummary(mosek.streamtype.msg)

        prosta = task.getprosta(mosek.soltype.itg)
        solsta = task.getsolsta(mosek.soltype.itg)

        xx = mosek.array.zeros(nVar,float)
        task.getxx(mosek.soltype.itg,xx)

        if solsta not in [ mosek.solsta.integer_optimal,
                           mosek.solsta.near_integer_optimal ]:
            raise ValueError("Non optimal or infeasible: %s" % solsta)
        else:
            return xx


def reps(secs,*args):
    ## solve the same problem repeatedly for roughly secs seconds
    start = time.time()
    while time.time() - start < secs:
        for i in range(100):
            mosekOptim(*args)


def main():
    with open("data.txt","r") as f:
        data = pickle.loads(f.read())

    args = (60,) + data

    ## one worker per core, each running reps for 60 seconds
    pool = multiprocessing.Pool()
    jobs = []
    for i in range(multiprocessing.cpu_count()):
        jobs.append(pool.apply_async(reps,args=args))
    pool.close()
    pool.join()


if __name__ == "__main__":
    main()
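The code unpickles data that I precomputed. These objects are the constraints and coefficients for the linear program. I have the code and this data file hosted in this repository.
Has anyone else experienced this behavior with MOSEK? Any suggestions for how to proceed?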
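I contacted support and they suggested setting MSK_IPAR_NUM_THREADS to 1. My problems take fractions of a second to solve, so it never looked like MOSEK was using multiple cores. I should have checked the documentation for the default values.
In my code, I added task.putintparam(mosek.iparam.num_threads,1) right after the with statement. This fixed the problem.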
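To make the placement concrete, here is a minimal sketch of the fix in isolation. The names solve_single_threaded and build_problem are hypothetical (build_problem stands in for the appendcons/putarow/putconbound calls from the question); only Env, Task, putintparam, optimize and getsolsta are taken from the code above. As far as I understand the MOSEK documentation, the default value of this parameter lets the solver pick the thread count from the number of cores it detects, which would explain the contention when 48 worker processes each do the same.

```python
def solve_single_threaded(build_problem):
    """Build and solve one problem with MOSEK limited to one thread.

    build_problem is a hypothetical callback that adds constraints,
    variables and the objective to the task it receives.
    """
    # Imported inside the function so this sketch can be loaded on a
    # machine without MOSEK installed.
    import mosek

    with mosek.Env() as env, env.Task() as task:
        # Set the limit before optimize(); the process pool already
        # supplies one worker per core.
        task.putintparam(mosek.iparam.num_threads, 1)
        build_problem(task)
        task.optimize()
        return task.getsolsta(mosek.soltype.itg)
```

Setting the parameter on every task matters here because the question's code creates a fresh Env and Task for each solve.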