Condor runs Python successfully, but doesn't show output files

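I'm new to HTCondor and I'm trying to run a Python script on a Condor system. I want to use cv2 and numpy in my code, and be able to read my print output and pickled data after the job finishes.

The code currently runs and completes (log file: return value 0). But condor_bin.out, where my print output should appear, is empty, and the file random_dat.pickle is not transferred back.

Am I doing something wrong?

Python script: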

import numpy as np
import pickle
import cv2 as cv

print('test')
# setup cv2
sift = cv.SIFT_create()
img = cv.imread("0.jpg", cv.IMREAD_GRAYSCALE)

for i in range(25):
    # calc cv2
    kp, des = sift.detectAndCompute(img, None)
    # calc np
    norms = np.linalg.norm(des, axis=1)

# calc normal? python
index = []
for p in kp:
    temp = (p.pt, p.size, p.angle, p.response, p.octave, p.class_id)
    index.append(temp)

with open('./random_dat.pickle', 'wb') as handle:
    pickle.dump((123456, index, des, norms), handle)
    
print("finished")

Condor submit file (test.info):

#Normal execution
Universe = vanilla

#I need just one CPU (which is the default)
RequestCpus    = 1
#No GPU
RequestGPUs    = 0
#I need this much disk space
RequestDisk = 150MB
#I need 150 MB of RAM (resident memory)
RequestMemory  = 150MB
#It will not run longer than 1 day
+RequestWalltime = 100

#retrieve data
#should_transfer_files = YES
#when_to_transfer_output = ON_EXIT

#I'm a nice person, I think...
NiceUser = true
#Mail me only if something is wrong
Notification = Always

# The job will 'cd' to this directory before starting, be sure you can _write_ here.
initialdir = /users/students/r0xxxxxx/Documents/testing_condor/
# This is the executable or script I want to run
executable = /users/students/r0xxxxxx/Documents/testing_condor/main.py

#Output of condors handling of the jobs, will be in 'initialdir'
Log          = condor_bin.log
#Standard output of the 'executable', in 'initialdir'
Output       = condor_bin.out
#Standard error of the 'executable', in 'initialdir'
Error        = condor_bin.err

# Start just 1 instance of the job
Queue 1

I submitted it with condor_submit test.info, which produced the following entries in condor_bin.log:

...
000 (356.000.000) 2021-07-15 18:23:28 Job submitted from host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xxx-xxxx&alias=abcdefg.abcd.abcdefg.be&noUDP&sock=schedd_2422_de78>
...
000 (357.000.000) 2021-07-15 18:24:19 Job submitted from host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xxx-xxxx&alias=abcdefg.abcd.abcdefg.be&noUDP&sock=schedd_2422_de78>
...
040 (356.000.000) 2021-07-15 18:24:21 Started transferring input files
    Transferring to host: <10.xx.xx.xx:xxxx?addrs=10.xx.xx.xx-xxxx&alias=other.abcd.abcdefg.be&noUDP&sock=slot1_1_123445_eb75_5374>
...
040 (356.000.000) 2021-07-15 18:24:21 Finished transferring input files
...
001 (356.000.000) 2021-07-15 18:24:22 Job executing on host: <10.xx.xx.xx:xxxx?addrs=10.xx.xx.xx-xxxx&alias=other.abcd.abcdefg.be&noUDP&sock=startd_2178_815c>
...
006 (356.000.000) 2021-07-15 18:24:22 Image size of job updated: 1
    0  -  MemoryUsage of job (MB)
    0  -  ResidentSetSize of job (KB)
...
040 (356.000.000) 2021-07-15 18:24:22 Started transferring output files
...
040 (356.000.000) 2021-07-15 18:24:22 Finished transferring output files
...
005 (356.000.000) 2021-07-15 18:24:22 Job terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
    0  -  Run Bytes Sent By Job
    803  -  Run Bytes Received By Job
    0  -  Total Bytes Sent By Job
    803  -  Total Bytes Received By Job
    Partitionable Resources :    Usage  Request Allocated 
       Cpus                 :                 1         1 
       Disk (KB)            :       13   153600    782129 
       Gpus (Average)       :                 0         0 
       Memory (MB)          :        0      150       256 

    Job terminated of its own accord at 2021-07-15T16:24:22Z.
...

As you can see in test.info, I have tried using

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

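but that did not help.

How can I see the print output, and how can I get the pickled data once the job has finished?

Thanks a lot for your help!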

Try starting the script with

#!/usr/bin/python
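
Because the submit file names main.py itself as the executable, the script needs an interpreter line on its first line. A minimal sketch of a check to run before submitting; the helper name read_shebang is hypothetical and not part of HTCondor:

```python
def read_shebang(script_path):
    """Return the interpreter line of a script, or None if it has none."""
    # Read in binary mode so Windows line endings are not hidden by newline translation.
    with open(script_path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    # Drop the "#!" prefix and surrounding whitespace (including a stray "\r").
    return first_line[2:].strip().decode("utf-8", errors="replace")
```

Calling read_shebang("main.py") on the script from the question would return None, since its first line is an import statement.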

Adding #!/usr/bin/python as suggested by @Greg resulted in the following error:

Executable file 'my_file/path' is a script with CRLF (DOS/Windows) line endings.
This generally doesn't work, and you should probably run 'dos2unix myfile/path' -- or a similar tool -- before you resubmit.
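
If dos2unix is not available, the same conversion can be approximated in Python. This is only a sketch: the function name crlf_to_lf is hypothetical, and it rewrites the file in place, so keep a backup if that matters.

```python
def crlf_to_lf(path):
    """Rewrite a file in place with LF line endings.

    Returns the number of CRLF sequences that were replaced.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    count = data.count(b"\r\n")
    if count:
        # Only touch the file when there is something to convert.
        with open(path, "wb") as handle:
            handle.write(data.replace(b"\r\n", b"\n"))
    return count
```

A return value of 0 means the file already used Unix line endings and was left untouched.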

I created a new Python file on the Linux system, with the following lines added at the top:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

With the should_transfer_files = YES and when_to_transfer_output = ON_EXIT settings enabled in the test.info Condor file, the job ran successfully on Condor.

TL;DR: Running Python code that was created on Windows may cause errors on a Condor system running Linux. Fix: write or copy your code into a Python file created on Linux.
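
Once random_dat.pickle has been transferred back to initialdir, it can be read on the submit machine. A minimal sketch, assuming the file holds the four-element tuple written by the script above; the function name load_results is hypothetical, and numpy must be installed wherever the file is loaded because des and norms are numpy arrays:

```python
import pickle

def load_results(path="random_dat.pickle"):
    """Load the tuple written by the job: (tag, keypoint tuples, descriptors, norms)."""
    with open(path, "rb") as handle:
        tag, index, des, norms = pickle.load(handle)
    return tag, index, des, norms
```

For example, tag, index, des, norms = load_results() followed by print(len(index)) would show how many keypoints were stored.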