Linux 有时找不到可执行文件
Linux Executables Sometimes Not Found
问题:
如何解决批处理脚本期间有时会消失的可执行文件?
背景:
我有一系列脚本,旨在使用 mbgrid 和 gmt surface 等工具解析 xyz 高程数据、划分为网格、合并和插值这些数据。这些可以手动调用,但通常 运行 作为使用配置文件指定网格大小、单元格分辨率和目录结构等内容的计划任务。
这些脚本是用 python 编写的,使用多处理并通过子进程进行系统调用。
问题:
有时,我会遇到找不到命令的情况
(例如猫 180E48S_mbgrid_output.log
/bin/sh:1:mbgrid:未找到)
由于这些脚本是通过带有配置文件的 cron 调用的,因此我希望得到一致的结果。大多数时候(而且似乎总是在我观看时,即使我只是 运行 脚本 cron 启动),代码 运行 成功完成。但有时不是。
下面是一个日志文件片段,显示未找到 mbgrid(粗体),后跟一个成功 运行。然后在另一个成功 运行.
之前找不到 mbgrid 的另一个实例
grep '/bin/sh: 1: mbgrid: not found' 22E59N_mbgrid_output.log -B 10 -A 10
/bin/sh: 1: mbgrid: not found
Zgrid starting iterations Zgrid iteration 10 convergence test:
-0.001000 last:0.000000 Zgrid iteration 20 convergence test: -0.001000 last:-0.001000 Zgrid iteration 30 convergence test: -0.001000
last:-0.001000 Zgrid iteration 40 convergence test: -0.001000
last:-0.001000 Zgrid iteration 50 convergence test: -0.001000
last:-0.001000 Zgrid iteration 60 convergence test: -0.001000
last:-0.001000 Zgrid iteration 70 convergence test: -0.001000
last:-0.001000 Zgrid iteration 80 convergence test: -0.001000
last:-0.001000 Zgrid iteration 90 convergence test: -0.001000
last:-0.001000
...截取日志...
mbm_grdplot -I22E59Nweightedgrid.grd -G1 -C -V -L"File
22E59Nweightedgrid.grd - Topography Grid:Topography (m)"
executing mbm_grdplot... mbm_grdplot -I22E59Nweightedgrid_num.grd -G1
-W1/2 -V -L"File 22E59Nweightedgrid_num.grd - Topography Grid:Number of Topography Data Points"
executing mbm_grdplot... mbm_grdplot -I22E59Nweightedgrid_sd.grd -G1
-W1/2 -V -L"File 22E59Nweightedgrid_sd.grd - Topography Grid:Topography Standard Deviation (m)"
Done.
/bin/sh: 1: mbgrid: not found
Zgrid starting iterations Zgrid iteration 10 convergence test:
-0.001000 last:0.000000 Zgrid iteration 20 convergence test: -0.001000 last:-0.001000 Zgrid iteration 30 convergence test: -0.001000
last:-0.001000 Zgrid iteration 40 convergence test: -0.001000
last:-0.001000 Zgrid iteration 50 convergence test: -0.001000
last:-0.001000 Zgrid iteration 60 convergence test: -0.001000
last:-0.001000 Zgrid iteration 70 convergence test: -0.001000
last:-0.001000 Zgrid iteration 80 convergence test: -0.001000
last:-0.001000 Zgrid iteration 90 convergence test: -0.001000
last:-0.001000
...截取日志...
采取的步骤:
当我第一次遇到这个时,我将它与一些特别大且密集的网格联系起来。我很快看到我有 10 个进程,每个进程都需要 20% 的内存。调低进程数似乎可以解决这个问题。
但是,我已经看到这种情况发生,即使它看起来不像是内存压力问题,但我的系统监控中可能遗漏了一些东西。
我设置了 sar 监控,每隔几分钟抓取一次系统统计信息。我还设置了一个 cron,每 5 分钟获取一些最重要的统计数据。我真的没有在这些文件中看到任何会导致 linux 资源匮乏以至于无法找到可执行文件的内容。我会把它们放在底部,因为这已经很长了。
我已经尝试添加一些日志记录诊断来发现问题。
这是检查 mbgrid 是否存在的代码片段,如果找到,则记录正在执行的确切命令 运行:
if (os.path.exists(r'/usr/local/bin/mbgrid') and os.path.isfile(r'/usr/local/bin/mbgrid')):
try:
logging.info("mbgrid -I"+gridcellname+"mbdatalist.mbf -O"+ \
weightedname+" -R"+str(wbound)+"/"+str(ebound)+"/"+str(sbound)+"/"+str(nbound)+ \
" -A2 -P1 -V -F1 -C1 -N -M -E"+str(cellsize)+"/0.0/degrees! >> " + gridcellname + \
"_mbgrid_output.log 2>&1")
subprocess.call("mbgrid -I"+gridcellname+"mbdatalist.mbf -O"+ \
weightedname+" -R"+str(wbound)+"/"+str(ebound)+"/"+str(sbound)+"/"+str(nbound)+ \
" -A2 -P1 -V -F1 -C1 -N -M -E"+str(cellsize)+"/0.0/degrees! >> " + gridcellname + \
"_mbgrid_output.log 2>&1",cwd=griddir,shell=True)
logging.info("mbgrid complete")
except:
logging.error("Error creating weighted grid for "+gridcellname)
sys.exit("Error creating weighted grid for "+gridcellname)
else:
logging.error("Unable to find the mbgrid executable at /usr/local/bin/mbgrid for " + gridcellname)
来自给定配置文件的主日志:
otto@Tendua:~/automated/NZ_TopoBathy_30m$ grep '180E48S_mbgrid' NZ_TopoBathy_30m_grids.log -A 3 -B 3
INFO:09/08/2017 21:35:06:clipping and reformatting grid for 162E50S
INFO:09/08/2017 21:35:06:simplifying 165E34S_ngdc.xyz by converting to grid and back to xyz
INFO:09/08/2017 21:35:06:run mbgrid using the datalist to create weighted grid
INFO:09/08/2017 21:35:06:mbgrid -I180E48Smbdatalist.mbf -O180E48Sweightedgrid -R179.7/181.3/-48.3/-46.7 -A2 -P1 -V -F1 -C1 -N -M -E0.000277777777778/0.0/degrees! >> 180E48S_mbgrid_output.log 2>&1
INFO:09/08/2017 21:35:06:grid converted to netcdf3 format
INFO:09/08/2017 21:35:06:run mbgrid using the datalist to create weighted grid
INFO:09/08/2017 21:35:06:mbgrid -I175E30Smbdatalist.mbf -O175E30Sweightedgrid -R174.7/176.3/-30.3/-28.7 -A2 -P1 -V -F1 -C1 -N -M -E0.000277777777778/0.0/degrees! >> 175E30S_mbgrid_output.log 2>&1
并且在该确切网格的日志文件中:
cat 180E48S_mbgrid_output.log
/bin/sh: 1: mbgrid: not found
当我在正确的目录中发出确切的命令时,它 运行 是预期的。
我还应该寻找什么来识别和解决这个问题?
非常感谢您的宝贵时间。
============================================= ========
前十个进程,运行 每 5 分钟一次(它还跟踪数据驱动器上的磁盘 space):
otto@Tendua:~$ cat top_ten_process_monitor.sh
#!/bin/sh
date
free -m | awk 'NR==2{printf "Memory Usage: %s/%sMB (%.2f%%)\n", ,,*100/ }'
df -h | awk '[=12=] ~ "tendua" {printf "Disk Usage: %1.1f/%1.1fTB (%s)\n", ,,}'
top -bn1 | grep load | awk '{printf "CPU Load: %.2f\n", $(NF-2)}'
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head
(日志部分来自上述事件的时间)
Fri Sep 8 21:26:01 MDT 2017
Memory Usage: 63865/64239MB (99.42%)
Disk Usage: 2.7/3.6TB (74%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 1.57
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
4019 3886 nautilus --new-window 0.1 0.0
Fri Sep 8 21:31:01 MDT 2017
Memory Usage: 63852/64239MB (99.40%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 1.57
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
4019 3886 nautilus --new-window 0.1 0.0
Fri Sep 8 21:36:01 MDT 2017
Memory Usage: 63796/64239MB (99.31%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 4.59
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
1873 1871 gdal_translate -ot Float32 0.1 0.1
Fri Sep 8 21:41:01 MDT 2017
Memory Usage: 63878/64239MB (99.44%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 4.32
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
5636 5634 /usr/bin/python /usr/bin/gd 0.1 25.6
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
系统统计如下:
sar -f /var/log/sysstat/sa08|head -4 && sar -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM CPU %user %nice %system %iowait %steal %idle
12:05:03 AM all 51.97 0.00 1.32 0.89 0.00 45.82
07:55:01 PM all 9.44 0.00 0.59 3.51 0.00 86.45
08:05:01 PM all 1.50 0.00 1.12 9.39 0.00 87.99
08:15:01 PM all 1.48 0.00 1.12 9.16 0.00 88.24
08:25:01 PM all 1.46 0.00 1.13 9.29 0.00 88.12
08:35:01 PM all 1.50 0.00 1.24 8.65 0.00 88.61
08:45:01 PM all 1.52 0.00 1.57 7.87 0.00 89.04
08:55:01 PM all 1.55 0.00 1.64 7.96 0.00 88.85
09:05:01 PM all 1.53 0.00 1.70 8.08 0.00 88.69
09:15:01 PM all 1.54 0.00 1.74 8.04 0.00 88.68
09:25:01 PM all 1.50 0.00 1.69 8.16 0.00 88.65
09:35:01 PM all 1.55 0.00 1.69 7.98 0.00 88.78
09:45:01 PM all 2.17 0.00 1.96 11.67 0.00 84.20
09:55:01 PM all 1.71 0.00 1.87 18.06 0.00 78.36
10:05:01 PM all 1.70 0.00 2.13 17.57 0.00 78.60
10:15:01 PM all 1.73 0.00 1.93 18.64 0.00 77.70
10:25:01 PM all 1.75 0.00 1.96 18.36 0.00 77.93
10:35:01 PM all 3.91 0.00 1.80 15.40 0.00 78.89
10:45:01 PM all 1.66 0.00 1.90 19.23 0.00 77.20
10:55:01 PM all 1.68 0.00 1.82 19.05 0.00 77.45
11:05:01 PM all 1.69 0.00 1.77 19.31 0.00 77.23
11:15:01 PM all 7.69 0.00 1.03 8.68 0.00 82.60
sar -r -f /var/log/sysstat/sa08|head -4 && sar -r -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
12:05:03 AM 3859704 61921864 94.13 50184 52583736 17699508 25.29 29962700 29837068 611932
07:55:01 PM 392740 65388828 99.40 407920 60884772 10851876 15.51 2703424 60625128 2853088
08:05:01 PM 402104 65379464 99.39 410904 61032736 10845076 15.50 3520252 59959668 1336572
08:15:01 PM 371832 65409736 99.43 413876 61069912 10859576 15.52 3403804 60108248 1220840
08:25:01 PM 397468 65384100 99.40 416592 61053400 10820532 15.46 1277516 62219572 3004488
08:35:01 PM 426124 65355444 99.35 419376 61037072 10808244 15.45 1069828 62416140 306880
08:45:01 PM 453688 65327880 99.31 423508 60966356 10864072 15.53 1070752 62351248 929620
08:55:01 PM 367620 65413948 99.44 427112 61029896 10804544 15.44 1078392 62411352 197616
09:05:01 PM 377924 65403644 99.43 430608 60967232 10794432 15.43 1082116 62345104 234916
09:15:01 PM 415260 65366308 99.37 433856 60842704 10822332 15.47 1099380 62217480 1687536
09:25:01 PM 452492 65329076 99.31 436928 60832668 10842884 15.50 1097624 62204328 1594256
09:35:01 PM 445220 65336348 99.32 460300 60819540 10855512 15.51 1137008 62187660 528
09:45:01 PM 470044 65311524 99.29 463480 60443220 10960588 15.66 24385076 38668700 1765864
09:55:01 PM 472608 65308960 99.28 464708 60422328 11067312 15.82 28572380 34443408 1358136
10:05:01 PM 433176 65348392 99.34 465852 60438212 10957232 15.66 29127204 33924120 4382320
10:15:01 PM 450192 65331376 99.32 467016 60405816 11024924 15.76 30376748 32652588 2063424
10:25:01 PM 376448 65405120 99.43 468388 60579636 10831104 15.48 31660420 31440712 704
10:35:01 PM 560628 65220940 99.15 469856 60329228 10979960 15.69 38914816 23997528 4796836
10:45:01 PM 521660 65259908 99.21 471004 60329408 11008184 15.73 41755472 21194232 3680356
10:55:01 PM 556596 65224972 99.15 472160 60329584 10921484 15.61 44714800 18191336 2351468
11:05:01 PM 543664 65237904 99.17 473288 60329768 11025832 15.76 46713556 16209272 3482536
11:15:01 PM 408040 65373528 99.38 475420 60500764 10935492 15.63 41843776 21182704 168384
sar -b -f /var/log/sysstat/sa08|head -4 && sar -b -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM tps rtps wtps bread/s bwrtn/s
12:05:03 AM 135.06 125.31 9.75 32088.68 15436.97
07:55:01 PM 214.60 187.68 26.92 47229.75 31671.99
08:05:01 PM 407.57 334.90 72.67 85024.47 91730.47
08:15:01 PM 411.46 341.30 70.16 86763.79 87368.85
08:25:01 PM 420.48 353.86 66.62 90036.21 84149.34
08:35:01 PM 480.24 394.76 85.48 100597.74 109374.29
08:45:01 PM 634.61 530.99 103.62 134987.46 132954.73
08:55:01 PM 683.04 567.68 115.36 144695.57 147487.06
09:05:01 PM 712.35 594.86 117.50 151571.92 151528.64
09:15:01 PM 732.52 613.41 119.10 156291.37 151223.34
09:25:01 PM 716.90 598.66 118.24 152659.05 152950.26
09:35:01 PM 719.66 594.02 125.65 147751.96 153455.47
09:45:01 PM 449.31 379.28 70.03 96426.44 106934.35
09:55:01 PM 249.01 48.08 200.93 12311.36 304792.29
10:05:01 PM 225.01 32.41 192.60 8286.13 310338.90
10:15:01 PM 220.01 15.86 204.15 4051.83 312400.63
10:25:01 PM 210.24 8.80 201.44 1863.78 308831.78
10:35:01 PM 199.95 23.89 176.05 5375.87 249054.33
10:45:01 PM 204.32 0.00 204.32 0.00 322379.98
10:55:01 PM 201.33 0.00 201.33 0.00 314086.77
11:05:01 PM 206.65 0.00 206.64 0.04 317881.94
11:15:01 PM 168.05 32.33 135.72 2041.40 151844.90
sar -w -f /var/log/sysstat/sa08|head -4 && sar -w -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM proc/s cswch/s
12:05:03 AM 1.45 800.71
07:55:01 PM 0.82 1116.96
08:05:01 PM 0.86 1584.54
08:15:01 PM 0.82 1530.11
08:25:01 PM 0.70 1543.24
08:35:01 PM 0.71 1630.97
08:45:01 PM 0.72 1942.89
08:55:01 PM 0.78 1988.69
09:05:01 PM 0.77 2030.87
09:15:01 PM 1.03 2061.84
09:25:01 PM 0.86 2027.47
09:35:01 PM 0.82 2062.62
09:45:01 PM 19.65 1789.45
09:55:01 PM 0.73 945.56
10:05:01 PM 0.53 901.46
10:15:01 PM 0.68 873.46
10:25:01 PM 0.76 793.85
10:35:01 PM 7.85 950.96
10:45:01 PM 0.74 869.87
10:55:01 PM 0.79 913.40
11:05:01 PM 0.78 894.85
11:15:01 PM 0.80 1225.52
Since these scripts are called via cron with a configuration file, I would expect consistent results. Most of the time (and it seems always when I'm watching, even if I just run the script cron initiates), the code runs successfully to completion. But sometimes not.
看来是环境问题。尝试使用手动 运行 和 cron 作业记录 os.environ。
要查找的变量是PATH
。
PATH is an environmental variable in Linux and other Unix-like
operating systems that tells the shell which directories to search for
executable files (i.e., ready-to-run programs) in response to commands
issued by a user.
如果我的猜想是正确的,您可以通过手动设置来解决这个问题
os.environ['PATH'] = "path to binaries dir"
然后将 os.environ 添加到子进程调用
的 cwd
参数
问题: 如何解决批处理脚本期间有时会消失的可执行文件?
背景: 我有一系列脚本,旨在使用 mbgrid 和 gmt surface 等工具解析 xyz 高程数据、划分为网格、合并和插值这些数据。这些可以手动调用,但通常 运行 作为使用配置文件指定网格大小、单元格分辨率和目录结构等内容的计划任务。
这些脚本是用 python 编写的,使用多处理并通过子进程进行系统调用。
问题: 有时,我会遇到找不到命令的情况 (例如猫 180E48S_mbgrid_output.log /bin/sh:1:mbgrid:未找到)
由于这些脚本是通过带有配置文件的 cron 调用的,因此我希望得到一致的结果。大多数时候(而且似乎总是在我观看时,即使我只是 运行 脚本 cron 启动),代码 运行 成功完成。但有时不是。
下面是一个日志文件片段,显示未找到 mbgrid(粗体),后跟一个成功 运行。然后在另一个成功 运行.
之前找不到 mbgrid 的另一个实例grep '/bin/sh: 1: mbgrid: not found' 22E59N_mbgrid_output.log -B 10 -A 10
/bin/sh: 1: mbgrid: not found
Zgrid starting iterations Zgrid iteration 10 convergence test: -0.001000 last:0.000000 Zgrid iteration 20 convergence test: -0.001000 last:-0.001000 Zgrid iteration 30 convergence test: -0.001000 last:-0.001000 Zgrid iteration 40 convergence test: -0.001000 last:-0.001000 Zgrid iteration 50 convergence test: -0.001000 last:-0.001000 Zgrid iteration 60 convergence test: -0.001000 last:-0.001000 Zgrid iteration 70 convergence test: -0.001000 last:-0.001000 Zgrid iteration 80 convergence test: -0.001000 last:-0.001000 Zgrid iteration 90 convergence test: -0.001000 last:-0.001000
...截取日志...
mbm_grdplot -I22E59Nweightedgrid.grd -G1 -C -V -L"File 22E59Nweightedgrid.grd - Topography Grid:Topography (m)"
executing mbm_grdplot... mbm_grdplot -I22E59Nweightedgrid_num.grd -G1 -W1/2 -V -L"File 22E59Nweightedgrid_num.grd - Topography Grid:Number of Topography Data Points"
executing mbm_grdplot... mbm_grdplot -I22E59Nweightedgrid_sd.grd -G1 -W1/2 -V -L"File 22E59Nweightedgrid_sd.grd - Topography Grid:Topography Standard Deviation (m)"
Done.
/bin/sh: 1: mbgrid: not found
Zgrid starting iterations Zgrid iteration 10 convergence test: -0.001000 last:0.000000 Zgrid iteration 20 convergence test: -0.001000 last:-0.001000 Zgrid iteration 30 convergence test: -0.001000 last:-0.001000 Zgrid iteration 40 convergence test: -0.001000 last:-0.001000 Zgrid iteration 50 convergence test: -0.001000 last:-0.001000 Zgrid iteration 60 convergence test: -0.001000 last:-0.001000 Zgrid iteration 70 convergence test: -0.001000 last:-0.001000 Zgrid iteration 80 convergence test: -0.001000 last:-0.001000 Zgrid iteration 90 convergence test: -0.001000 last:-0.001000
...截取日志...
采取的步骤: 当我第一次遇到这个时,我将它与一些特别大且密集的网格联系起来。我很快看到我有 10 个进程,每个进程都需要 20% 的内存。调低进程数似乎可以解决这个问题。
但是,我已经看到这种情况发生,即使它看起来不像是内存压力问题,但我的系统监控中可能遗漏了一些东西。
我设置了 sar 监控,每隔几分钟抓取一次系统统计信息。我还设置了一个 cron,每 5 分钟获取一些最重要的统计数据。我真的没有在这些文件中看到任何会导致 linux 资源匮乏以至于无法找到可执行文件的内容。我会把它们放在底部,因为这已经很长了。
我已经尝试添加一些日志记录诊断来发现问题。
这是检查 mbgrid 是否存在的代码片段,如果找到,则记录正在执行的确切命令 运行:
if (os.path.exists(r'/usr/local/bin/mbgrid') and os.path.isfile(r'/usr/local/bin/mbgrid')):
try:
logging.info("mbgrid -I"+gridcellname+"mbdatalist.mbf -O"+ \
weightedname+" -R"+str(wbound)+"/"+str(ebound)+"/"+str(sbound)+"/"+str(nbound)+ \
" -A2 -P1 -V -F1 -C1 -N -M -E"+str(cellsize)+"/0.0/degrees! >> " + gridcellname + \
"_mbgrid_output.log 2>&1")
subprocess.call("mbgrid -I"+gridcellname+"mbdatalist.mbf -O"+ \
weightedname+" -R"+str(wbound)+"/"+str(ebound)+"/"+str(sbound)+"/"+str(nbound)+ \
" -A2 -P1 -V -F1 -C1 -N -M -E"+str(cellsize)+"/0.0/degrees! >> " + gridcellname + \
"_mbgrid_output.log 2>&1",cwd=griddir,shell=True)
logging.info("mbgrid complete")
except:
logging.error("Error creating weighted grid for "+gridcellname)
sys.exit("Error creating weighted grid for "+gridcellname)
else:
logging.error("Unable to find the mbgrid executable at /usr/local/bin/mbgrid for " + gridcellname)
来自给定配置文件的主日志:
otto@Tendua:~/automated/NZ_TopoBathy_30m$ grep '180E48S_mbgrid' NZ_TopoBathy_30m_grids.log -A 3 -B 3
INFO:09/08/2017 21:35:06:clipping and reformatting grid for 162E50S
INFO:09/08/2017 21:35:06:simplifying 165E34S_ngdc.xyz by converting to grid and back to xyz
INFO:09/08/2017 21:35:06:run mbgrid using the datalist to create weighted grid
INFO:09/08/2017 21:35:06:mbgrid -I180E48Smbdatalist.mbf -O180E48Sweightedgrid -R179.7/181.3/-48.3/-46.7 -A2 -P1 -V -F1 -C1 -N -M -E0.000277777777778/0.0/degrees! >> 180E48S_mbgrid_output.log 2>&1
INFO:09/08/2017 21:35:06:grid converted to netcdf3 format
INFO:09/08/2017 21:35:06:run mbgrid using the datalist to create weighted grid
INFO:09/08/2017 21:35:06:mbgrid -I175E30Smbdatalist.mbf -O175E30Sweightedgrid -R174.7/176.3/-30.3/-28.7 -A2 -P1 -V -F1 -C1 -N -M -E0.000277777777778/0.0/degrees! >> 175E30S_mbgrid_output.log 2>&1
并且在该确切网格的日志文件中:
cat 180E48S_mbgrid_output.log
/bin/sh: 1: mbgrid: not found
当我在正确的目录中发出确切的命令时,它 运行 是预期的。
我还应该寻找什么来识别和解决这个问题?
非常感谢您的宝贵时间。
============================================= ========
前十个进程,运行 每 5 分钟一次(它还跟踪数据驱动器上的磁盘 space):
otto@Tendua:~$ cat top_ten_process_monitor.sh
#!/bin/sh
date
free -m | awk 'NR==2{printf "Memory Usage: %s/%sMB (%.2f%%)\n", ,,*100/ }'
df -h | awk '[=12=] ~ "tendua" {printf "Disk Usage: %1.1f/%1.1fTB (%s)\n", ,,}'
top -bn1 | grep load | awk '{printf "CPU Load: %.2f\n", $(NF-2)}'
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head
(日志部分来自上述事件的时间)
Fri Sep 8 21:26:01 MDT 2017
Memory Usage: 63865/64239MB (99.42%)
Disk Usage: 2.7/3.6TB (74%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 1.57
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
4019 3886 nautilus --new-window 0.1 0.0
Fri Sep 8 21:31:01 MDT 2017
Memory Usage: 63852/64239MB (99.40%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 1.57
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
4019 3886 nautilus --new-window 0.1 0.0
Fri Sep 8 21:36:01 MDT 2017
Memory Usage: 63796/64239MB (99.31%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 4.59
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
1873 1871 gdal_translate -ot Float32 0.1 0.1
Fri Sep 8 21:41:01 MDT 2017
Memory Usage: 63878/64239MB (99.44%)
Disk Usage: 2.7/3.6TB (75%)
Disk Usage: 2.5/3.6TB (68%)
Disk Usage: 2.3/3.6TB (64%)
Disk Usage: 4.3/5.5TB (83%)
CPU Load: 4.32
PID PPID CMD %MEM %CPU
3973 3886 dbus-daemon --fork --sessio 0.6 0.0
4085 3886 /usr/lib/unity/unity-panel- 0.6 0.0
4253 4079 compiz 0.3 0.4
11350 3886 /opt/google/chrome/chrome 0.3 2.1
18562 11366 /opt/google/chrome/chrome - 0.3 8.8
5636 5634 /usr/bin/python /usr/bin/gd 0.1 25.6
17402 1109 smbd -F 0.1 0.0
3213 1 /opt/google/chrome/chrome - 0.1 0.1
1587 1450 /usr/bin/X -core :0 -seat s 0.1 0.3
系统统计如下:
sar -f /var/log/sysstat/sa08|head -4 && sar -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM CPU %user %nice %system %iowait %steal %idle
12:05:03 AM all 51.97 0.00 1.32 0.89 0.00 45.82
07:55:01 PM all 9.44 0.00 0.59 3.51 0.00 86.45
08:05:01 PM all 1.50 0.00 1.12 9.39 0.00 87.99
08:15:01 PM all 1.48 0.00 1.12 9.16 0.00 88.24
08:25:01 PM all 1.46 0.00 1.13 9.29 0.00 88.12
08:35:01 PM all 1.50 0.00 1.24 8.65 0.00 88.61
08:45:01 PM all 1.52 0.00 1.57 7.87 0.00 89.04
08:55:01 PM all 1.55 0.00 1.64 7.96 0.00 88.85
09:05:01 PM all 1.53 0.00 1.70 8.08 0.00 88.69
09:15:01 PM all 1.54 0.00 1.74 8.04 0.00 88.68
09:25:01 PM all 1.50 0.00 1.69 8.16 0.00 88.65
09:35:01 PM all 1.55 0.00 1.69 7.98 0.00 88.78
09:45:01 PM all 2.17 0.00 1.96 11.67 0.00 84.20
09:55:01 PM all 1.71 0.00 1.87 18.06 0.00 78.36
10:05:01 PM all 1.70 0.00 2.13 17.57 0.00 78.60
10:15:01 PM all 1.73 0.00 1.93 18.64 0.00 77.70
10:25:01 PM all 1.75 0.00 1.96 18.36 0.00 77.93
10:35:01 PM all 3.91 0.00 1.80 15.40 0.00 78.89
10:45:01 PM all 1.66 0.00 1.90 19.23 0.00 77.20
10:55:01 PM all 1.68 0.00 1.82 19.05 0.00 77.45
11:05:01 PM all 1.69 0.00 1.77 19.31 0.00 77.23
11:15:01 PM all 7.69 0.00 1.03 8.68 0.00 82.60
sar -r -f /var/log/sysstat/sa08|head -4 && sar -r -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
12:05:03 AM 3859704 61921864 94.13 50184 52583736 17699508 25.29 29962700 29837068 611932
07:55:01 PM 392740 65388828 99.40 407920 60884772 10851876 15.51 2703424 60625128 2853088
08:05:01 PM 402104 65379464 99.39 410904 61032736 10845076 15.50 3520252 59959668 1336572
08:15:01 PM 371832 65409736 99.43 413876 61069912 10859576 15.52 3403804 60108248 1220840
08:25:01 PM 397468 65384100 99.40 416592 61053400 10820532 15.46 1277516 62219572 3004488
08:35:01 PM 426124 65355444 99.35 419376 61037072 10808244 15.45 1069828 62416140 306880
08:45:01 PM 453688 65327880 99.31 423508 60966356 10864072 15.53 1070752 62351248 929620
08:55:01 PM 367620 65413948 99.44 427112 61029896 10804544 15.44 1078392 62411352 197616
09:05:01 PM 377924 65403644 99.43 430608 60967232 10794432 15.43 1082116 62345104 234916
09:15:01 PM 415260 65366308 99.37 433856 60842704 10822332 15.47 1099380 62217480 1687536
09:25:01 PM 452492 65329076 99.31 436928 60832668 10842884 15.50 1097624 62204328 1594256
09:35:01 PM 445220 65336348 99.32 460300 60819540 10855512 15.51 1137008 62187660 528
09:45:01 PM 470044 65311524 99.29 463480 60443220 10960588 15.66 24385076 38668700 1765864
09:55:01 PM 472608 65308960 99.28 464708 60422328 11067312 15.82 28572380 34443408 1358136
10:05:01 PM 433176 65348392 99.34 465852 60438212 10957232 15.66 29127204 33924120 4382320
10:15:01 PM 450192 65331376 99.32 467016 60405816 11024924 15.76 30376748 32652588 2063424
10:25:01 PM 376448 65405120 99.43 468388 60579636 10831104 15.48 31660420 31440712 704
10:35:01 PM 560628 65220940 99.15 469856 60329228 10979960 15.69 38914816 23997528 4796836
10:45:01 PM 521660 65259908 99.21 471004 60329408 11008184 15.73 41755472 21194232 3680356
10:55:01 PM 556596 65224972 99.15 472160 60329584 10921484 15.61 44714800 18191336 2351468
11:05:01 PM 543664 65237904 99.17 473288 60329768 11025832 15.76 46713556 16209272 3482536
11:15:01 PM 408040 65373528 99.38 475420 60500764 10935492 15.63 41843776 21182704 168384
sar -b -f /var/log/sysstat/sa08|head -4 && sar -b -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM tps rtps wtps bread/s bwrtn/s
12:05:03 AM 135.06 125.31 9.75 32088.68 15436.97
07:55:01 PM 214.60 187.68 26.92 47229.75 31671.99
08:05:01 PM 407.57 334.90 72.67 85024.47 91730.47
08:15:01 PM 411.46 341.30 70.16 86763.79 87368.85
08:25:01 PM 420.48 353.86 66.62 90036.21 84149.34
08:35:01 PM 480.24 394.76 85.48 100597.74 109374.29
08:45:01 PM 634.61 530.99 103.62 134987.46 132954.73
08:55:01 PM 683.04 567.68 115.36 144695.57 147487.06
09:05:01 PM 712.35 594.86 117.50 151571.92 151528.64
09:15:01 PM 732.52 613.41 119.10 156291.37 151223.34
09:25:01 PM 716.90 598.66 118.24 152659.05 152950.26
09:35:01 PM 719.66 594.02 125.65 147751.96 153455.47
09:45:01 PM 449.31 379.28 70.03 96426.44 106934.35
09:55:01 PM 249.01 48.08 200.93 12311.36 304792.29
10:05:01 PM 225.01 32.41 192.60 8286.13 310338.90
10:15:01 PM 220.01 15.86 204.15 4051.83 312400.63
10:25:01 PM 210.24 8.80 201.44 1863.78 308831.78
10:35:01 PM 199.95 23.89 176.05 5375.87 249054.33
10:45:01 PM 204.32 0.00 204.32 0.00 322379.98
10:55:01 PM 201.33 0.00 201.33 0.00 314086.77
11:05:01 PM 206.65 0.00 206.64 0.04 317881.94
11:15:01 PM 168.05 32.33 135.72 2041.40 151844.90
sar -w -f /var/log/sysstat/sa08|head -4 && sar -w -f /var/log/sysstat/sa08|grep '09:35:01 PM' -A 10 -B 10
Linux 3.19.0-80-generic (Tendua) 09/08/2017 _x86_64_ (12 CPU)
12:00:02 AM proc/s cswch/s
12:05:03 AM 1.45 800.71
07:55:01 PM 0.82 1116.96
08:05:01 PM 0.86 1584.54
08:15:01 PM 0.82 1530.11
08:25:01 PM 0.70 1543.24
08:35:01 PM 0.71 1630.97
08:45:01 PM 0.72 1942.89
08:55:01 PM 0.78 1988.69
09:05:01 PM 0.77 2030.87
09:15:01 PM 1.03 2061.84
09:25:01 PM 0.86 2027.47
09:35:01 PM 0.82 2062.62
09:45:01 PM 19.65 1789.45
09:55:01 PM 0.73 945.56
10:05:01 PM 0.53 901.46
10:15:01 PM 0.68 873.46
10:25:01 PM 0.76 793.85
10:35:01 PM 7.85 950.96
10:45:01 PM 0.74 869.87
10:55:01 PM 0.79 913.40
11:05:01 PM 0.78 894.85
11:15:01 PM 0.80 1225.52
Since these scripts are called via cron with a configuration file, I would expect consistent results. Most of the time (and it seems always when I'm watching, even if I just run the script cron initiates), the code runs successfully to completion. But sometimes not.
看来是环境问题。尝试使用手动 运行 和 cron 作业记录 os.environ。
要查找的变量是PATH
。
PATH is an environmental variable in Linux and other Unix-like operating systems that tells the shell which directories to search for executable files (i.e., ready-to-run programs) in response to commands issued by a user.
如果我的猜想是正确的,您可以通过手动设置来解决这个问题
os.environ['PATH'] = "path to binaries dir"
然后将 os.environ 添加到子进程调用
cwd
参数