Python subprocess won't run phantomjs, but the same command works on the Linux command line
When I run this on a CentOS 7 server, it works from bash:
[myserver]$ /home/phantomjs-2.1.1-linux-x86_64/bin/phantomjs /home/phantomjs-2.1.1-linux-x86_64/bin/thumbnails.js -3933029 91 q5975 "http://mysite/explore?viz=summary_slider"
Rendered 'http://mysite/explore?viz=summary_slider' at '/home/thumbnails/th-3933029c91q5975.png'
But if I do the same thing from Python using subprocess, I get an error:
import subprocess
phantomjs_call = u'{0}phantomjs {0}thumbnails.js {1}'.format(phantomjspath, link)
rendered = subprocess.check_output(phantomjs_call.split())
This returns:
/home/phantomjs-2.1.1-linux-x86_64/bin/phantomjs /home/phantomjs-2.1.1-linux-x86_64/bin/thumbnails.js "http://mysite/explore?viz=summary_checkbox"
Unable to render '"http://mysite/explore?viz=summary_checkbox"'
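Is there something odd about the subprocess arguments? Or is the shell environment wrong?
Next, I changed it to pass the whole string as a single argument, and then I got an OSError: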
rendered = subprocess.check_output(phantomjs_call)
# didn't split this into multiple arguments
>>> [Errno 2] No such file or directory
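On the question of whether the subprocess arguments are the odd part: the error message above shows the URL with its double quotes still attached, which suggests the quotes reached phantomjs as literal characters. In bash the shell strips them before the program starts; with an argument list and no shell, nothing does. The sketch below illustrates the difference between str.split() and shlex.split() on a shortened version of the command (the shortened command string is an assumption for illustration; the full paths from the question would behave the same way).

```python
import shlex

# Shortened stand-in for the command built in the question.
# The URL is wrapped in literal double quotes, as in the failing call.
command = 'phantomjs thumbnails.js "http://mysite/explore?viz=summary_checkbox"'

# str.split() only splits on whitespace, so the quote characters
# remain part of the last argument.
naive_args = command.split()

# shlex.split() applies POSIX shell quoting rules and removes the quotes,
# which is what bash does before launching the program.
shell_like_args = shlex.split(command)

print(naive_args[-1])
print(shell_like_args[-1])
```

Passing the shlex.split() result (or a hand-built list with an unquoted URL) to subprocess.check_output would likely avoid the "Unable to render" error. The OSError in the second attempt is consistent with this as well: without shell=True, a single string is treated as the name of the program to execute.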
How about this?
import subprocess
phantomjs_call = '{0}phantomjs {0}thumbnails.js {1}'.format(phantomjspath, link)
print(subprocess.check_output(phantomjs_call, shell=True))
Or:
import os
phantomjs_call = '{0}phantomjs {0}thumbnails.js {1}'.format(phantomjspath, link)
print(os.system(phantomjs_call))
So, after trying many different variations on subprocess, this is what finally worked with phantomjs: subprocess32!!!
import subprocess32  # not the default version; this backport supports timeouts

# Assumed to run inside a function (note the return below);
# phantomjspath, link_list and debug are defined elsewhere.
for (_id, link) in link_list:
    # two placeholders after the script name, matching the two values passed in
    phantomjs_call = u'{0}phantomjs {0}thumbnails.js {1} {2}'.format(phantomjspath, _id, link)
    """note: this generates a string like
    /home/phantomjs-2.1.1-linux-x86_64/bin/phantomjs
    /home/phantomjs-2.1.1-linux-x86_64/bin/thumbnails.js 51514
    "http://mysite/explore?viz=summary_text"
    """
    try:
        process = subprocess32.Popen(phantomjs_call, shell=True, stdout=subprocess32.PIPE)
    except Exception as e:
        print(phantomjs_call)
        print('Popen failed', e)
        continue  # no process was started, so skip to the next link
    # make sure phantomjs has time to download/process all the pages in the list
    # but if we get nothing after 180 sec, just move on
    try:
        output, errors = process.communicate(timeout=180)
    except Exception as e:
        if debug:
            print("\t\tException: %s" % e)
        process.kill()
        return "\t\tException: {0}".format(e)
    # output will be weird, decode to utf-8 to save heartache
    phantom_output = []
    for out_line in output.splitlines():
        phantom_output.append(out_line.decode('utf-8'))
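This is Python 2.7. It is probably easier in Python 3, but I'm keeping it here because it took me a lot of trial and error to get subprocess32 working with phantomjs.
Also: I haven't shared the thumbnails.js file, but the JavaScript parses the command-line input passed to phantomjs, accepting any number of URLs, and uses those arguments to build the file names.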
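For the note that Python 3 might make this easier: the standard library's subprocess.run accepts a timeout directly, so the subprocess32 backport should not be needed there. Below is a minimal sketch under that assumption. The helper name run_with_timeout is hypothetical, and the demo uses the Python interpreter as a stand-in command so it runs without phantomjs installed; a real call would presumably pass a list such as [phantomjspath + 'phantomjs', phantomjspath + 'thumbnails.js', str(_id), link] with the URL unquoted.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds=180):
    """Run argv (a list, no shell) and return decoded stdout lines.

    Returns None if the command does not finish within timeout_seconds.
    """
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,
            check=False,  # inspect the output ourselves instead of raising on non-zero exit
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising on timeout
        return None
    return completed.stdout.decode('utf-8').splitlines()


# Stand-in command (an assumption for the demo, not the phantomjs call itself).
demo_lines = run_with_timeout(
    [sys.executable, '-c', 'print("Rendered ok")'],
    timeout_seconds=30,
)
print(demo_lines)
```

Building the command as a list sidesteps shell quoting entirely, at the cost of losing shell features such as globbing and environment variable expansion.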