How do I use xargs to run a Python script in parallel on input read from a file?

I have a Python script that takes its input, a single URL, as a command-line argument:

from urllib.parse import urlparse
import sys
import asyncio

from wapitiCore.main.wapiti import Wapiti, logging


async def scan(url: str):
    wapiti = Wapiti(url)
    wapiti.set_max_scan_time(30)
    wapiti.set_max_links_per_page(20)
    wapiti.set_max_files_per_dir(10)

    wapiti.verbosity(2)
    wapiti.set_color()
    wapiti.set_timeout(20)
    wapiti.set_modules("xss")
    wapiti.set_bug_reporting(False)

    parts = urlparse(url)
    wapiti.set_output_file(f"/tmp/{parts.scheme}_{parts.netloc}.json")
    wapiti.set_report_generator_type("json")

    wapiti.set_attack_options({"timeout": 20, "level": 1})

    stop_event = asyncio.Event()
    await wapiti.init_persister()
    await wapiti.flush_session()
    await wapiti.browse(stop_event, parallelism=64)
    await wapiti.attack(stop_event)

if __name__ == "__main__":
    asyncio.run(scan(sys.argv[1]))

How can I use xargs to run this script in parallel on multiple URLs listed in a file?

urls.txt

https://jeboekindewinkel.nl/
https://www.codestudyblog.com/

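I believe a bash script like this would work. Note that it starts a scan for every URL at once, with no limit on the number of concurrent processes.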

while IFS= read -r line
do
    python scriptName.py "$line" &
done < urls.txt
wait  # block until every background scan has finished

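This runs your script once per line of urls.txt, in parallel. With -P0, xargs starts as many processes at once as it can; it does not cap them at the number of CPU cores. To set a limit, pass a number instead, for example -P4.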

cat urls.txt | xargs -L1 -P0 python script.py
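
If a fixed limit on concurrent processes is wanted from Python itself, roughly the same effect as xargs -L1 -P4 can be sketched with the standard library. This is only a minimal sketch, not part of the original answers: run_parallel and clean_lines are hypothetical helper names, and the default limit of 4 is an arbitrary assumption.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def clean_lines(lines):
    """Return the non-empty lines, stripped of surrounding whitespace."""
    return [line.strip() for line in lines if line.strip()]


def run_parallel(command, args, max_procs=4):
    """Run `command + [arg]` once per arg, at most max_procs at a time.

    Returns the child exit codes in the same order as args.
    """
    def run_one(arg):
        # Each worker thread only waits on one child process.
        return subprocess.run(command + [arg]).returncode

    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(run_one, args))
```

Assuming urls.txt and script.py are in the current directory, this could be called as run_parallel([sys.executable, "script.py"], clean_lines(open("urls.txt")), max_procs=4). Threads are enough here because the real work happens in the child processes.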

Reference (from the xargs man page):

-P maxprocs
    Parallel mode: run at most maxprocs invocations of utility at once.
    If maxprocs is set to 0, xargs will run as many processes as possible.

-L number
    Call utility for every number non-empty lines read.  A line ending with a
    space continues to the next non-empty line.  If EOF is reached and fewer
    lines have been read than number then utility will be called with the
    available lines.  The -L and -n options are mutually-exclusive; the last
    one given will be used.
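
To make the -L description concrete, the following sketch approximates which command lines xargs -L1 would build from urls.txt. It is an illustration under stated assumptions, not a reimplementation of xargs: xargs_invocations is a hypothetical name, shlex.split only approximates xargs quoting rules, and the trailing-space continuation rule is deliberately ignored.

```python
import shlex


def xargs_invocations(lines, command, per_call=1):
    """Approximate the argv lists that `xargs -L per_call command` would run."""
    batches = []
    current = []      # arguments collected for the next invocation
    lines_seen = 0    # non-empty lines collected so far in this batch
    for line in lines:
        if not line.strip():
            continue  # -L counts only non-empty lines
        current.extend(shlex.split(line))
        lines_seen += 1
        if lines_seen == per_call:
            batches.append(command + current)
            current, lines_seen = [], 0
    if current:
        # At EOF, a final batch with fewer than per_call lines is still run.
        batches.append(command + current)
    return batches
```

For the two URLs in urls.txt and the command ["python", "script.py"], this yields one invocation per URL, which is what -P then runs concurrently.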