Multiple reads from a txt file in bash (parallel processing)
Here is a simple bash script for HTTP status codes:
while read -r url
do
  urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "${url}" --max-time 5)
  echo "$url $urlstatus" >> urlstatus.txt
done < urls.txt   # urls.txt is a placeholder; the original file name was lost
I am reading URLs from a text file, but it processes only one at a time, which takes too much time. GNU parallel and xargs also process one line at a time (tested). How can I process the URLs simultaneously to improve the timing? In other words, I want threading over the URL file rather than over the bash commands (which is what GNU parallel and xargs do).
The input file is a txt file, with lines like:
ABC.Com
Bcd.Com
Any.Google.Com
Something like this
GNU parallel and xargs also process one line at a time (tested).
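A quick way to check whether several processes really run at once is to time stand-in jobs. This is a minimal sketch, using xargs -P with sleeps in place of the slow curl calls:

```shell
#!/bin/sh
# Sketch: xargs -P4 runs up to 4 jobs concurrently, so 8 half-second
# sleeps finish in roughly 1 second instead of 4.
start=$(date +%s)
printf '%s\n' 1 2 3 4 5 6 7 8 | xargs -n1 -P4 sh -c 'sleep 0.5' _
end=$(date +%s)
echo "elapsed: $((end - start))s"
```

If the elapsed time stays near the sequential total, the jobs are not actually running in parallel.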
Can you give an example? If you use -j
then you should be able to run multiple processes at a time.
I would do it like this:
doit() {
  url="$1"
  urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "${url}" --max-time 5)
  echo "$url $urlstatus"
}
export -f doit
cat urls.txt | parallel -j0 -k doit >> urlstatus.txt   # urls.txt stands in for the lost file name
Based on the input:
Input file is txt file and lines are separated as
ABC.Com
Bcd.Com
Any.Google.Com
Something like this
www.google.com
pi.dk
I get this output:
Input file is txt file and lines are separated as 000
ABC.Com 301
Bcd.Com 301
Any.Google.Com 000
Something like this 000
www.google.com 302
pi.dk 200
which looks correct:
000 if domain does not exist
301/302 for redirection
200 for success
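If installing GNU parallel is not an option, plain bash background jobs give a similar effect. A minimal sketch (check_urls is a made-up name; it reads URLs from stdin):

```shell
#!/bin/bash
# Sketch: run every curl in the background, then wait for all of them.
# Unlike parallel -k, output order is NOT preserved.
check_urls() {
  while read -r url; do
    {
      urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "$url" --max-time 5)
      echo "$url $urlstatus"
    } &
  done
  wait   # block until every background curl has finished
}
# Usage: check_urls < urls.txt >> urlstatus.txt
```

Note that this forks one process per line with no cap, so for very large URL files it can exhaust system resources; parallel -j0 manages the job count for you.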