wc -m seems to stop while loop in bash

Question

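I'm taking an introductory UNIX course, and part of it is bash scripting. I seem to have understood the concepts, but I can't work out what is going wrong in this particular problem.

I have a txt file containing one column of random usernames. That txt file is passed as an argument to my bash script, which ideally uses each username to fetch a page and count the number of characters on that page. If the page is fetched successfully, the character count is saved, together with the username, in a different txt file.

Here is the code: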

#!/bin/bash
filename=$1

while read username; do
    curl -fs "http://example.website.domain/$username/index.html"
    if [ $? -eq 0 ]
    then
        x=$(wc -m)
        echo "$username $x" > output.txt
    else
        echo "The page doesn't exist"
    fi
done < $filename

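The problem I'm facing is that after one successful fetch, it counts the characters, writes them to the file, then finishes the loop and exits the program. If I remove the "wc -m" part specifically, the code runs fine.

Q: Is this supposed to happen, and how should I work around it to achieve my goal? Or have I made a mistake somewhere else?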

Answer 1

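The wc program (like many other utilities you can find on Linux) by default expects its input on stdin (standard input) and writes its output to stdout (standard output).

In your case, you want wc to operate on the result of the curl call. You can do that by storing the output of curl in a variable and passing the contents of the variable to wc: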

data=$(curl -fs "http://example.website.domain/$username/index.html")
...
x=$(echo "$data" | wc -m)

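Alternatively, you can put the whole command in a single pipeline, which is probably better (although you may want set -o pipefail so that errors from curl are caught):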

x=$(curl -fs "http://example.website.domain/$username/index.html" | wc -m)

Otherwise, as @Dominique noted, your wc has no input of its own and will wait indefinitely until something arrives on stdin.

Answer 2

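The code shown doesn't do what you think it does (and what you claim in your question).

Your curl command fetches the web page and sends it to standard output: you are not keeping that data for later use. Then your wc has no arguments, so it starts reading from standard input. On stdin you have the list of usernames from $filename, so the number it computes is not the character count of the web page but the count of the remaining characters in the file. Once that has been counted, there is nothing left to read on stdin, so the loop ends because it has reached the end of the file.

You are looking for something like: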

#!/bin/bash
filename="$1"

set -o pipefail
rm -f output.txt
while read username; do
    x=$(curl -fs "http://example.website.domain/$username/index.html" | wc -m)
    if [ $? -eq 0 ]
    then
        echo "$username $x" >> output.txt
    else
        echo "The page doesn't exist"
    fi
done < "$filename"

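Here, the fetched page is fed directly to wc. By default you would not see a curl failure (the exit status of a pipeline is the exit status of its last command), so we use set -o pipefail to get the exit status of the rightmost command that exited with a non-zero status. Now you can check whether everything went well and, if so, write the result.

I also added an rm of the output file to make sure we don't grow an existing file, and changed the redirection of the output file to append, so that the file is not recreated on every iteration and left holding only the result of the last iteration (thanks to @tripleee for noticing this).

Update (by popular demand):

The pattern: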
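The difference pipefail makes can be seen in isolation. In this small sketch, false is an assumed stand-in for a failing curl, and run_pipeline is a hypothetical helper name used only for the demonstration.

```shell
#!/bin/bash
# run_pipeline is a hypothetical helper: false stands in for a failing curl,
# while wc -m always succeeds.
run_pipeline() {
    if false | wc -m > /dev/null; then
        echo "success"
    else
        echo "failure"
    fi
}

# Default behaviour: only the status of wc, the last command, is reported.
without_pipefail=$(run_pipeline)

# With pipefail, the failure of the first command is propagated.
set -o pipefail
with_pipefail=$(run_pipeline)
set +o pipefail

echo "without pipefail: $without_pipefail"
echo "with pipefail: $with_pipefail"
```

Without pipefail the pipeline is reported as a success even though its first command failed; with pipefail it is reported as a failure.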
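To illustrate the overwrite-versus-append point, here is a short sketch. The file names overwrite.txt and append.txt and the three usernames are assumptions made up for the demo.

```shell
#!/bin/bash
# Hypothetical demo files, kept in a temporary directory.
dir=$(mktemp -d)

for name in alice bob carol; do
    echo "$name" > "$dir/overwrite.txt"    # truncates the file on every iteration
    echo "$name" >> "$dir/append.txt"      # adds one line on every iteration
done

overwrite_lines=$(wc -l < "$dir/overwrite.txt")
append_lines=$(wc -l < "$dir/append.txt")
echo "overwrite.txt lines: $overwrite_lines"
echo "append.txt lines: $append_lines"
rm -rf "$dir"
```

The file written with > ends up holding only the last name, while the one written with >> holds all three.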

<cmd>
if [ $? -eq 0 ]...

is generally a bad idea. It is better to write:

if <cmd>...

So it would be better to switch to:

if x=$(curl -fs "http://example.website.domain/$username/index.html" | wc -m); then
    echo...
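A self-contained sketch of this pattern follows. It assumes a hypothetical fetch_page function in place of curl -fs, which prints a body for "alice" and fails for anyone else; the exit status 22 merely imitates a curl HTTP failure.

```shell
#!/bin/bash
# fetch_page is a hypothetical stand-in for curl -fs: it prints a page body
# for "alice" and fails for every other username.
fetch_page() {
    if [ "$1" = "alice" ]; then
        printf 'hello\n'
    else
        return 22
    fi
}

set -o pipefail
results=""
for username in alice bob; do
    # The assignment's exit status is that of the command substitution,
    # so the pipeline can be tested directly in the if.
    if x=$(fetch_page "$username" | wc -m); then
        results+="$username $x"$'\n'
    else
        results+="$username missing"$'\n'
    fi
done
printf '%s' "$results"
```

The status is tested at the point where the command runs, so no other command can slip in between and overwrite $?.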

Answer 3

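As others have already pointed out, a bare wc will "hang" because it expects you to provide input on standard input.

You seem to be looking for something like: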

#!/bin/bash
filename=$1
# Use read -r
while read -r username; do
    if page=$(curl -fs "http://example.website.domain/$username/index.html"); then
        # Feed the results from curl to wc
        x=$(wc -m <<<"$page")
        # Don't overwrite output file on every iteration
        echo "$username $x"
    else
        # Include parameter in error message; print to stderr
        echo "$0: The page for $username doesn't exist" >&2
    fi
# Note proper quoting
# Collect all output redirection here, too
done < "$filename" >output.txt
