How to grep only the directories without any extensions in bash

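Suppose I have a list of URLs in a file named URL.txt, and I want to output only the directories, not files or anything with an extension such as .html or .php. If the script finds an extension or a file in a URL, it should move on to the next URL.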

- https://example.com/tradings/trade/trading?currency=usdt&dest=btc&tab=limit
- https://example.com/account/signup/accounts/signin/account.html

I want results like this:

- https://example.com/tradings/
- https://example.com/tradings/trade/
- https://example.com/account/
- https://example.com/account/signup/
- https://example.com/account/signup/accounts/
- https://example.com/account/signup/accounts/signin/

I tried this command, but it does not produce full URL endpoints. I want the full URL endpoints, without any extensions.

cat Urls.txt | rev | cut -d'/' -f 2 | sort -u | rev

Perl to the rescue!

perl -lne '@parts = split m{/}; print join "/", @parts[0 .. $_] for 3 .. $#parts - 1' < URL.txt
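  • -n reads the input line by line and runs the code for each line
  • -l removes the newline from each input line and adds one to each print
  • Each line is split on /. Then, for every index from 3 up to the second-to-last one, we join the parts from the first through that index, so the final part (the file or endpoint) is never printed.
  • See split and join for details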
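
For comparison, here is a rough plain-shell sketch of the same prefix-building idea that also keeps the trailing slash shown in the question. The function name dir_prefixes is made up for illustration, and it assumes every line looks like scheme://host/path and that the last path segment is always a file or endpoint to drop.

```shell
# Hypothetical helper: print every directory prefix of each URL on stdin.
# Assumption: input lines look like scheme://host/path/to/file
dir_prefixes() {
  while IFS= read -r url; do
    url=${url%%[?#]*}                    # drop any query string or fragment
    case $url in
      *://*/*) ;;                        # has a scheme, a host and a path
      *) continue ;;                     # anything else is skipped
    esac
    scheme=${url%%://*}
    rest=${url#*://}                     # host/path/to/file
    prefix="${scheme}://${rest%%/*}/"    # scheme://host/
    rest=${rest#*/}                      # path/to/file
    while :; do
      case $rest in
        */*) ;;                          # at least one more directory left
        *) break ;;                      # only the final segment remains
      esac
      prefix="${prefix}${rest%%/*}/"     # append the next directory
      rest=${rest#*/}
      printf '%s\n' "$prefix"
    done
  done
}

printf '%s\n' \
  'https://example.com/tradings/trade/trading?currency=usdt&dest=btc&tab=limit' \
  'https://example.com/account/signup/accounts/signin/account.html' |
  dir_prefixes
```

Piping the result through sort -u would remove duplicates when several URLs share the same directories.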

I would suggest using awk:

awk 'BEGIN{FS=OFS="/"}{$NF=""}!seen[$0]++' URLS.txt

Explanation:

# Set the input field separator (FS) and the
# output fields separator (OFS) to a forward slash /
BEGIN{
    FS=OFS="/"
}

{
    # NF is a special variable and contains the number of fields.
    # Therefore $NF is the last field. Assign an empty string to it
    $NF=""
}

# The variable 'seen' is an associative array, initialized on demand
# upon first usage. We are using it as a lookup to prevent printing
# the same url path twice.
!seen[$0]++
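
To see the effect on a small input, here is a quick sketch that wraps the same awk program in a function. The name parent_dirs and the sample URLs are only for illustration.

```shell
# Hypothetical wrapper around the awk program above.
parent_dirs() {
  awk 'BEGIN{FS=OFS="/"}{$NF=""}!seen[$0]++' "$@"
}

# Two files in the same directory: the directory is printed only once.
printf '%s\n' \
  'https://example.com/account/signup/accounts/signin/account.html' \
  'https://example.com/account/signup/accounts/signin/other.php' |
  parent_dirs
# prints: https://example.com/account/signup/accounts/signin/
```

Note that this prints only the immediate parent directory of each URL, with a trailing slash, not every intermediate directory.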

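PS: Your initial command almost works; only the cut invocation is wrong. You are using cut -f2, which prints only the second field, but you want cut -f2-, which prints everything from the second field through the last: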

rev Urls.txt  | cut -d'/' -f 2- | sort -u | rev
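
A small sketch of the difference on a single URL, in case it helps. The variable names are just for illustration, and it assumes rev (from util-linux) is available.

```shell
url='https://example.com/account/signup/accounts/signin/account.html'

# -f 2 keeps only the second field of the reversed line.
second_only=$(printf '%s\n' "$url" | rev | cut -d'/' -f 2 | rev)

# -f 2- keeps the second field and everything after it.
second_onward=$(printf '%s\n' "$url" | rev | cut -d'/' -f 2- | rev)

printf '%s\n' "$second_only" "$second_onward"
# prints:
# signin
# https://example.com/account/signup/accounts/signin
```

Because the line is reversed first, "the second field onward" is everything except the last path segment of the original URL.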

If you want to do it as a one-liner:

[gnm]awk 'BEGIN {OFS=FS="/"} (1<NF) && _==__[$(_^--NF)]++' 

Let me help decipher this:

  • awk errors out when you try to assign zero to NF, so (1 < NF) is a safety check. Shortening it to a plain $NF check has a pitfall: if the last column of the input looks like a numeric zero, the condition would inadvertently evaluate to false.
  • _ is a variable that is never initialized, so it is equivalent to 0/false. I write it this way because my shell scripts occasionally act up over the "!" character, which bash is too eager to expand.
  • __ is the seen array.
  • --NF drops the right-most column, i.e. the basename.
  • Since we have already ensured NF >= 2, --NF is at least 1, so $(_^--NF) evaluates to $(0) regardless of input, because zero raised to any non-zero power is zero.

The rest is the same as what others have already explained in detail above.
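
For readers who prefer something less golfed, here is a sketch of what I believe is an equivalent, spelled-out version. The function name unique_parents is made up, and it relies on awk rebuilding $0 when NF is decremented, which gawk and mawk do but which, as far as I know, is not guaranteed by every awk.

```shell
# Hypothetical, more readable equivalent of the one-liner above.
unique_parents() {
  awk 'BEGIN { FS = OFS = "/" }
       NF > 1 {                   # skip lines that contain no slash at all
         NF--                     # drop the last field; $0 is rebuilt with OFS
         if (!seen[$0]++) print   # print each rebuilt line only once
       }' "$@"
}

printf '%s\n' \
  'https://example.com/account/signup/accounts/signin/account.html' \
  'https://example.com/account/signup/accounts/signin/other.php' \
  'noslash' |
  unique_parents
```

Unlike the $NF="" variant, dropping the field leaves no trailing slash on the output.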