How to find and print all AWK matches in bash?
I have a lot of text stored in a variable:
text="This is sentence! this is not sentence! This is sentence. this is not sencence."
I am finding sentences with this command:
echo $text | awk 'match($0,/([A-Z])([^!?.]*)([!?.])/) { print substr($0,RSTART,RLENGTH) }'
My output is:
This is sentence!
Expected output:
This is sentence!
This is sentence.
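More samples:
The text contains both grammatically correct and incorrect sentences. A correct sentence is identified by an uppercase letter at the beginning and a terminating character (.?!) at the end. I want to print only the correct sentences.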
text="incorrect sentence! this is not sentence! This is sentence. this is not sencence. This is correct sentence."
Expected output:
This is sentence.
This is correct sentence.
I can find the first match, but not all of them. Thanks for your help :)
You need while() and match():
$ echo $text | awk '
{
    while(match($0,/([A-Z])([^!?.]*)([!?.])/)) {  # while there are matches
        print substr($0,RSTART,RLENGTH)           # output them
        $0=substr($0,RSTART+RLENGTH)              # and move forward
    }
}'
Output:
This is sentence!
This is sentence.
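As a minimal sketch, the same while()/match() loop can be wrapped in a shell function and run against the second sample from the question. The function name print_sentences is hypothetical and only used for illustration; the regex is the one from the answer, with the unneeded capture groups dropped.

```shell
#!/bin/sh
# print_sentences is a hypothetical helper name, not part of the original answer.
print_sentences() {
    printf '%s\n' "$1" | awk '
    {
        # Keep matching against whatever is left of the current line.
        while (match($0, /[A-Z][^!?.]*[!?.]/)) {
            print substr($0, RSTART, RLENGTH)   # print the matched sentence
            $0 = substr($0, RSTART + RLENGTH)   # keep only the text after the match
        }
    }'
}

text="incorrect sentence! this is not sentence! This is sentence. this is not sencence. This is correct sentence."
print_sentences "$text"
```

Because each iteration discards everything up to the end of the previous match, the loop stops once no uppercase-initiated sentence remains on the line. The loop uses only match(), substr(), RSTART and RLENGTH, so it should not depend on GNU awk.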
With GNU awk, you can use a multi-char RS:
$ echo "$text" | awk -v RS='[A-Z][^!?.]*[!?.]' 'RT{print RT}'
This is sentence!
This is sentence.
Or GNU awk with FPAT:
$ echo "$text" | awk -v FPAT='[A-Z][^!?.]*[!?.]' '{for (i=1; i<=NF; i++) print $i}'
This is sentence!
This is sentence.
Or GNU grep with -o:
$ echo "$text" | grep -o '[A-Z][^!?.]*[!?.]'
This is sentence!
This is sentence.
If a sentence can contain a newline, only the first of these three (the multi-char RS approach) will work.
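To illustrate the newline case, here is a hedged sketch of the multi-char RS approach applied to a sentence that spans two lines. It assumes GNU awk is installed as gawk; the function name print_multiline_sentences and the sample text are made up for this example, and replacing the embedded newline with a space is an extra step not present in the original answer.

```shell
#!/bin/sh
# print_multiline_sentences is a hypothetical helper name; it requires GNU awk (gawk).
print_multiline_sentences() {
    printf '%s\n' "$1" | gawk -v RS='[A-Z][^!?.]*[!?.]' '
    RT {
        s = RT               # RT holds the text that matched RS
        gsub(/\n/, " ", s)   # join a sentence that was split across lines
        print s
    }'
}

# Sample text invented for this sketch: the first sentence contains a newline.
text='This is a sentence
split over two lines. this is not. Another one!'

if command -v gawk >/dev/null 2>&1; then
    print_multiline_sentences "$text"
fi
```

Here the regex is the record separator rather than a pattern applied to each line, so a match is free to cross line boundaries.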