Grep: Capture just number

Question

I am trying to use grep to capture just the number in a string, but I am having difficulty.

echo "There are <strong>54</strong> cities" | grep -o "([0-9]+)"

How can I return just "54"? I have tried the grep command above, but it doesn't work.

echo "You have <strong>54</strong>" | grep -o '[0-9]' seems to sort of work, but it prints

5
4

instead of 54.

Answer 1

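You need to use the "E" option for extended regex support (or use egrep). Without it, grep treats + as a literal character, so '[0-9]+' does not mean "one or more digits". On my Mac OS X: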

$ echo "There are <strong>54</strong> cities" | grep -Eo "[0-9]+"
54

You also need to think about whether digits can occur more than once on a line. What should the behavior be in that case?
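To make the multiple-occurrence question concrete, here is a minimal sketch. The sample line and the variable names are made up for illustration; the only point is that grep -o prints each match on its own line, so you have to decide which one you want.

```shell
# Hypothetical sample line containing two separate runs of digits.
line='There are <strong>54</strong> cities and 12 towns'

# grep -o prints every match on its own line, in order of appearance.
all_numbers=$(printf '%s\n' "$line" | grep -Eo '[0-9]+')

# If only the first number is wanted, keep the first line of the output.
first_number=$(printf '%s\n' "$line" | grep -Eo '[0-9]+' | head -n 1)

printf '%s\n' "$all_numbers"
printf '%s\n' "$first_number"
```

If the numbers must come from a specific place in the line (for example, inside a tag), a plain digit match is not enough and the pattern needs some surrounding context.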

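EDIT 1: Since you have now specified that the requirement is a number between <strong> tags, I would recommend using sed. On my platform, grep doesn't have the "P" option for Perl-style regex. On my other box, the version of grep states that this is an experimental feature, so I'd go with sed in this case.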

$ echo "There are <strong>54</strong> 12 cities" | sed -rn 's/^.*<strong>\s*([0-9]+)\s*<\/strong>.*$/\1/p'
54

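Here "r" is for extended regex, "n" suppresses the default output, and the trailing "p" prints only the lines where the substitution succeeded.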
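The sed approach can also be written in a way that should work on both GNU and BSD sed. This is a sketch under a few assumptions: -E is used in place of -r (both GNU and BSD sed accept -E), [[:space:]] replaces \s (a GNU extension that other implementations may not recognize), and the function name strong_number is hypothetical.

```shell
# Hypothetical helper: print the number inside a <strong>...</strong> pair.
# Because the leading ".*" is greedy, the last matching pair on the line wins.
strong_number() {
    printf '%s\n' "$1" |
        sed -En 's/^.*<strong>[[:space:]]*([0-9]+)[[:space:]]*<\/strong>.*$/\1/p'
}

strong_number 'There are <strong>54</strong> 12 cities'
```

The stray 12 outside the tags is ignored, and a line with no <strong> pair produces no output at all because of -n together with the p flag.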

EDIT 2: If your version of grep has the "PCRE" option, you can also use the following, which relies on positive lookbehind and lookahead.

$  echo "There are <strong>54 </strong> 12 cities" | grep -o -P "(?<=<strong>)\s*([0-9]+)\s*(?=<\/strong>)"
54
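One detail of the lookaround version is that the \s* parts sit inside the match, so any whitespace around the digits is printed along with them. A possible variant, assuming a grep built with PCRE support (-P), uses \K to discard everything matched so far and moves the trailing whitespace into the lookahead. The function name strong_number_pcre is made up for this sketch.

```shell
# Hypothetical helper; requires a grep that supports -P (PCRE).
# \K drops "<strong>" and any leading whitespace from the reported match;
# the lookahead checks for the closing tag without consuming it.
strong_number_pcre() {
    printf '%s\n' "$1" | grep -oP '<strong>\s*\K[0-9]+(?=\s*</strong>)'
}

strong_number_pcre 'There are <strong>54 </strong> 12 cities'
```

With this pattern only the digits themselves are printed, and numbers outside the tags do not match.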

Answer 2

$ echo "There are <strong>54</strong> cities " |
    xmllint --html --xpath '//strong/text()' -

See "RegEx match open tags except XHTML self-contained tags".
