How can I make this regex relative URL extraction work in grep?

Question

I have this string in a file and want to extract only the relative link:

<a href="/FreeCAD/FreeCAD-Bundle/releases/download/weekly-builds/FreeCAD_weekly-builds-28909-2022-05-20-conda-Linux-x86_64-py39.AppImage" rel="nofollow" data-skip-pjax>

This works on https://regexr.com/6m4vg :

/FreeCAD/[^]*AppImage

But grep returns nothing:

grep -E '/FreeCAD/\[^]*AppImage' somefile

How can I get it to work? Thanks.

Edit: the source file comes from:

wget https://github.com/FreeCAD/FreeCAD-Bundle/releases/tag/weekly-builds

Expected output:

/FreeCAD/FreeCAD-Bundle/releases/download/weekly-builds/FreeCAD_weekly-builds-28909-2022-05-20-conda-Linux-x86_64-py39.AppImage

Answer 1

You need to use [^"]* instead of [^]*:

grep -o '/FreeCAD/[^"]*AppImage' somefile

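/FreeCAD/[^]*AppImage works online because you tested the pattern against an ECMAScript engine, but grep -E uses the POSIX ERE regex flavor, where a negated bracket expression must not be empty: a ] that comes right after [^ is treated as a literal ] inside the brackets, not as the closing bracket.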
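The POSIX bracket rule can be checked with a tiny experiment. This is only a sketch; the input string a]b is made up for illustration and has nothing to do with the FreeCAD page:

```shell
# In a POSIX bracket expression, a ']' placed right after '[^' is a literal
# member of the set, so '[^]]' means "any character except ']'".
# The sample input 'a]b' is hypothetical.
result=$(printf 'a]b\n' | grep -oE '[^]]+')

# Prints the two runs of non-']' characters, one per line.
printf '%s\n' "$result"
```

Because the first ] is consumed as a set member, a pattern such as [^] never closes its bracket expression the way it does in ECMAScript.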

[^] matches any character (including line breaks) in ECMAScript regex, so here, since grep works line by line, you could replace it with .*.

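However, since the text you want to match cannot contain a " character, you can also use the more appropriate [^"]* pattern, which matches zero or more characters other than ".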
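To see why the quote-bounded pattern is the safer choice, here is a small comparison. It is a sketch under an assumption: the line below, with two links on it, is hypothetical and does not come from the real release page.

```shell
# Hypothetical line containing two links (not taken from the real page).
line='<a href="/FreeCAD/one.AppImage">1</a> <a href="/FreeCAD/two.AppImage">2</a>'

# .* is greedy and runs to the last "AppImage" on the line,
# so both links end up inside a single over-long match.
greedy=$(printf '%s\n' "$line" | grep -o '/FreeCAD/.*AppImage')

# [^"]* cannot cross the closing quote, so each link is matched separately.
bounded=$(printf '%s\n' "$line" | grep -o '/FreeCAD/[^"]*AppImage')

printf 'greedy:\n%s\n\nbounded:\n%s\n' "$greedy" "$bounded"
```

On the single-link line from the question both patterns give the same result; the difference only shows up when a line holds more than one matching href.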

regex

grep