RegExp getting link from String except multiple www

When I try to extract links from a string, for example

"hello world https://www.sample.com/voices/2020/my-sound-www.sample.com"

I get multiple links, because the string contains www more than once. How can I exclude the extra one?

Output:

  1. https://www.sample.com/voices/2020/my-sound-www.sample.com
  2. www.sample.com

This output is wrong. It should be one link, not two:

https://www.sample.com/voices/2020/my-sound-www.sample.com

My regex pattern:

r"((https?:www\.)|(https?:\/\/)|(www\.))[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9]{1,6}(\/[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)?"

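You can use

// `test` is the input string that contains the link.
final reg = RegExp(r'''(?:https?:(?:\\?\/\\?\/|www\.)|www\.)[^\s<>"']*\.mp3''');
final m = reg.firstMatch(test);
print(m?.group(0)); // prints null when nothing matches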
// => https://www.caferilik.com/wp-content/uploads/2020/11/Anne-Baba-Biz-Suçluyuz-Muhafazakar-Ailelerde-Kuşak-Çatışması-Sesli-Kitap-www.caferilik.com_.mp3

The pattern here is

(?:https?:(?:\\?\/\\?\/|www\.)|www\.)[^\s<>"']*\.mp3

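Details:

  • (?:https?:(?:\\?\/\\?\/|www\.)|www\.) - http, followed by an optional s, then :, and then either // (with an optional \ before each /) or www.; alternatively, just www.
  • [^\s<>"']* - zero or more characters other than whitespace, <, >, " and '
  • \.mp3 - the literal string .mp3.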
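
The same idea can be tried on the sample string from the question, which does not end in .mp3. The following is only a sketch under one assumption that is not in the original answer: the trailing \.mp3 is dropped and a link is taken to run until the first whitespace, <, >, " or ' character. The names linkPattern and extractLinks are hypothetical helpers introduced for illustration.

```dart
// Assumption: a link ends at the first whitespace, <, >, " or ' character.
// The prefix part is the same as in the pattern above; only the .mp3 suffix
// is removed and * is changed to + so that a bare prefix is not matched.
final linkPattern =
    RegExp(r'''(?:https?:(?:\\?\/\\?\/|www\.)|www\.)[^\s<>"']+''');

// Hypothetical helper: returns every link found in the input, in order.
List<String> extractLinks(String input) =>
    linkPattern.allMatches(input).map((m) => m.group(0)!).toList();

void main() {
  const text =
      'hello world https://www.sample.com/voices/2020/my-sound-www.sample.com';
  final links = extractLinks(text);
  print(links.length); // 1
  print(links.first);
  // https://www.sample.com/voices/2020/my-sound-www.sample.com
}
```

Because the character class consumes everything up to the next delimiter, the second www.sample.com is already part of the first match, so allMatches cannot report it again as a separate link.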