Regex match Telegram username and delete whole line in PHP

Question

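I want to match Telegram usernames in message text and delete the whole line. I tried the pattern below, but the problem is that it also matches email addresses: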

.*(@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*

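The pattern should match all of these lines:

Hi @username how are you?

Hello @username.how are you?

@username.

And it should not match email addresses like this one:

Hi, send an email to something@domain.com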
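
To reproduce the problem, here is a minimal sketch that runs the pattern above against each sample line with preg_match. The "/" delimiters and the helper name questionPatternMatches are assumptions added for illustration; they are not part of the original pattern.

```php
<?php
// Pattern from the question, wrapped in "/" delimiters (an assumption for this sketch).
$pattern = '/.*(@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*/';

// Hypothetical helper: true when the pattern matches somewhere in the given line.
function questionPatternMatches(string $pattern, string $line): bool
{
    return preg_match($pattern, $line) === 1;
}

$lines = [
    'Hi @username how are you?',
    'Hello @username.how are you?',
    '@username.',
    'Hi, send an email to something@domain.com',
];

foreach ($lines as $line) {
    // All four lines report "match"; the last one is the unwanted email case.
    printf("%-45s %s\n", $line, questionPatternMatches($pattern, $line) ? 'match' : 'no match');
}
```

The email line matches because nothing in the pattern constrains the character that comes before the @.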

Answer 1

.*[\W](@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*

I added a [\W] non-word character before the @ symbol. You can check the result here: https://regex101.com/r/yFGegO/1
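
As a rough check of this pattern in PHP, the sketch below tests each sample line on its own. The "/" delimiters and the helper name answerOneMatches are assumptions for illustration. One thing it suggests is worth keeping in mind: because [\W] has to consume a character, a string whose very first character is @ gives it nothing to match.

```php
<?php
// Pattern from this answer, wrapped in "/" delimiters (an assumption for this sketch).
$pattern = '/.*[\W](@(?=.{5,64}(?:\s|$))(?![_])(?!.*[_]{2})[a-zA-Z0-9_]+(?<![_.])).*/';

// Hypothetical helper: true when the pattern matches somewhere in the given line.
function answerOneMatches(string $pattern, string $line): bool
{
    return preg_match($pattern, $line) === 1;
}

$lines = [
    'Hi @username how are you?',                 // space before @, so [\W] can match it
    'Hello @username.how are you?',              // same situation
    'Hi, send an email to something@domain.com', // "g" before @ is a word character
    '@username.',                                // nothing before @ when tested alone
];

foreach ($lines as $line) {
    printf("%-45s %s\n", $line, answerOneMatches($pattern, $line) ? 'match' : 'no match');
}
```

Inside a longer multi-line message, [\W] can match the newline before a line that starts with @, so the result there may differ from this per-line check.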

Answer 2

Use

.*\B@(?=\w{5,32}\b)[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)*.*

See proof.

\B before @ means that @ must be preceded by a non-word character or by the start of the string.

Explanation

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  \B                       the boundary between two word chars (\w)
                           or two non-word chars (\W)
--------------------------------------------------------------------------------
  @                        '@'
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \w{5,32}                 word characters (a-z, A-Z, 0-9, _)
                             (between 5 and 32 times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                             and something that is not a word char
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  [a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to 'Z',
                           '0' to '9' (1 or more times (matching the
                           most amount possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    _                        '_'
--------------------------------------------------------------------------------
    [a-zA-Z0-9]+             any character of: 'a' to 'z', 'A' to
                             'Z', '0' to '9' (1 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
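
A minimal sketch of how this pattern might be applied with preg_replace to blank out the matching lines of a message. The "/" delimiters, the function name removeUsernameLines, and the sample message are assumptions for illustration.

```php
<?php
// Hypothetical helper: replaces every line that contains a username with an empty string.
function removeUsernameLines(string $text): string
{
    // Pattern from this answer, wrapped in "/" delimiters (an assumption for this sketch).
    // Without the "s" flag "." does not match a newline, so each match stays within one line.
    $pattern = '/.*\B@(?=\w{5,32}\b)[a-zA-Z0-9]+(?:_[a-zA-Z0-9]+)*.*/';

    return preg_replace($pattern, '', $text);
}

$message = implode("\n", [
    'Hi @username how are you?',
    'Hi, send an email to something@domain.com',
    '@username.',
    'plain line',
]);

// The username lines become empty, but their line breaks are still there.
var_dump(removeUsernameLines($message));
```

The email line is left alone because the "g" before its @ is a word character, so \B fails at that position.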

Answer 3

Nothing new under the sun, but basically the other patterns can be boiled down to:

.*?\B@\w{5}.*

demo

or, if need be:

.*?\B@\w{5,64}\b.*

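if you want to be more precise, but is that really necessary?

Note: if you also want to remove the newline sequence, add \R? at the end of the pattern.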
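
The note about \R? can be sketched like this, using the shorter pattern with \R? appended so that the line break goes away together with the line. The "/" delimiters, the function name stripUsernameLines, and the sample message are assumptions for illustration.

```php
<?php
// Hypothetical helper: deletes every line that contains a username, line break included.
function stripUsernameLines(string $text): string
{
    // Shorter pattern from this answer plus the optional \R? for the newline sequence,
    // wrapped in "/" delimiters (an assumption for this sketch).
    $pattern = '/.*?\B@\w{5}.*\R?/';

    return preg_replace($pattern, '', $text);
}

$message = implode("\n", [
    'Hi @username how are you?',
    'Hi, send an email to something@domain.com',
    '@username.',
    'plain line',
]);

// Only the email line and the plain line remain, with no empty lines in between.
echo stripUsernameLines($message), "\n";
```

Since \R? is optional, a username on the last line with no trailing newline is removed as well.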

Tags: php, regex, telegram-bot, php-telegram-bot