Hashtag followed by regular text

Question

I want to check whether a hashtag in a Python string is followed by regular text or by another hashtag. For example, for the case:

"my adjectives names #Day #Night which are in the description"

I would get False, because the first hashtag is followed by another hashtag. But in other cases, for example

"my adjectives names #Day which is in the description"

I would get True. How can I do this using regex operations in Python?

I tried:

import re
tweet_text = "my adjectives names #Day #Night which are in the description"
pattern = re.findall(r'\B#\w*[a-zA-Z0-9]+\B#\w*[a-zA-Z0-9]*', tweet_text)
print(pattern)

But it doesn't give me any output.

Answer 1

Interpreter example:

>>> import re
>>> pat = re.compile(r'(#\w+\s+){2,}')
>>>
>>> text = 'my adjectives names #Day  which are in the description'
>>> pat.search(text)
>>>
>>> text = 'my adjectives names #Day #Night which are in the description'
>>> pat.search(text)
<_sre.SRE_Match object; span=(20, 32), match='#Day #Night '>
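To get the True/False result the question asks for, the search above can be wrapped in a small helper. This is only a sketch: the function name followed_by_regular_text is made up for illustration, and because the pattern requires whitespace after every hashtag, two consecutive hashtags at the very end of the string would not be detected.

```python
import re

# Pattern from the interpreter example above: two or more hashtags in a row,
# each one followed by whitespace.
CONSECUTIVE_HASHTAGS = re.compile(r'(#\w+\s+){2,}')


def followed_by_regular_text(text):
    """Return True when the pattern finds no run of consecutive hashtags.

    Hypothetical helper name, not part of the original answer.
    """
    return CONSECUTIVE_HASHTAGS.search(text) is None


print(followed_by_regular_text('my adjectives names #Day which is in the description'))
print(followed_by_regular_text('my adjectives names #Day #Night which are in the description'))
```

The first call should print True and the second False, matching the two cases in the question.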

Answer 2

For hashtags that are not followed by another hashtag, use:

import re
input = "my adjectives names #Day #Night which are in the description"
matches = re.findall(r'#[^#\s]+\b(?!\s+#[^#]+)', input)
print(matches)

['#Night']

For hashtags that are followed by another hashtag, just replace the negative lookahead with a positive lookahead:

matches = re.findall(r'#[^#\s]+\b(?=\s+#[^#]+)', input)
print(matches)

['#Day']
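The two lookahead patterns can also be combined into one helper that sorts every hashtag into one of the two groups. This is a rough sketch: classify_hashtags and the dictionary keys are names invented here, and a hashtag at the very end of the string lands in the "followed_by_text" group simply because no hashtag follows it.

```python
import re

# Same two patterns as above, differing only in the lookahead.
FOLLOWED_BY_HASHTAG = re.compile(r'#[^#\s]+\b(?=\s+#[^#]+)')
NOT_FOLLOWED_BY_HASHTAG = re.compile(r'#[^#\s]+\b(?!\s+#[^#]+)')


def classify_hashtags(text):
    """Group hashtags by whether another hashtag comes right after them.

    Hypothetical helper, not part of the original answer.
    """
    return {
        'followed_by_hashtag': FOLLOWED_BY_HASHTAG.findall(text),
        'followed_by_text': NOT_FOLLOWED_BY_HASHTAG.findall(text),
    }


tweet = "my adjectives names #Day #Night which are in the description"
result = classify_hashtags(tweet)
print(result)

# The True/False check from the question: True when no hashtag
# is followed by another hashtag.
print(not result['followed_by_hashtag'])
```

For the tweet above, '#Day' should end up under 'followed_by_hashtag' and '#Night' under 'followed_by_text', so the final check prints False.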

python

regex

tweets