Looking To Extract A Substring

Question

I am looking for a way to extract one or more substrings from a piece of text.

I need to be able to extract #CoVid19 and #VaccineRecovery from the string below.

Significant milestone today. First day with no reported #CoVid19 deaths since March 21st. This is a day of hope. We will prevail #VaccineRecovery.

Basically, I need every substring that starts with "#" and runs up to the next whitespace. Each sentence may contain one or more hashtags to extract.

Answer 1

The following seems to do the job. Split the string into words and keep the words that start with #.

data = 'Significant milestone today. First day with no reported #CoVid19 deaths since March 21st. This is a day of hope. We will prevail #VaccineRecovery.'
# split() breaks on whitespace and never yields empty strings, so startswith() is enough
words = [x for x in data.split() if x.startswith('#')]
print(words)

Output

['#CoVid19', '#VaccineRecovery.']
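
Note that the output above keeps the trailing period on '#VaccineRecovery.', because split() only breaks on whitespace. If the punctuation is not wanted, one possible alternative (not part of the original answer) is a regular expression. This is a minimal sketch that assumes a hashtag consists of "#" followed only by letters, digits, or underscores; the name `hashtags` is just illustrative.

```python
import re

data = ('Significant milestone today. First day with no reported #CoVid19 '
        'deaths since March 21st. This is a day of hope. We will prevail '
        '#VaccineRecovery.')

# '#' followed by one or more word characters (letters, digits, underscore).
# The match stops at the first non-word character, so the final '.' is excluded.
hashtags = re.findall(r'#\w+', data)
print(hashtags)  # ['#CoVid19', '#VaccineRecovery']
```

Unlike the split-based version, this pattern would also pick up a "#" that appears in the middle of a word (for example "abc#def" yields "#def"), so it may need tightening depending on the input.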

python

substring