Looking to Extract a Substring
I am looking for a way to extract one or more substrings from a piece of text.
I need to be able to extract #CoVid19 and #VaccineRecovery from the string below.
Significant milestone today. First day with no reported #CoVid19
deaths since March 21st. This is a day of hope. We will prevail #VaccineRecovery.
Basically, I need every substring that starts with "#" and runs up to the next whitespace. Each sentence may contain one or more hashtags.
The following seems to do the job. It splits the string into words and keeps the ones that start with #:
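data = 'Significant milestone today. First day with no reported #CoVid19 deaths since March 21st. This is a day of hope. We will prevail #VaccineRecovery.'
# split() with no arguments breaks on any run of whitespace;
# keep only the words whose first character is '#'
words = [x for x in data.split() if x and x[0] == '#']
print(words)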
Output:
['#CoVid19', '#VaccineRecovery.']
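Note that splitting on whitespace keeps the trailing period on the last tag. If the goal is the bare tags, one possible alternative is a regular expression. The sketch below assumes a hashtag is "#" followed by letters, digits, or underscores; that definition and the helper name extract_hashtags are my own assumptions, not something stated in the question.

```python
import re

# Same sample text as in the post.
data = ('Significant milestone today. First day with no reported #CoVid19 '
        'deaths since March 21st. This is a day of hope. We will prevail '
        '#VaccineRecovery.')

def extract_hashtags(text):
    """Return every '#' that is followed by one or more word characters.

    Hypothetical helper: assumes a tag ends at the first character that
    is not a letter, digit, or underscore.
    """
    return re.findall(r'#\w+', text)

print(extract_hashtags(data))  # ['#CoVid19', '#VaccineRecovery']
```

This pattern also matches a "#" in the middle of a word (for example the "#def" in "abc#def"), so it may need tightening if the input contains such text.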