Find consecutive capitalized words in a string, including apostrophes
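I am using a regex to find runs of consecutive capitalized words, where some of the words contain an apostrophe, e.g. ("The mother-daughter bakery, Molly's Munchies, was founded in 2009"). I have written a few lines of code to do this: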
import re

string = "The mother-daughter bakery, Molly’s Munchies, was founded in 2009"
# The character class [a-z] does not include an apostrophe
test = re.findall(r"([A-Z][a-z]+(?=\s[A-Z])(?:\s[A-Z][a-z]+)+)", string)
print(test)
The problem is that I cannot get the result ('Molly's Munchies') to print.
Instead, my output is:
[]
Desired output:
("Molly's Munchies")
Any help is appreciated, thanks!
You can use this regex in Python:
r"\b[A-Z][a-z'’]*(?:\s+[A-Z][a-z'’]*)+"
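Regex details:
\b : word boundary
[A-Z] : matches one uppercase letter
[a-z'’]* : matches 0 or more characters that are lowercase letters, ' or ’
(?:\s+[A-Z][a-z'’]*)+ : matches 1 or more further capitalized words of the same form, each preceded by whitespace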
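As a quick check, here is a minimal sketch that applies the regex above to the sentence from the question and to one more sample sentence. The names `pattern`, `question_text` and `other_text` are only illustrative.

```python
import re

# The regex from this answer: a capitalized word followed by one or more
# further capitalized words; both ' and ’ are allowed inside a word.
pattern = re.compile(r"\b[A-Z][a-z'’]*(?:\s+[A-Z][a-z'’]*)+")

question_text = "The mother-daughter bakery, Molly’s Munchies, was founded in 2009"
other_text = "The Cow goes moo, and the Dog's Name is orange"

# There are no capturing groups, so findall returns the full matches.
question_matches = pattern.findall(question_text)
other_matches = pattern.findall(other_text)

print(question_matches)  # ['Molly’s Munchies']
print(other_matches)     # ['The Cow', "Dog's Name"]
```

Note that because of the `*` quantifier, a single capital letter such as "I" also counts as a word here; use `+` instead if a word should have at least one lowercase letter.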
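The apostrophe has to be added to the character class in both places where a "word" is defined. Adding it in only one place is not enough.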
import re

string = "The Cow goes moo, and the Dog's Name is orange"
# The apostrophe is added to both character classes (marked with v):
#                            v                           v
print(re.findall(r"([A-Z][a-z']+(?=\s[A-Z])(?:\s[A-Z][a-z']+)+)", string))
['The Cow', "Dog's Name"]
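One detail worth checking: the sentence in the question uses a typographic apostrophe (’), not the straight one ('). The sketch below compares the pattern above with a variant that also accepts ’. The names `straight_only` and `both_apostrophes` are made up for this illustration.

```python
import re

text = "The mother-daughter bakery, Molly’s Munchies, was founded in 2009"

# Pattern from above: only the straight apostrophe is allowed in a word.
straight_only = re.compile(r"([A-Z][a-z']+(?=\s[A-Z])(?:\s[A-Z][a-z']+)+)")

# Variant: both the straight and the typographic apostrophe are allowed.
both_apostrophes = re.compile(r"([A-Z][a-z'’]+(?=\s[A-Z])(?:\s[A-Z][a-z'’]+)+)")

straight_result = straight_only.findall(text)
both_result = both_apostrophes.findall(text)

print(straight_result)  # []
print(both_result)      # ['Molly’s Munchies']
```

So if the input text may contain either kind of apostrophe, it is probably safest to list both in each character class.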