regex with python / re.match doesn't work

Question

I have a string like this:

sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/> 
"""

I want to get:

cooking food blog 5 years

I have tried many different regexes, such as:

p = re.compile('<mainDescription description=\"([^\"]+)\"\/>')
print re.match(p, sText)

or

p = re.compile(ur'<mainDescription description="([^"]+)"\/>')

and also with (.+). According to regex101.com my regex should work, but it doesn't. I don't know why.

Answer 1

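It seems the problem is that you are using re.match() instead of re.search(). re.match() only matches at the beginning of the string, while re.search() looks for a match anywhere in it. This works: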

import re

sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/> 
"""
p = re.compile('<mainDescription description=\"([^\"]+)\"\/>')
print re.search(p, sText).group(1)

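By the way, you don't need to escape the double quotes (") when the string is delimited with single quotes ('), and the slash needs no escaping either. This is enough: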

re.search('<mainDescription description="([^"]+)"/>', sText)
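To make the difference concrete, here is a minimal sketch (written for Python 3, with a shortened copy of the sample text and the illustrative variable names s_text, at_start and anywhere, none of which come from the question). It runs the same compiled pattern through both methods:

```python
import re

# Shortened copy of the sample text; the tag we want is on the second line.
s_text = '''<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
'''
pattern = re.compile(r'<mainDescription description="([^"]+)"/>')

# match() only tries position 0, where the text is "<firstName ...", so it fails.
at_start = pattern.match(s_text)

# search() scans forward through the string and finds the tag on line two.
anywhere = pattern.search(s_text)

print(at_start)            # None
print(anywhere.group(1))   # cooking food blog 5 years
```

This would also explain the regex101.com result: a tester that looks for matches anywhere in the input behaves like re.search(), not like re.match().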

Answer 2

Try using findall():

print re.findall('<mainDescription description=\"([^\"]+)\"\/>', sText)

Output:

['cooking food blog 5 years']
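As a rough illustration of what findall() gives back (Python 3 syntax; the helper name find_descriptions and the second sample tag are made up for this example): with a single capture group it returns a list of the captured strings, one per match, and an empty list when nothing matches, so there is no None to check for.

```python
import re

def find_descriptions(text):
    """Return every mainDescription value found in text (possibly an empty list)."""
    return re.findall(r'<mainDescription description="([^"]+)"/>', text)

# Two matching tags: findall() returns one captured string per match.
sample = ('<mainDescription description="cooking food blog 5 years"/>\n'
          '<mainDescription description="travel notes"/>\n')
found = find_descriptions(sample)

# No matching tag: the result is an empty list rather than None.
missing = find_descriptions('<special description="G10X"/>')

print(found)    # ['cooking food blog 5 years', 'travel notes']
print(missing)  # []
```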

Answer 3

re.match (like re.search) returns a match object, from which you need to retrieve the group you want.

import re

sText ="""<firstName name="hello morning" id="2342"/>
<mainDescription description="cooking food blog 5 years"/>
<special description="G10X, U16X, U17X, G26X, C32X, G34X, G37X, U39X, C40X, G46X,C49X, U54X, U55X, A58X"/> 
"""
# Named group "description" captures the attribute value.
r = re.compile('<mainDescription description="(?P<description>[^"]+)"/>')
# search(), not match(): the tag is not at the start of sText, so match() would return None.
m = r.search(sText)
print(m.group('description'))

Note that groups can also be accessed by index (1 in this case; index 0 is the whole match), but I prefer to use the name.
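A short sketch of the different ways to read a group from a match object (Python 3 syntax; the shortened input and the variable names whole, by_index and by_name are illustrative only). It also checks for None first, since search() returns None when there is no match:

```python
import re

s_text = ('<firstName name="hello morning" id="2342"/>\n'
          '<mainDescription description="cooking food blog 5 years"/>\n')
r = re.compile(r'<mainDescription description="(?P<description>[^"]+)"/>')

m = r.search(s_text)
if m is not None:
    whole = m.group(0)                # the entire matched tag
    by_index = m.group(1)             # first capture group, by position
    by_name = m.group('description')  # same group, by its name
    print(whole)
    print(by_index)
    print(by_name)
```

Here by_index and by_name hold the same string, because the named group is also the first (and only) capture group in the pattern.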
