How to simultaneously search for two possible quotation marks with regular expressions?
I want to extract the words in quotation marks if they are one or two words long. This works with the following code.
import re

mysentences = ['Kids, you "tried" your "best" and you failed miserably. The "lesson" is, "never try."',
               "Just because I don’t 'care' doesn’t mean I don’t understand."]
quotation = []
rx = r'"((?:\w+[ .]*){1,2})"'  # one or two words inside double quotes
for sentence in mysentences:
    quotation.append(re.findall(rx, sentence))
print(quotation)
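However, this does not get me 'care' from the second sentence, because the second sentence is delimited by double quotes and the word inside it uses single quotes. I can get it with the following pattern: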
r"'((?:\w+[ .]*){1,2})'"
The question is, how do I join the two conditions? With
rx = r'"((?:\w+[ .]*){1,2})"' or r"'((?:\w+[ .]*){1,2})'"
it only gets me the first mentioned condition.
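A minimal sketch of why this happens, assuming standard Python semantics: `or` is applied to the two string objects before `re` ever sees them, and it returns its first truthy operand. The names `double_rx` and `single_rx` are illustrative only and do not appear in the question.

```python
import re

# Illustrative names for the two patterns from the question.
double_rx = r'"((?:\w+[ .]*){1,2})"'
single_rx = r"'((?:\w+[ .]*){1,2})'"

# `or` returns the first truthy operand. A non-empty string is truthy,
# so the single-quote pattern is discarded here.
rx = double_rx or single_rx
print(rx == double_rx)

sentence = "Just because I don’t 'care' doesn’t mean I don’t understand."
# Only the double-quote pattern is applied, so nothing is found.
print(re.findall(rx, sentence))
```

The two conditions therefore have to be combined inside the pattern itself rather than with a Python operator.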
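With your current pattern, you can use a capturing group for the opening quote and a backreference (\1) to match the same single or double quote at the end.
The match will now be in the second capturing group.
(['"])((?:\w+[ .]*){1,2})\1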
Note that the repeated character class [ .]* could also match text such as: never try... ....
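A quick check of that remark, as a sketch: the test string below is made up for illustration, and the pattern is the backreference version of the original one.

```python
import re

# Original pattern with a captured opening quote and a backreference to it.
rx = r"(['\"])((?:\w+[ .]*){1,2})\1"

# Hypothetical input with runs of dots and spaces after the second word.
text = 'He said "never try... ...." and left.'

# Group 2 holds the quoted text; [ .]* absorbs every trailing dot and space.
matches = [m.group(2) for m in re.finditer(rx, text)]
print(matches)  # ['never try... ....']
```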
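If you want to match one or two words with an optional dot at the end, you can match 1+ word characters, followed by an optional group that matches 1+ spaces and 1+ word characters, followed by an optional dot.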
(['"])(\w+(?: +\w+)?\.?)\1
For example:
import re

mysentences = ['Kids, you "tried" your "best" and you failed miserably. The "lesson" is, "never try."',
               "Just because I don’t 'care' doesn’t mean I don’t understand."]
quotation = []
# Group 1 captures the opening quote; \1 requires the same quote to close it.
rx = r"(['\"])((?:\w+[ .]*){1,2})\1"
for sentence in mysentences:
    # findall returns (quote, text) tuples because the pattern has two groups.
    for m in re.findall(rx, sentence):
        quotation.append(m[1])
print(quotation)
Output:
['tried', 'best', 'lesson', 'never try.', 'care']
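The stricter pattern (one or two words with an optional trailing dot) can be tried the same way. This is only a sketch: the name `strict_rx` and the third sentence are additions for illustration, and `re.finditer` is used instead of `re.findall` so that group 2 can be read by number.

```python
import re

mysentences = [
    'Kids, you "tried" your "best" and you failed miserably. The "lesson" is, "never try."',
    "Just because I don’t 'care' doesn’t mean I don’t understand.",
    'He said "never try... ...." and left.',  # hypothetical extra sentence
]

# Opening quote, one or two words, an optional dot, then the same quote again.
strict_rx = re.compile(r"(['\"])(\w+(?: +\w+)?\.?)\1")

# Collect group 2 (the quoted text) from every match in every sentence.
quotation = [m.group(2) for s in mysentences for m in strict_rx.finditer(s)]
print(quotation)  # ['tried', 'best', 'lesson', 'never try.', 'care']
```

The third sentence contributes nothing, because the extra dots and spaces before the closing quote are no longer allowed.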