How to split a string into bracketed and non-bracketed tokens in Python?
I have a string of words separated by spaces, where some runs of words are grouped together in parentheses. For example:
word1 word2 (word3 word4 word5) word6 (word7 word8) word9 word10 word11 (word12 word13 word14)
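I want to split it on spaces but treat each parenthesized group as a single token, so for the example above the result would be:
[word1, word2, (word3 word4 word5), word6, (word7 word8), word9, word10, word11, (word12 word13 word14)]
It does not matter whether the grouped words appear in the result list with or without their parentheses. What matters is that each parenthesized group counts as a single token. How can I do this?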
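I think you can write a regular expression that matches either anything inside parentheses (non-greedily, so each match stops at the first closing parenthesis) or a run of word characters. It might look like this: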
import re
s = 'word1 word2 (word3 word4 word5) word6 (word7 word8) word9 word10 word11 (word12 word13 word14)'
re.findall(r'(?:\(.*?\))|(?:\w+)', s)
Which will give you:
['word1',
'word2',
'(word3 word4 word5)',
'word6',
'(word7 word8)',
'word9',
'word10',
'word11',
'(word12 word13 word14)']
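Since the question says the parentheses themselves are optional in the result, here is a minimal sketch of a variant that can keep or drop them. The names `TOKEN_RE`, `split_tokens`, and `keep_parens` are made up for illustration and are not part of the answer above. It also swaps `\w+` for `[^\s()]+`, on the assumption that tokens such as `a-b` should stay whole; nested parentheses are not handled.

```python
import re

# Hypothetical helper, not from the original answer.
# Alternative 1 captures the text inside one pair of parentheses (no nesting).
# Alternative 2 captures a run of characters that are neither whitespace nor parentheses.
TOKEN_RE = re.compile(r'\(([^()]*)\)|([^\s()]+)')

def split_tokens(text, keep_parens=True):
    """Split text on whitespace, treating each parenthesized group as one token."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        group, word = match.group(1), match.group(2)
        if group is not None:
            # A parenthesized group: optionally put the parentheses back.
            tokens.append(f'({group})' if keep_parens else group)
        else:
            tokens.append(word)
    return tokens

s = 'word1 word2 (word3 word4 word5) word6 (word7 word8)'
print(split_tokens(s))
print(split_tokens(s, keep_parens=False))
```

With `keep_parens=False` the grouped words come back as plain strings such as `'word3 word4 word5'`, which may be more convenient if the groups are processed further.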