Pandas: extract text between multiple start words and multiple stop words
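Following up on an earlier question: can the solution also be extended to multiple start words?
The example below is illustrative and should not be taken literally: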
df
0 start_word1 text1 end_word1
1 start_word2 text2 end_word2
Expected output:
df
0 text1
1 text2
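You can use non-capturing groups to list the alternative start and stop words. Use a raw string so that \s reaches the regex engine unchanged:
df['COLUMN_NAME'].str.extract(r'(?:start_word1|start_word2)\s+(.*)\s+(?:end_word1|end_word2)')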
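The same idea can be sketched with the pattern built from word lists, which may be easier to maintain when there are many start or stop words. The column name `COLUMN_NAME`, the lists `start_words` and `stop_words`, and the sample rows are assumptions taken from the example above, not from real data. This sketch also uses a lazy `.*?` so the match stops at the first stop word, which differs from the greedy `.*` in the one-liner.

```python
import re

import pandas as pd

# Sample data mirroring the question (illustrative only).
df = pd.DataFrame(
    {"COLUMN_NAME": ["start_word1 text1 end_word1", "start_word2 text2 end_word2"]}
)

# Hypothetical word lists; replace with the real start/stop words.
start_words = ["start_word1", "start_word2"]
stop_words = ["end_word1", "end_word2"]

# Escape each word so regex metacharacters in it are matched literally,
# then join the words into alternations.
start = "|".join(map(re.escape, start_words))
stop = "|".join(map(re.escape, stop_words))

# Non-capturing groups for the delimiters, one capturing group for the text.
pattern = rf"(?:{start})\s+(.*?)\s+(?:{stop})"

# expand=False returns a Series because the pattern has a single capture group.
result = df["COLUMN_NAME"].str.extract(pattern, expand=False)
print(result.tolist())  # ['text1', 'text2']
```

Rows that contain no start/stop pair come back as NaN, so a `dropna()` or `fillna("")` afterwards may be needed depending on the data.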