Extract text between given set of words using Python

I went through various answers before posting. They are all regex-based and involve symbols and special characters.

Here is my input text and the expected output. I want to extract the text between 'Investment Objective' and 'Investment Policy'.

input_text:

"Investment Objective To provide long - term capital growth by investing primarily in a portfolio of African companies. Investment Policy"

output_text:

"To provide long - term capital growth by investing primarily in a portfolio of African companies."

Say the words you want to blacklist are:

black = ["Investment Objective","Investment Policy"]

Now remove them from the input:

for i in black:
    input_text = input_text.replace(i,'').strip()

This gives:

'To provide long        -  term capital growth by investing primarily in a portfolio of African companies.'
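Note that the result above still contains the runs of spaces from the input, while the expected output in the question has single spaces. A minimal sketch of one way to collapse them without regex, assuming the input is the spaced-out string used elsewhere in this thread (the names `cleaned` and `normalized` are just illustrative):

```python
input_text = (
    "Investment Objective    To provide long        -  term capital growth "
    "by investing primarily in a portfolio of African companies.  Investment Policy"
)
black = ["Investment Objective", "Investment Policy"]

# Remove every blacklisted phrase, as in the loop above.
cleaned = input_text
for marker in black:
    cleaned = cleaned.replace(marker, "")

# str.split() with no arguments splits on any run of whitespace and drops
# leading/trailing whitespace, so re-joining with one space normalizes it.
normalized = " ".join(cleaned.split())
print(f"'{normalized}'")
```

Keep in mind that `replace` only deletes the marker phrases themselves. If the input has other text before 'Investment Objective' or after 'Investment Policy', that text stays in the result.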

Alternative answer from Joshua:

input_text="Investment Objective    To provide long        -  term capital growth by investing primarily in a portfolio of African companies.  Investment Policy"

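# Position of the start marker (str.find returns -1 if it is not present)
start_str = "Investment Objective"
startpos = input_text.find(start_str)

# Position of the end marker
end_str = "Investment Policy"
endpos = input_text.find(end_str)

# Slice from just after the start marker up to the end marker
output_str = input_text[startpos + len(start_str):endpos]
output_str_nospaces = output_str.strip()

print(f"'{output_str}'")
print(f"'{output_str_nospaces}'")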

Prints:

'    To provide long        -  term capital growth by investing primarily in a portfolio of African companies.  '
'To provide long        -  term capital growth by investing primarily in a portfolio of African companies.'
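One thing to watch for with `find`: it returns -1 when a marker is missing, so the slice quietly returns the wrong text instead of failing. Here is a rough sketch of a variant that handles that case using `str.partition`. The function name `extract_between` and the choice to return `None` are my own assumptions, not something from the answers above:

```python
def extract_between(text, start_str, end_str):
    """Return the stripped text between the first start_str and the
    next end_str after it, or None if either marker is missing."""
    # partition returns (before, separator, after); the separator is
    # an empty string when it was not found.
    _, found_start, rest = text.partition(start_str)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end_str)
    if not found_end:
        return None
    return middle.strip()


input_text = (
    "Investment Objective    To provide long        -  term capital growth "
    "by investing primarily in a portfolio of African companies.  Investment Policy"
)

result = extract_between(input_text, "Investment Objective", "Investment Policy")
print(f"'{result}'")

# A missing marker yields None rather than a misleading slice.
print(extract_between(input_text, "Investment Strategy", "Investment Policy"))
```

Because the end marker is searched only in the text after the start marker, this also behaves sensibly if the end phrase happens to appear earlier in the input.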