Replace a phrase only if it appears at the beginning of a string

Question

For example, "the ", "and ", "a ", "an ", "this " or "that " should be tried for removal, in that order, but only if they appear at the beginning of the string:

Input--->"the computer is the machine in charge of data processing processes"

Output--->"computer is the machine in charge of data processing processes"

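Importantly, once I find that the sentence starts with one of these words, I remove it and do not try to remove any of the others. In this example, the word "the " is detected at the beginning of the string and removed, and the remaining words are not tried.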

To conclude that nothing should be removed, all 6 options ("the ", "and ", "a ", "an ", "this " or "that ") must have been tried first. If the input phrase does not start with any of them, nothing should be removed.

I tried something like this, but the problem is that it runs every check instead of stopping at the first match.

input_phrase.replace("the ","")

input_phrase = "An airplane is an aircraft with a higher density than the air."
input_phrase = input_phrase.lower()

input_phrase = input_phrase.replace("the ","",1)
input_phrase = input_phrase.replace("and ","",1)
input_phrase = input_phrase.replace("a ","",1)
input_phrase = input_phrase.replace("an ","",1)
input_phrase = input_phrase.replace("this ","",1)
input_phrase = input_phrase.replace("that ","",1)

output_phrase = input_phrase

print(repr(output_phrase))

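The problem with this code is that it does not restrict itself to a word at the beginning of the string: it removes the first occurrence wherever it appears. It also runs every .replace() call and does not stop once one of the matches has already been removed.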

Answer 1

Here is one way to do it with a regular expression:

import re

input_phrase = "An airplane is an aircraft with a higher density than the air."
output_phrase = re.sub(r"^(the|and|a|an|this|that) ", '', input_phrase, flags=re.IGNORECASE)
print(output_phrase)

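The re.IGNORECASE flag lets both An and an match.
^ asserts the position at the start of the string, so occurrences later in the phrase are left alone.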
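As a rough sketch of how the same pattern could be reused, the regular expression can be compiled once and wrapped in a small function. The name strip_leading_keyword is only an illustration and does not appear in the original answer; the keyword list is the one from the question.

```python
import re

# Hypothetical helper; the keyword list is taken from the question.
LEADING_WORD = re.compile(r"^(?:the|and|a|an|this|that) ", flags=re.IGNORECASE)

def strip_leading_keyword(phrase: str) -> str:
    """Remove at most one keyword, and only at the very start of the phrase."""
    # count=1 is redundant with ^ here, but it makes the "only once" intent explicit.
    return LEADING_WORD.sub("", phrase, count=1)

print(strip_leading_keyword("An airplane is an aircraft"))   # airplane is an aircraft
print(strip_leading_keyword("the computer is the machine"))  # computer is the machine
print(strip_leading_keyword("There is a machine"))           # There is a machine
```

Because the trailing space is part of the pattern, a word such as "There" is left untouched even though it begins with "the".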

Without regular expressions, you can use startswith() and loop over the keywords.

input_phrase = "An airplane is an aircraft with a higher density than the air."
keywords = ["the ", "and ", "a ", "an ", "this ", "that "]

output_phrase = input_phrase
for word in keywords:
    if input_phrase.lower().startswith(word):
        output_phrase = input_phrase[len(word):]
        break
print(output_phrase)

break exits the for loop as soon as a keyword matches, so the remaining words are never checked.
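The loop above could also be packaged as a function, which might make the "stop at the first match" rule easier to test. This is a minimal sketch; strip_first_keyword and KEYWORDS are names made up for the example.

```python
# Keywords from the question, each with its trailing space.
KEYWORDS = ["the ", "and ", "a ", "an ", "this ", "that "]

def strip_first_keyword(phrase, keywords=KEYWORDS):
    """Remove the first keyword that the phrase starts with, if any."""
    lowered = phrase.lower()
    # next() stops at the first keyword that matches; None means no match.
    match = next((w for w in keywords if lowered.startswith(w)), None)
    if match is None:
        return phrase
    # Slice the original phrase so the rest keeps its capitalization.
    return phrase[len(match):]

print(strip_first_keyword("This is a test"))   # is a test
print(strip_first_keyword("and the rest"))     # the rest
print(strip_first_keyword("Android phones"))   # Android phones
```

The second call shows that only "and " is removed: the "the " that then sits at the start is not touched, because the search ends after the first match.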

Answer 2

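input_phrase = "An airplane is an aircraft with a higher density than the air.".lower()

# Every keyword keeps its trailing space so that e.g. "there" does not match "the".
words = ["the ", "and ", "a ", "an ", "this ", "that "]

# Split into words, and drop the first one only if the phrase starts with a keyword.
tokens = input_phrase.split()
if any(input_phrase.startswith(word) for word in words):
    tokens = tokens[1:]

# Join with single spaces; this avoids a leading space in the result.
output_phrase = ' '.join(tokens)

print(output_phrase)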
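A variation on the split-based idea, sketched under the assumption that the original capitalization and inner spacing should be preserved: str.partition cuts the phrase at the first space only, so the rest of the string is returned unchanged. The names drop_leading_keyword and KEYWORDS are illustrative.

```python
# Keywords from the question, without trailing spaces because partition() removes the separator.
KEYWORDS = {"the", "and", "a", "an", "this", "that"}

def drop_leading_keyword(phrase):
    """Drop the first word if it is one of the keywords, ignoring case."""
    first, sep, rest = phrase.partition(" ")
    # sep is empty when the phrase has no space, i.e. it is a single word.
    if sep and first.lower() in KEYWORDS:
        return rest
    return phrase

print(drop_leading_keyword("An airplane is an aircraft with a higher density than the air."))
# airplane is an aircraft with a higher density than the air.
```

Since only the first word is ever examined, there is no loop to break out of, and the order of the keywords no longer matters.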

