Replace a phrase only if it appears at the beginning of a character string

For example, the words "the ", "and ", "a ", "an ", "this ", "that " must be tried in that order, and removed only if they are at the beginning of the string:

Input --->"the computer is the machine in charge of data processing processes"

Output --->"computer is the machine in charge of data processing processes"

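The important thing is that once I find that the sentence starts with one of these words, I remove it and do not go on trying to remove the others. In this example, the word "the " is detected at the beginning of the string and removed, and the remaining words are not tried.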

To conclude that nothing should be removed, all 6 options ("the ", "and ", "a ", "an ", "this ", "that ") must have been tried; if the input phrase does not start with any of them, nothing is removed.

I tried something like this, but the problem is that it performs every check instead of stopping at the first match.

input_phrase.replace("the ","")

input_phrase = "An airplane is an aircraft with a higher density than the air."
input_phrase = input_phrase.lower()

input_phrase = input_phrase.replace("the ","",1)
input_phrase = input_phrase.replace("and ","",1)
input_phrase = input_phrase.replace("a ","",1)
input_phrase = input_phrase.replace("an ","",1)
input_phrase = input_phrase.replace("this ","",1)
input_phrase = input_phrase.replace("that ","",1)

output_phrase = input_phrase

print(repr(output_phrase))

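The problem with this code is that it does not remove the word only at the beginning: it removes the first occurrence anywhere in the string. It also runs every .replace() call and does not stop once one of them has already matched.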
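To make the failure concrete, here is a short sketch. The sample strings are made up for illustration and are not from the question; they show that str.replace with a count of 1 is not anchored to the start of the string, and that a chain of calls keeps removing words after the first hit.

```python
# Hypothetical sample phrase, chosen so that no keyword is at the start.
phrase = "computers process the data"

# str.replace(old, new, 1) removes the first occurrence wherever it is,
# so "the " is removed from the middle of the phrase.
result = phrase.replace("the ", "", 1)
print(repr(result))

# A chain of replace calls does not stop after the first successful removal.
chained = (
    "the cat and a dog"
    .replace("the ", "", 1)
    .replace("and ", "", 1)
    .replace("a ", "", 1)
)
print(repr(chained))
```

In both cases words are removed that the question wants to keep, which is why the check has to be anchored to the start and has to stop after the first match.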

Here is one way to do it with a regular expression:

import re

input_phrase = "An airplane is an aircraft with a higher density than the air."
output_phrase = re.sub(r"^(the|and|a|an|this|that) ", '', input_phrase, flags=re.IGNORECASE)
print(output_phrase)
  • The re.IGNORECASE flag lets the pattern match both "An" and "an".
  • ^ asserts the position at the start of the string.
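If the keywords are kept in a list, the same kind of pattern can be built from it. This is a minimal sketch, not part of the original answer; the names KEYWORDS, LEADING_KEYWORD and strip_leading_keyword are assumptions made for illustration.

```python
import re

# Keywords without trailing spaces; the space is added once in the pattern.
KEYWORDS = ["the", "and", "a", "an", "this", "that"]

# ^ anchors the match at the start of the string, re.escape protects any
# special characters in a keyword, and the trailing space ensures that
# "the" does not match the beginning of "there".
LEADING_KEYWORD = re.compile(
    r"^(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r") ",
    flags=re.IGNORECASE,
)

def strip_leading_keyword(phrase):
    """Remove at most one keyword, and only at the start of phrase."""
    return LEADING_KEYWORD.sub("", phrase, count=1)

print(strip_leading_keyword("An airplane is an aircraft"))
print(strip_leading_keyword("there is a cat"))
```

Because the pattern is anchored with ^, at most one substitution can happen, so the "stop after the first match" requirement is met without any explicit loop.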

Without regular expressions, you can use startswith() and loop over the keywords.

input_phrase = "An airplane is an aircraft with a higher density than the air."
keywords = ["the ", "and ", "a ", "an ", "this ", "that "]

output_phrase = input_phrase
for word in keywords:
    if input_phrase.lower().startswith(word):
        output_phrase = input_phrase[len(word):]
        break
print(output_phrase)
  • break exits the for loop so that no time is wasted checking the remaining words.
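The same loop can be wrapped in a function, where return plays the role of break. This is only a sketch under the question's stated rules; the function name remove_leading_keyword and its default keyword tuple are assumptions for illustration.

```python
def remove_leading_keyword(
    phrase,
    keywords=("the ", "and ", "a ", "an ", "this ", "that "),
):
    """Remove the first keyword that phrase starts with, if any."""
    # Compare in lowercase, but slice the original to preserve its case.
    lowered = phrase.lower()
    for word in keywords:
        if lowered.startswith(word):
            # Returning here stops the search after the first match.
            return phrase[len(word):]
    # No keyword matched, so nothing is removed.
    return phrase

print(remove_leading_keyword("the computer is the machine"))
print(remove_leading_keyword("The and a"))
print(remove_leading_keyword("Computers are fast"))
```

The second call removes only the leading "The " and leaves "and a" untouched, which is the one-removal behaviour the question asks for.
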
input_phrase = "An airplane is an aircraft with a higher density than the air.".lower()

# Every keyword ends with a space so that "the" cannot match "there" or "then".
words = ["the ", "and ", "a ", "an ", "this ", "that "]

output_phrase = input_phrase

# If the phrase starts with any keyword, drop its first word.
if any(input_phrase.startswith(word) for word in words):
    # Split once on the first space and keep the remainder unchanged.
    output_phrase = input_phrase.split(" ", 1)[1]

print(output_phrase)