Replace a phrase only if it appears at the beginning of a string
For example, "the ", "and ", "a ", "an ", "this " or "that " must be removed, tried in that order, but only if they are at the beginning of the string:
Input ---> "the computer is the machine in charge of data processing processes"
Output ---> "computer is the machine in charge of data processing processes"
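It is important that once I find the sentence starts with one of these words, I remove it and do not go on trying to remove the others.
In this example, the word "the " is detected at the beginning of the string, it is removed, and the remaining words are not tried.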
To conclude that nothing should be removed, all 6 options ("the ", "and ", "a ", "an ", "this " or "that ") must have been tried; if the input phrase does not start with any of them, assume that nothing should be removed.
I tried something like this, but the problem is that it runs every check instead of stopping at the first match.
input_phrase.replace("the ","")
input_phrase = "An airplane is an aircraft with a higher density than the air."
input_phrase = input_phrase.lower()
input_phrase = input_phrase.replace("the ","",1)
input_phrase = input_phrase.replace("and ","",1)
input_phrase = input_phrase.replace("a ","",1)
input_phrase = input_phrase.replace("an ","",1)
input_phrase = input_phrase.replace("this ","",1)
input_phrase = input_phrase.replace("that ","",1)
output_phrase = input_phrase
print(repr(output_phrase))
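The problem with this code is that it does not only remove a word at the beginning: each call removes the first occurrence of its word anywhere in the string. It also runs every .replace() call and does not stop after one of them has already matched.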
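To make the failure concrete, here is a small sketch that applies the same chained replace calls to the example input from the question. The function name chained_replace is only illustrative and is not part of the original attempt.

```python
def chained_replace(phrase):
    # Mirrors the attempt above: every replace runs, and each one removes
    # the first occurrence anywhere in the string, not just a leading one.
    phrase = phrase.lower()
    for word in ("the ", "and ", "a ", "an ", "this ", "that "):
        phrase = phrase.replace(word, "", 1)
    return phrase


expected = "computer is the machine in charge of data processing processes"
result = chained_replace(
    "the computer is the machine in charge of data processing processes"
)
print(repr(result))
print(result == expected)  # False: "a " was also cut out of "data "
```

Because "a " is a plain substring, it matches the end of "data " in the middle of the sentence, which is why a check anchored at the start of the string is needed.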
Here is a way to do it with a regular expression:
import re
input_phrase = "An airplane is an aircraft with a higher density than the air."
output_phrase = re.sub(r"^(the|and|a|an|this|that) ", '', input_phrase, flags=re.IGNORECASE)
print(output_phrase)
The re.IGNORECASE flag lets both An and an match.
^ asserts the position at the start of the string.
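As a rough sketch of how this pattern behaves on a few inputs, it can be wrapped in a small function. The names LEADING_WORD and strip_leading_word are made up for this illustration; the pattern itself is the one used above.

```python
import re

# Hypothetical names; the pattern is the same one shown in the answer.
LEADING_WORD = re.compile(r"^(the|and|a|an|this|that) ", flags=re.IGNORECASE)


def strip_leading_word(phrase):
    # Without re.MULTILINE, ^ can only match at the very start, so at most
    # one word is removed; count=1 just states that intent explicitly.
    return LEADING_WORD.sub("", phrase, count=1)


print(strip_leading_word("An airplane is an aircraft"))
print(strip_leading_word("the the computer"))
print(strip_leading_word("Therefore it works"))
```

The trailing space in the pattern is what keeps "Therefore" intact, and the second call shows that only the first leading word is removed.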
Without regular expressions, you can use startswith() and loop over the keywords.
input_phrase = "An airplane is an aircraft with a higher density than the air."
keywords = ["the ", "and ", "a ", "an ", "this ", "that "]
output_phrase = input_phrase
for word in keywords:
    if input_phrase.lower().startswith(word):
        output_phrase = input_phrase[len(word):]
        break
print(output_phrase)
break exits the for loop as soon as a keyword matches, so the remaining words are not checked.
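The same loop could also be packaged as a reusable function, where an early return plays the role of break. This is only a sketch; the names KEYWORDS and remove_leading_keyword are assumptions made for the example.

```python
# Hypothetical names, chosen for this illustration only.
KEYWORDS = ("the ", "and ", "a ", "an ", "this ", "that ")


def remove_leading_keyword(phrase, keywords=KEYWORDS):
    lowered = phrase.lower()
    for word in keywords:
        if lowered.startswith(word):
            # Return immediately: the first match wins, nothing else is tried.
            return phrase[len(word):]
    # All keywords were tried and none matched, so leave the phrase as it is.
    return phrase


print(remove_leading_keyword("An airplane is an aircraft"))
print(remove_leading_keyword("this and that"))
print(remove_leading_keyword("Computers are machines"))
```

Slicing the original phrase rather than the lowercased copy keeps the capitalization of the rest of the sentence.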
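input_phrase = "An airplane is an aircraft with a higher density than the air."
# Trailing spaces make sure only whole words match ("the " but not "therefore").
words = ("the ", "and ", "a ", "an ", "this ", "that ")
# str.startswith accepts a tuple, so a single call tests every prefix.
if input_phrase.lower().startswith(words):
    # Drop the first word and keep the rest of the phrase unchanged.
    output_phrase = input_phrase.split(" ", 1)[1]
else:
    # No keyword at the start: nothing is removed.
    output_phrase = input_phrase
print(output_phrase)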