Capitalize the first word of a sentence in a text
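I want to make sure that each sentence in a text starts with a capital letter.
For example, "we have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister. the good news is they tasted like chicken." should become
"We have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister. The good news is they tasted like chicken."
I tried using split() to split the sentences. Then I capitalized the first character of each line and appended the rest of the string to the capitalized character.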
text = input("Enter the text: \n")
lines = text.split('. ')  # Split the sentences

for line in lines:
    a = line[0].capitalize()  # capitalize the first character of the sentence
    for i in range(1, len(line)):
        a = a + line[i]
    print(a)
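I want: "We have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister. The good news is they tasted like chicken."
I get: "We have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister
The good news is they tasted like chicken."
split splits the string, and none of the resulting strings contain the delimiter, that is, the string or character you split by.
Change your code to: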
text = input("Enter the text: \n")
lines = text.split('. ') #Split the sentences
final_text = ". ".join([line[0].upper()+line[1:] for line in lines])
print(final_text)
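To see what happens to the delimiter, here is a small sketch. The sample string is made up for illustration and is not the asker's input.

```python
# Hypothetical sample text, used only to illustrate split/join behaviour.
sample = "first sentence. second sentence. third sentence."

parts = sample.split(". ")
# The ". " delimiter is gone from every piece; only the final "." survives,
# because it is not followed by a space.
print(parts)

# Joining with the same delimiter puts back exactly what split removed.
restored = ". ".join(parts)
print(restored == sample)
```

This is why joining with ". " after capitalizing each piece gives back a complete text.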
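When you split the string by ". ", the ". " is removed from the string and the remaining pieces are put into a list. You need to add the lost periods back to the sentences for this to work.
Also, this can leave the last sentence with two periods, because it ends with just "." rather than ". ". We need to remove the period at the end (if it exists) to make sure we don't get a double period.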
text = input("Enter the text: \n")
output = ""
if text.endswith('.'):
    # remove the last period to avoid double periods in the last sentence
    text = text[:-1]
lines = text.split('. ')  # Split the sentences

for line in lines:
    a = line[0].capitalize()  # capitalize the first character of the sentence
    for i in range(1, len(line)):
        a = a + line[i]
    a = a + '. '  # add back the removed period and space
    output = output + a
print(output.rstrip())  # drop the trailing space after the last sentence
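We can also make this solution cleaner:
text = input("Enter the text: \n")
output = ""
if text.endswith('.'):
    # remove the last period to avoid double periods in the last sentence
    text = text[:-1]
lines = text.split('. ')  # Split the sentences

for line in lines:
    # capitalize the first character, then add back the removed period and space
    a = line[0].capitalize() + line[1:] + '. '
    output = output + a
print(output.rstrip())  # drop the trailing space after the last sentence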
Using str[1:] gives you a copy of the string with the first character removed. Using str[:-1] gives you a copy of the string with the last character removed.
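A quick sketch of the two slices. The helper name upper_first is hypothetical and is not part of the answer above. It uses s[:1] instead of s[0] so that an empty string does not raise an IndexError.

```python
word = "hello."

without_first = word[1:]    # copy without the first character
without_last = word[:-1]    # copy without the last character
print(without_first, without_last)

# Slicing is forgiving: on an empty string it returns "" instead of raising,
# whereas indexing with ""[0] would raise IndexError.
def upper_first(s):
    """Hypothetical helper: uppercase only the first character of s."""
    return s[:1].upper() + s[1:]

print(repr(upper_first("")))
print(upper_first("the good news"))
```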
This code should work:
text = input("Enter the text: \n")
lines = text.split('. ')  # Split the sentences

for index, line in enumerate(lines):
    lines[index] = line[0].upper() + line[1:]
print(". ".join(lines))
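The error in your code is that str.split(chars) removes the splitting delimiter char, and that is why the period is removed.
Sorry for not providing a thorough description; I couldn't think of what to say. Feel free to ask in the comments.
EDIT: Let me try to explain what I did.
- Lines 1-2: Accepts the input and splits it by '. ' into a list. On the sample input, this gives: ['"We have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister', 'the good news is they tasted like chicken.']. Note that the period is gone from the first sentence, where the split happened.
- Line 4: enumerate returns an iterator that walks over an iterable and yields, for each item, a tuple containing the item's index and the item itself.
- Line 5: Replaces the entry for line in lines with the uppercase of its first character plus the rest of the line.
- Line 6: Prints the message. ". ".join(lines) basically reverses what the split did. str.join(l) takes an iterable of strings, l, and glues them together with str between all the items. Without this, you would be missing the periods.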
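The steps above can be tried on a tiny list to see what enumerate yields and what join does. The sample data is hypothetical.

```python
lines = ["alpha", "beta", "gamma"]  # hypothetical sample data

# enumerate yields (index, item) tuples.
pairs = list(enumerate(lines))
print(pairs)

# Use the index to write the capitalized string back into the list.
for index, line in enumerate(lines):
    lines[index] = line[0].upper() + line[1:]

# join places ". " between the items, not after the last one.
joined = ". ".join(lines)
print(joined)
```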
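The code below handles multiple sentence types (ending in ".", "!", "?", etc.) and capitalizes the first word of each sentence. Since you want to preserve existing capital letters, the capitalize function will not work, because it would lowercase the words that do not start a sentence. You can put a lambda function inside the list comprehension to call upper() on the first letter of each sentence, which leaves the rest of the sentence completely unchanged.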
import re
original_sentence = 'we have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister. the good news is they tasted like chicken.'
val = re.split('([.!?] *)', original_sentence)
new_sentence = ''.join([(lambda x: x[0].upper() + x[1:])(each) if len(each) > 1 else each for each in val])
print(new_sentence)
The "new_sentence" list comprehension is equivalent to:
sentence = []
for each in val:
    sentence.append((lambda x: x[0].upper() + x[1:])(each) if len(each) > 1 else each)
print(''.join(sentence))
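Two details in this answer are easy to check on a short string: what capitalize() does to the rest of a string, and what re.split returns when the pattern has a capture group. The sample string is hypothetical. The final line is a simplified variant of the comprehension above, not the answer's exact code.

```python
import re

sample = "the Prime Minister spoke. was it true? yes!"  # hypothetical sample

# str.capitalize() lowercases everything after the first character,
# so existing capitals such as "Prime Minister" are lost.
lowered = "the Prime Minister".capitalize()
print(lowered)

# With a capture group, re.split keeps the delimiters in the result list.
pieces = re.split(r"([.!?] *)", sample)
print(pieces)

# Uppercasing the first character of every piece is harmless for the
# delimiter pieces, because upper() leaves punctuation unchanged.
result = "".join(p[:1].upper() + p[1:] for p in pieces)
print(result)
```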
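You can use the re.sub function to replace every character that matches the pattern `. \w`, that is, a word character following a period and a space, with its uppercase equivalent.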
import re

original_sentence = 'we have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister. the good news is they tasted like chicken.'

def replacer(match_obj):
    return match_obj.group(0).upper()

# Replace the very first character, or any character following a dot and a space, with its uppercase version.
re.sub(r"(?<=\. )(\w)|^\w", replacer, original_sentence)
>>> 'We have good news and bad news about your emissaries to our world," the extraterrestrial ambassador informed the Prime Minister. The good news is they tasted like chicken.'
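The same idea can be stretched to sentences ending in "!" or "?" by widening the lookbehind. This is only a sketch under assumptions: the function name capitalize_sentences is hypothetical, sentences are assumed to be separated by exactly one space, and a text that begins with a quotation mark or contains abbreviations such as "e.g. " is not handled specially.

```python
import re

def capitalize_sentences(text):
    """Uppercase the first word character of the text and of every
    sentence that follows '.', '!' or '?' plus a single space."""
    # The lookbehind has a fixed width of two characters, as re requires.
    return re.sub(r"(?<=[.!?] )\w|^\w", lambda m: m.group(0).upper(), text)

print(capitalize_sentences("really? yes. good news!"))
```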