How to capitalize first letter in strings that may contain numbers

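I want to read through a file and use Python to capitalize the first letter of each string, but some strings may start with digits. Specifically, the file might look like this: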

"hello world"
"11hello world"
"66645world hello"

I want it to become this:

"Hello world"
"11Hello world"
"66645World hello"

I have tried the following, but it only capitalizes the letter when it is in the first position.

with open('input.txt') as input, open("output.txt", "a") as output:
    for line in input:
        output.write(line[0:1].upper()+line[1:-1].lower()+"\n")

Any suggestions? :-)

You can write a function with a for loop:

x = "hello world"
y = "11hello world"
z = "66645world hello"

def capper(mystr):
    for idx, i in enumerate(mystr):
        if not i.isdigit():  # or if i.isalpha()
            return ''.join(mystr[:idx] + mystr[idx:].capitalize())
    return mystr

print(list(map(capper, (x, y, z))))

['Hello world', '11Hello world', '66645World hello']

Might be worth a try...

>>> s = '11hello World'
>>> for i, c in enumerate(s):
...     if not c.isdigit():
...         break
... 
>>> s[:i] + s[i:].capitalize()
'11Hello world'

You can find the first alphabetic character and capitalize it, like this:

with open("input.txt") as in_file, open("output.txt", "w") as out_file:
    for line in in_file:
        pos = next((i for i, e in enumerate(line) if e.isalpha()), 0)
        line = line[:pos] + line[pos].upper() + line[pos + 1:]
        out_file.write(line)

Which outputs:

Hello world
11Hello world
66645World hello

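You can use a regular expression to find the position of the first letter and then call upper() on the character at that index. Something like this should work: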

import re

s =  "66645hello world"
m = re.search(r'[a-zA-Z]', s)
index = m.start()
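
The snippet above stops once the index is known. A minimal sketch of the remaining step could look like the following; the helper name upper_first_letter and the guard for strings without any ASCII letter are my own assumptions, not part of the answer:

```python
import re

def upper_first_letter(s):
    # Hypothetical helper: upper-case the first ASCII letter found in s.
    m = re.search(r'[a-zA-Z]', s)
    if m is None:
        # No ASCII letter at all (e.g. "12345"): return the string unchanged.
        return s
    index = m.start()
    return s[:index] + s[index].upper() + s[index + 1:]

print(upper_first_letter("66645hello world"))  # 66645Hello world
```

Only the matched character is touched, so the case of the rest of the string is preserved.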

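Using a regular expression:

import re

with open("input.txt") as in_file, open("output.txt", "w") as out_file:
    for line in in_file:
        m = re.search('[a-zA-Z]', line)
        if m is not None:
            index = m.start()
            line = line[:index] + line[index].upper() + line[index + 1:]
        out_file.write(line)  # lines without any letter are written unchanged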

How about this?

import re

text = "1234hello"
index = re.search("[a-zA-Z]", text).start()
text_list = list(text)
text_list[index] = text_list[index].upper()

''.join(text_list)

The result is: 1234Hello

Like this, for example:

import re

re_numstart = re.compile(r'^([0-9]*)(.*)')

def capfirst(s):
    ma = re_numstart.match(s)
    return ma.group(1) + ma.group(2).capitalize()

Try this:

with open('input.txt') as input, open("output.txt", "a") as output:
    for line in input:
        t_line = ""
        for c in line:
            if c.isalpha():
                t_line += c.capitalize()
                t_line += line[line.index(c)+1:]
                break
            else:
                t_line += c
        output.write(t_line)

Result:

Hello world
11Hello world
66645World hello

There may be a one-line regex way to do this, but using title() should also work:

def capitalise_first_letter(s):
    spl = s.split()
    return spl[0].title() + ' ' + ' '.join(spl[1:])

s = ['123hello world',
"hello world",
"11hello world",
"66645world hello"]


for i in s:
    print(capitalise_first_letter(i))

This produces:

123Hello world
Hello world
11Hello world
66645World hello
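
As for the one-line regex approach mentioned above, one possible sketch uses re.sub with a function as the replacement and count=1, so that only the first match is changed. The name cap_first is hypothetical, and the pattern assumes that only ASCII letters matter:

```python
import re

def cap_first(s):
    # count=1 limits the substitution to the first matching letter only.
    return re.sub(r'[a-zA-Z]', lambda m: m.group().upper(), s, count=1)

for s in ["hello world", "11hello world", "66645world hello"]:
    print(cap_first(s))
# Hello world
# 11Hello world
# 66645World hello
```

Unlike title(), this leaves the case of every other character alone.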

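You can use a regular expression:

import re

line = "66645world hello"

regex = re.compile(r'\D')   # matches the first non-digit character
tofind = regex.search(line)
pos = tofind.end()          # index just past that character

# Upper-case everything up to and including the first non-digit, lower-case the rest
line = line[:pos].upper() + line[pos:].lower()

print(line)

Output: 66645World hello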

OK, there are already a lot of answers, and they should all work.

I find them overly complicated, though...

Here is a simpler solution:

for s in ("hello world", "11hello world", "66645world hello"):
    first_letter = next(c for c in s if not c.isdigit())
    print(s.replace(first_letter, first_letter.upper(), 1))

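The title() method capitalizes the first alphabetic character of each word and ignores any digits before it. Unlike the regex approaches that use [a-zA-Z], it also works for non-ASCII characters.

From the docs: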

str.title()

Return a titlecased version of the string where words start with an uppercase character and the remaining characters are lowercase. [...] The algorithm uses a simple language-independent definition of a word as groups of consecutive letters. The definition works in many contexts but it means that apostrophes in contractions and possessives form word boundaries, which may not be the desired result:
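
To make the quoted behaviour concrete, here is a small sketch that uses only the built-in str.title(); the sample strings are my own choice:

```python
samples = ["11hello world", "123ça marche!", "234i'm good"]
titled = [s.title() for s in samples]

for original, result in zip(samples, titled):
    print(repr(original), "->", repr(result))
# '11hello world' -> '11Hello World'
# '123ça marche!' -> '123Ça Marche!'
# "234i'm good" -> "234I'M Good"
```

Every word gets capitalized, and the apostrophe starts a new "word", which is why title() cannot simply be applied to the whole line here.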

We can take advantage of it like this:

def my_capitalize(s):
    first, rest = s.split(maxsplit=1)
    split_on_quote = first.split("'", maxsplit=1)
    split_on_quote[0] = split_on_quote[0].title()
    first = "'".join(split_on_quote)

    return first + ' ' + rest

Some tests:

tests = ["hello world", "11hello world", "66645world hello", "123ça marche!", "234i'm good"]
for s in tests:
    print(my_capitalize(s))

# Hello world
# 11Hello world
# 66645World hello
# 123Ça marche!  # The non-ASCII ç was turned to uppercase
# 234I'm good    # Words containing a quote are treated properly
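
Note that my_capitalize unpacks the result of s.split(maxsplit=1) into two names, so a line containing a single word (or nothing at all) would raise a ValueError. A possible variant that tolerates such input is sketched below; the name my_capitalize_safe is hypothetical:

```python
def my_capitalize_safe(s):
    # Hypothetical variant of my_capitalize that also accepts one-word or empty input.
    parts = s.split(maxsplit=1)
    if not parts:
        # Empty or whitespace-only string: nothing to capitalize.
        return s
    # Title-case only the part of the first word that precedes an apostrophe.
    head = parts[0].split("'", maxsplit=1)
    head[0] = head[0].title()
    parts[0] = "'".join(head)
    return " ".join(parts)

print(my_capitalize_safe("11hello"))      # 11Hello
print(my_capitalize_safe("234i'm good"))  # 234I'm good
```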

Use re.sub with a function as repl:

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

import re

def capitalize(m):
    # Keep the leading digits, upper-case the first non-digit, keep the rest
    return m.group(1) + m.group(2).upper() + m.group(3)

lines = ["hello world", "11hello world", "66645world hello"]
for line in lines:
    print(re.sub(r'(\d*)(\D)(.*)', capitalize, line))

Output:

Hello world
11Hello world
66645World hello

Use isdigit() and title() on the strings:

s = ['123hello world', "hello world", "11hello world", "66645world hello"]
print([each if each[0].isdigit() else each.title() for each in s])

# ['123hello world', 'Hello World', '11hello world', '66645world hello']

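If you want to capitalize each word that starts with a letter while leaving letters that follow a digit in lowercase, you can try this code: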

def solve(s):
    str1 =""
    for i in s.split(' '):
        str1=str1+str(i.capitalize()+' ') #capitalizes the first character of the string
    return str1

>>> solve('hello 5g')
'Hello 5g '