Is the isupper method Unicode-friendly in Python 2?

Question

I use this line in my code to count the uppercase letters in a string:

text = "Áno"
count = sum(1 for c in text if c.isupper())

This code returns 0, but I expect 1 (because "Á" is uppercase). How can I count uppercase letters when the string contains Unicode characters?

Answer 1

In Python 2 you need to add a u prefix; your string is not actually a unicode object:

text = u"Áno"

You can also write the expression as count = sum(c.isupper() for c in text); c.isupper() returns True or False, which sum() treats as 1 or 0.

In [1]: text = "Áno"

In [2]: count = sum(c.isupper() for c in text)

In [3]: count
Out[3]: 0

In [4]: text = u"Áno"

In [5]: count = sum(c.isupper() for c in text)

In [6]: count
Out[6]: 1

In [7]: text = "Áno".decode("utf-8")

In [8]: count = sum(c.isupper() for c in text)

In [9]: count
Out[9]: 1

Answer 2

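In Python 2, the str.isupper() method only works for ASCII characters. You almost certainly have a Python 2 bytestring; exactly which bytes it contains depends on the encoding, but the bytes that represent "Á" will not be valid ASCII bytes.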
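
As a rough illustration of that point, the sketch below encodes the same text with two encodings and inspects the resulting bytes. The choice of UTF-8 and Latin-1 is only an assumption for the example; your terminal or source file may use something else. The text is written with a \u escape so the snippet does not depend on the source file encoding.

```python
text = u"\u00c1no"  # the unicode string "Áno"

# The same character becomes different bytes under different encodings.
utf8_bytes = text.encode("utf-8")      # "Á" is stored as two bytes: 0xC3 0x81
latin1_bytes = text.encode("latin-1")  # "Á" is stored as one byte: 0xC1

# bytearray yields integers on both Python 2 and Python 3.
non_ascii_utf8 = [b for b in bytearray(utf8_bytes) if b > 0x7F]
non_ascii_latin1 = [b for b in bytearray(latin1_bytes) if b > 0x7F]

print(len(utf8_bytes), non_ascii_utf8)
print(len(latin1_bytes), non_ascii_latin1)
```

In both cases the bytes for "Á" fall outside the ASCII range (above 0x7F), which is why a byte-oriented isupper() does not recognise them as an uppercase letter.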

Decode the string to a Unicode value, or use a Unicode literal (u'Áno'), so that unicode.isupper() can determine uppercase characters according to the Unicode standard:

>>> u'Áno'[0].isupper()
True
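
If the input may arrive either as bytes or as unicode, one possible approach is to decode first and count afterwards. This is only a sketch: the helper name count_upper and the UTF-8 default are assumptions made for illustration, not part of any library, and the correct encoding depends on where your bytes come from.

```python
def count_upper(text, encoding="utf-8"):
    """Count uppercase characters, decoding byte strings first.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    # In Python 2, `bytes` is an alias of `str`, so plain string
    # literals take this branch; in Python 3 only bytes objects do.
    if isinstance(text, bytes):
        text = text.decode(encoding)
    return sum(1 for c in text if c.isupper())


print(count_upper(u"\u00c1no"))    # unicode text "Áno"
print(count_upper(b"\xc3\x81no"))  # the UTF-8 bytes of "Áno"
```

Both calls count one uppercase letter, because the check always runs on decoded text rather than on raw bytes.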

You may want to read up on Python and Unicode:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
Python Unicode HOWTO
Pragmatic Unicode by Ned Batchelder

Answer 3

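For English words, the string module provides a constant that contains all the uppercase letters. If you put all of your uppercase letters into a variable, the following code also works. Note that string.ascii_uppercase covers only A to Z, so it would not count "Á":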

import string

# string.ascii_uppercase is 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
uppercase = string.ascii_uppercase
s = 'ThiS is A tEst'
count = 0
for ch in s:
    if ch in uppercase:
        count += 1

print(count)  # 4
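
To relate this back to the original question, here is a small comparison of the two approaches on text that contains a non-ASCII capital. The function names and the sample string are made up for the example.

```python
import string


def count_ascii_upper(s):
    # Membership test against A-Z only.
    return sum(1 for ch in s if ch in string.ascii_uppercase)


def count_unicode_upper(s):
    # Relies on the Unicode-aware isupper() of unicode strings.
    return sum(1 for ch in s if ch.isupper())


sample = u"\u00c1no ThiS"  # "Áno ThiS"
print(count_ascii_upper(sample))    # counts T and S
print(count_unicode_upper(sample))  # counts Á, T and S
```

The ASCII table is enough for plain English text, but it silently skips accented capitals, so the isupper() approach on a unicode string is the better fit for the question's example.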

Tags: python, unicode, python-2.x