How to use Python startswith method for combination of unicode and ascii string?

Question

In my case I have a Unicode text_string and a prefix that is an ASCII string. When I call

text_string.startswith(prefix)

I get this exception:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9: ordinal not in range(128)

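How can I compare the two strings? I tried converting the ASCII string to Unicode with unicode(string), but I still get the same exception.

How can I resolve this? In the worst case, how can I suppress this exception during the comparison?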

text_string = u'PreciChrom I/II is a lyophilized control based on human citrated plasma.'
prefix = 'Reagents – working solutions'

Answer 1

text_string.encode().startswith(prefix)
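
A minimal sketch of this byte-side comparison, assuming the prefix bytes are UTF-8. Passing the encoding explicitly avoids relying on the default codec (ASCII in Python 2), which would fail as soon as text_string itself contained a non-ASCII character. The helper name starts_with_bytes is hypothetical, not something from the question.

```python
# -*- coding: utf-8 -*-

def starts_with_bytes(text, prefix_bytes, encoding='utf-8'):
    """Encode the Unicode text, then compare bytes with bytes."""
    return text.encode(encoding).startswith(prefix_bytes)


text_string = u'PreciChrom I/II is a lyophilized control based on human citrated plasma.'
# The prefix as UTF-8 bytes; the en dash is the three bytes \xe2\x80\x93,
# and \xe2 sits at index 9, matching the error message in the question.
prefix = b'Reagents \xe2\x80\x93 working solutions'

print(starts_with_bytes(text_string, prefix))
```

The comparison never mixes the two string types, so no implicit ASCII decoding is triggered.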

Answer 2

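Your prefix string is not ASCII. As the error message says, there is a non-ASCII character at position 9: the en dash, –. The string is probably UTF-8.

You can decode the prefix to unicode: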

text_string.startswith(prefix.decode('utf-8'))
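
A slightly more defensive sketch of the same idea, assuming the prefix may arrive either as a UTF-8 byte string or as text that is already Unicode. The helper name starts_with_text and the default encoding are assumptions for illustration.

```python
# -*- coding: utf-8 -*-

def starts_with_text(text, prefix, encoding='utf-8'):
    """Decode a byte-string prefix to Unicode, then compare text with text."""
    # In Python 2, bytes is an alias of str, so this check works on 2 and 3.
    if isinstance(prefix, bytes):
        prefix = prefix.decode(encoding)
    return text.startswith(prefix)


text_string = u'PreciChrom I/II is a lyophilized control based on human citrated plasma.'
prefix = b'Reagents \xe2\x80\x93 working solutions'  # UTF-8 encoded byte string

print(starts_with_text(text_string, prefix))
print(starts_with_text(u'Reagents \u2013 working solutions: buffer', prefix))
```

If the bytes are not actually UTF-8, decode raises UnicodeDecodeError, so the encoding has to match whatever produced the prefix.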
