How to convert a string with unicode in it to unicode using Python

Question

I am importing a bunch of data from Excel in Python using xlrd.

All the data I get back are strings like this: text:u'L\xc9GENDE'

I manipulate this data and try to write it back to Excel (using xlsxwriter), and when I do, I get the same block of text, text:u'L\xc9GENDE', instead of LÉGENDE.

What works for me:

#!/usr/bin/env python
# -*- coding: latin-1 -*-
import xlsxwriter
import sys

workbook = xlsxwriter.Workbook('hello.xlsx')
worksheet = workbook.add_worksheet()
data = u'L\xc9GENDE'
worksheet.write('A1',data)
workbook.close()

This works: I get LÉGENDE in cell A1.

But if I try to work with a string that was already handed to me as u'L\xc9GENDE', cell A1 only shows L\xc9GENDE.

---- EDIT ---- The code I use to retrieve the data from Excel:

from xlrd import open_workbook

def grabexcelfile():
    wb = open_workbook('leg.xls',encoding_override='latin-1')    
    log = []
    txt = ''
    for s in wb.sheets():         
        for row in range(s.nrows):              
            values = []
            for col in range(s.ncols):
                 txt = str(s.cell(row,col))
                 txt.replace('-',' ',10) 
                 log.append(txt) 
    return log            

x = grabexcelfile()
print type(x[0]),x[0]

The print gives me: text:u'L\xc9GENDE'

Answer 1

Try this.

import unicodedata
data = u'L\xc9GENDE'
unicodedata.normalize('NFKD',data).encode('ascii','ignore')

You can refer here for more -> Convert a Unicode string to a string in Python (containing extra symbols)
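For reference, here is a small sketch of what that snippet does, written for Python 3 (an assumption; the question itself uses Python 2). NFKD splits É into a plain E plus a combining accent, and encoding to ASCII with 'ignore' then drops the accent, so the result is an unaccented byte string rather than LÉGENDE. The helper name strip_accents is made up for illustration.

```python
import unicodedata


def strip_accents(text):
    """Return an ASCII-only byte string with accents dropped (hypothetical helper)."""
    # NFKD decomposes accented characters into base letter + combining mark.
    decomposed = unicodedata.normalize('NFKD', text)
    # Combining marks are not ASCII, so 'ignore' silently discards them.
    return decomposed.encode('ascii', 'ignore')


data = u'L\xc9GENDE'
result = strip_accents(data)
print(result)
```

This may be fine if losing the accent is acceptable, but it does not restore the original É.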

Answer 2

Instead of trying to manipulate text:u'L\xc9GENDE', I changed the type of the value Excel was giving me:

from xlrd import open_workbook

def grabexcelfile():
    wb = open_workbook('leg.xls',encoding_override='latin-1')    
    log = []
    txt = ''
    for s in wb.sheets():         
        for row in range(s.nrows):              
            values = []
            for col in range(s.ncols):
                 # next line is changed: take the cell's value instead of str(cell)
                 txt = s.cell(row,col).value
                 # str.replace returns a new string, so keep the result (assumes a text cell)
                 txt = txt.replace('-',' ',10) 
                 log.append(txt) 
    return log            

x = grabexcelfile()
print type(x[0]),x[0]
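If the strings have already been saved in the text:u'...' form and the workbook cannot be read again, one possible workaround is to parse the literal back into a real string. This is only a sketch, written for Python 3 (an assumption; the code above is Python 2). It uses ast.literal_eval from the standard library; parse_cell_repr is a hypothetical helper, and it assumes everything after the first colon is a valid Python string literal.

```python
import ast


def parse_cell_repr(text):
    """Recover the value from a string such as "text:u'L\\xc9GENDE'" (hypothetical helper)."""
    # Split off the type prefix ("text") from the quoted literal.
    prefix, sep, literal = text.partition(':')
    if not sep:
        raise ValueError('expected "<type>:<literal>", got %r' % text)
    # literal_eval safely evaluates the literal, turning the \xc9 escape into É.
    return ast.literal_eval(literal)


# The stored string holds a literal backslash followed by "xc9", not the character É.
raw = "text:u'L\\xc9GENDE'"
recovered = parse_cell_repr(raw)
print(recovered)
```

Reading .value directly, as in the answer above, avoids the round trip through repr altogether and is the simpler choice when the source file is available.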

