Remove text:u from strings in Python

Question

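I am using the xlrd library to import values from an Excel file into a Python list. The Excel file has only one column, and I extract the data row by row. The problem is that the data I get in the list looks like this: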

list = ["text:u'__string__'","text:u'__string__'",.....so on]

How can I remove text:u from each item so that I get a plain list of strings?

The code below uses Python 2.7:

from xlrd import open_workbook

book = open_workbook("blabla.xlsx")
sheet = book.sheet_by_index(0)
documents = []

for row in range(1, 50): #start from 1, to leave out row 0
    documents.append(sheet.cell(row, 0)) #extract from first col

data = [str(r) for r in documents]
print data
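
A note on where the prefix comes from: the text:u'...' form is consistent with converting an xlrd cell object to a string rather than reading its contents, since str() on a cell gives a type tag followed by the repr of the value. Below is a minimal sketch, assuming each cell exposes its contents through a value attribute (as xlrd cells do). The cell_values helper and the stand-in objects are hypothetical and exist only so the snippet runs without an Excel file.

```python
from types import SimpleNamespace


def cell_values(cells):
    """Return the underlying value of each cell-like object."""
    return [cell.value for cell in cells]


# Hypothetical stand-ins for xlrd cells; real ones come from sheet.cell(row, col).
fake_cells = [SimpleNamespace(value="alpha"), SimpleNamespace(value="beta")]

documents = cell_values(fake_cells)
print(documents)
```

With a real workbook, the same idea would be documents.append(sheet.cell(row, 0).value) inside the loop above, which should make the later str() conversion and any prefix stripping unnecessary.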

Answer 1

Iterate over the items and strip the extra characters from each string:

s = []
for x in list:
    s.append(x[7:-1])  # drop the 7-character prefix text:u' and the closing quote
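
The fixed slice above silently mangles any item that does not have exactly this prefix (for example a numeric cell). A slightly more defensive sketch of the same slicing idea follows; the name strip_wrapper and the constants are illustrative, not part of xlrd.

```python
PREFIX = "text:u'"
SUFFIX = "'"


def strip_wrapper(item):
    """Remove the text:u'...' wrapper if present; otherwise return item unchanged."""
    long_enough = len(item) >= len(PREFIX) + len(SUFFIX)
    if long_enough and item.startswith(PREFIX) and item.endswith(SUFFIX):
        return item[len(PREFIX):-len(SUFFIX)]
    return item


items = ["text:u'__string__'", "number:1.0"]
cleaned = [strip_wrapper(x) for x in items]
print(cleaned)
```

Items that do not match the expected wrapper pass through untouched, so they can be spotted and handled separately.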

Answer 2

If this is your standard input list, a simple split will do:

[s.split("'")[1] for s in list]

# if the string itself contains "'", the regex still captures everything between the first u' and the last '
import re
[re.findall(r"u'(.*)'", s)[0] for s in list]

#Output
#['__string__', '__string__']
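
One caveat with both approaches: when a value contains an apostrophe, Python's repr normally switches to double quotes (u"it's here"), so neither the split nor the u'(.*)' pattern matches as intended. A different technique, sketched here as an alternative rather than as this answer's method, is to take everything after the first colon and let ast.literal_eval parse it as a string literal. The unwrap helper is a hypothetical name.

```python
import ast


def unwrap(item):
    """Evaluate the Python literal that follows the first colon in item."""
    _, _, literal = item.partition(":")
    return ast.literal_eval(literal)


items = ["text:u'__string__'", 'text:u"it\'s here"']
cleaned = [unwrap(s) for s in items]
print(cleaned)
```

Note that literal_eval returns whatever type the literal denotes, so an item such as number:1.0 would come back as a float rather than a string, and an item whose tail is not a valid literal raises an exception.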

Answer 3

I ran into the same problem. The following code helped me.

list = ["text:u'__string__'","text:u'__string__'",.....so on]
for index, item in enumerate(list):
    list[index] = list[index][7:]   # Deletes first 7 characters
    list[index] = list[index][:-1]  # Deletes last character

python

string

excel

parsing

document