Read Word from a Unicode Line instead of Char

Question

I have the following code:

for line in contentText:
    print type(line),  # -> o/p is <type 'unicode'>
    word = line.strip().split()
    print word,
    print type(word),  # -> o/p is <type 'list'>

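When I execute line.strip().split(), each character is displayed separately.

For example, if my line is "Read Word from a Unicode Line instead of Char", then the o/p is:

R e a d

w o r d

a . . etc.

I want to read it as 'Read', 'word', and so on, so that further processing works on words rather than characters.

How can I do this?

Also, how do I remove the whitespace for further processing?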

Answer 1

Iterating over a string yields single-character strings:

>>> text = 'Read word'
>>> for x in text:
...     print x
... 
R
e
a
d

w
o
r
d

Split the string first to get a list of words, then iterate over that list:

>>> text.split()  # str.split() with no arguments splits on whitespace and removes it
['Read', 'word']

>>> for x in text.split():
...     print x
... 
Read
word
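
Applied to the loop in the question, this suggests that contentText is probably a single unicode string rather than a list of lines, so "for line in contentText" hands you one character at a time. The following is a minimal sketch under that assumption; the sample text and the helper name words_per_line are made up for illustration. It splits the text into lines first and then each line into words, which also takes care of the whitespace:

```python
# Assumption: contentText is ONE unicode string, not a list of lines.
# The sample value below is invented for illustration.
content_text = u"Read Word from a Unicode Line\ninstead of Char"


def words_per_line(text):
    """Return one list of words for each non-empty line of text."""
    result = []
    for line in text.splitlines():  # break the text into lines first
        words = line.split()        # no-argument split() discards all whitespace
        if words:                   # skip lines that contain only whitespace
            result.append(words)
    return result


for words in words_per_line(content_text):
    for word in words:
        print(word)                 # one whole word per line of output
```

Because split() without arguments already ignores leading and trailing whitespace, the extra strip() call is not needed. If contentText really is a list of unicode lines, the original loop should already produce words, and the problem lies elsewhere.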

Tags: python, string, unicode, python-2.7