Python: convert a list of multi-word strings into a list of single words
I have a list of words, for example:
words = ['one','two','three four','five','six seven']
I am trying to create a new list in which every item is a single word, so that I end up with:
words = ['one','two','three','four','five','six','seven']
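Is the best approach to join the whole list into one string and then tokenize that string? Something like this:
import nltk
word_string = ' '.join(words)
tokenize_list = nltk.word_tokenize(word_string)
Or is there a better option?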
words = ['one','two','three four','five','six seven']
With a loop:
words_result = []
for item in words:
    for word in item.split():
        words_result.append(word)
Or as a list comprehension:
words = [word for item in words for word in item.split()]
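If you would rather not write the nested comprehension by hand, the same flattening can probably be done with `itertools.chain.from_iterable`. This is only a sketch; the function name `flatten_words` is made up for illustration and is not part of any library.

```python
from itertools import chain


def flatten_words(items):
    """Split every string on whitespace and return one flat list of words.

    `flatten_words` is a hypothetical helper name used only for this example.
    """
    # item.split() yields a list of words per item; chain.from_iterable
    # concatenates those lists lazily, and list() materializes the result.
    return list(chain.from_iterable(item.split() for item in items))


words = ['one', 'two', 'three four', 'five', 'six seven']
flat = flatten_words(words)
print(flat)
```

Unlike the comprehension above, this keeps the original `words` list untouched and returns a new one.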
You can join with a space separator and then split again:
In [22]:
words = ['one','two','three four','five','six seven']
' '.join(words).split()
Out[22]:
['one', 'two', 'three', 'four', 'five', 'six', 'seven']
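As far as I can tell, join-then-split and the nested comprehension give the same result even when the input is messy, because `str.split()` with no arguments splits on any run of whitespace and drops empty pieces. A quick check, using a made-up `messy` list that is not from the question:

```python
# Hypothetical input with an empty string, padding, and a tab.
messy = ['one', '', '  two   three ', 'four\tfive']

# Approach 1: join everything with spaces, then split on whitespace.
via_join = ' '.join(messy).split()

# Approach 2: split each item and flatten with a comprehension.
via_comprehension = [word for item in messy for word in item.split()]

print(via_join)
print(via_join == via_comprehension)
```

The join version builds one large intermediate string, which should only matter for very large lists.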
Here is a solution that makes light use of regular expressions:
import re
words = ['one','two','three four','five','six seven']
result = re.findall(r'[a-zA-Z]+', str(words))
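One caveat with the regex above: it runs on the `str()` representation of the list and only matches ASCII letters, so words containing apostrophes or digits get cut apart. A possible variant, assuming "word" simply means a run of non-whitespace characters, is to match `\S+` on the joined string instead. The sample list below is made up to show the difference.

```python
import re

# Hypothetical input containing an apostrophe and a digit.
words = ["don't stop", 'x2']

# Original pattern: letters only, applied to the list's string representation.
letters_only = re.findall(r'[a-zA-Z]+', str(words))

# Variant: any run of non-whitespace characters, applied to the joined text.
non_space = re.findall(r'\S+', ' '.join(words))

print(letters_only)
print(non_space)
```

For the list in the question both patterns give the same seven words, so this only matters if your real data is less clean.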