How do I split strings within nested lists in Python?
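I know how to split a list of strings into a nested list of those strings, but I'm not sure how to then split each of those strings into multiple strings.
For example: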
def inputSplit(file_name):
    with open(file_name) as f:
        content = f.read().splitlines()
    i = 0
    contentLists = [content[i:i+1] for i in range(0, len(content), 1)]
gives me something like this:
[['these are some words'], ['these are some more words'], ['these are even more words'], ['these are the last words']]
I'm not sure how to use string splitting to make my output look like this:
[['these', 'are', 'some', 'words'], ['these', 'are', 'some', 'more', 'words'], ['these', 'are', 'even', 'more', 'words'], ['these', 'are', 'the', 'last', 'words']]
Is there a way I can do this?
x = [['these are some words'], ['these are some more words'], ['these are even more words'], ['these are the last words']]
print([i[0].split() for i in x])
Output: [['these', 'are', 'some', 'words'], ['these', 'are', 'some', 'more', 'words'], ['these', 'are', 'even', 'more', 'words'], ['these', 'are', 'the', 'last', 'words']]
A simple list comprehension can do it for you.
If, say,
x = [['these are some words'], ['these are some more words'], ['these are even more words'], ['these are the last words']]
then
y = [sublist[0].split() for sublist in x]
will give you
[['these', 'are', 'some', 'words'], ['these', 'are', 'some', 'more', 'words'], ['these', 'are', 'even', 'more', 'words'], ['these', 'are', 'the', 'last', 'words']]
as desired.
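The comprehension above takes only the first item of each sublist. As a minimal sketch, assuming a sublist might hold more than one string (a case the question does not show), a nested comprehension splits every string in each sublist. The helper name split_nested is hypothetical.

```python
def split_nested(nested):
    """Return one flat list of words per sublist, splitting on whitespace."""
    return [
        [word for text in sublist for word in text.split()]
        for sublist in nested
    ]


x = [['these are some words'], ['these are some more words']]
y = split_nested(x)
print(y)
# [['these', 'are', 'some', 'words'], ['these', 'are', 'some', 'more', 'words']]
```

For the one-string-per-sublist data in the question, this gives the same result as sublist[0].split().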
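However, if your original expression
contentLists = [content[i:i+1] for i in range(0, len(content), 1)]
produces the list I'm calling x here, it is rather pointless: why build a list of length-1 sublists in the first place?
It looks like what you want is, directly:
y = [item.split() for item in content]
rather than producing contentLists (aka x) and then building y from it, no?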
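A quick way to check this claim is to build the result both ways and compare. This is a small sketch with made-up sample data standing in for content; the names via_sublists and direct are only for illustration.

```python
# Sample lines standing in for the `content` list read from the file.
content = ['these are some words', 'these are some more words']

# The question's intermediate step: wrap every line in its own one-item list.
content_lists = [content[i:i+1] for i in range(0, len(content), 1)]

# Route 1: go through the one-item sublists.
via_sublists = [sublist[0].split() for sublist in content_lists]

# Route 2: split each line directly.
direct = [item.split() for item in content]

print(via_sublists == direct)  # True
```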
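You can achieve what you want efficiently like this:
with open(file_path) as input_file:
    content_lists = [line.split() for line in input_file]
In fact, f.read() first loads the whole file into memory, and .splitlines() then creates a copy split into lines. Neither data structure is needed, because you can simply read the file line by line and split each line in turn, as above. This is both more efficient and simpler.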
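To try the line-by-line approach end to end, here is a self-contained sketch. The function name split_lines and the temporary sample file are assumptions added so the example runs on its own; they are not part of the original answer.

```python
import os
import tempfile


def split_lines(file_path):
    """Return one list of words per line, iterating over the file lazily."""
    with open(file_path) as input_file:
        # split() with no arguments splits on any whitespace,
        # so the trailing newline of each line is discarded as well.
        return [line.split() for line in input_file]


# Write a small sample file so the sketch does not depend on external data.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("these are some words\nthese are the last words\n")
    sample_path = tmp.name

result = split_lines(sample_path)
os.remove(sample_path)
print(result)
# [['these', 'are', 'some', 'words'], ['these', 'are', 'the', 'last', 'words']]
```

Note that a blank line in the file yields an empty list; adding `if line.strip()` to the comprehension would skip such lines if that is not wanted.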