itertools.chain.from_iterable works on a nested numeric list but not on a list of strings?
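My data frame has a column made up of lists, and I want to merge the lists from every row into a single list in one cell.
This is what the column looks like: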
df.terms.dropna()
0 [Algorithms, Brain, Brain Mapping, Computer Si...
4 [Adult, Algorithms, Cerebrovascular Circulatio...
5 [Algorithms, Brain, Brain Mapping, Hemodynamic...
7 [Adult, Algorithms, Brain, Cerebrovascular Cir...
10 [Animals, Base Composition, Birds, Genetic Var...
Name: mesh_terms, dtype: object
I managed to combine them and got:
0 [[Algorithms, Brain, Brain Mapping, Computer S...],[Adult, Algorithms, Cerebrovascular Circulatio...],[Algorithms, Brain, Brain Mapping, Hemodynamic...],[list_index_7],[list_index_10]]
Name: mesh_terms, dtype: object
But I want one long list containing all the strings, e.g. [Algorithms, Brain, Brain Mapping, Computer Si..., ... , Animals, Base Composition, Birds, Genetic Var...]
I tried itertools, but it still gave me a nested list, even though it works for this example:
list2d = [[1,2,3],[4,5,6], [7], [8,9]]
list(itertools.chain.from_iterable(list2d))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
I also tried flattened = [val for sublist in list_of_lists for val in sublist],
which did not work either.
Please help!
Here is the complete list of all the sublists:
['Algorithms', 'Brain', 'Brain Mapping', 'Computer Simulation', 'Hemodynamics', 'Humans', 'Linear Models', 'Magnetic Resonance Imaging', 'Models, Neurological'] ['Adult', 'Algorithms', 'Cerebrovascular Circulation', 'Computer Simulation', 'Female', 'Functional Laterality', 'Globus Pallidus', 'Humans', 'Image Processing, Computer-Assisted', 'Magnetic Resonance Imaging', 'Male', 'Models, Neurological', 'Nonlinear Dynamics', 'Reinforcement (Psychology)', 'Reward', 'Young Adult'] ['Algorithms', 'Brain', 'Brain Mapping', 'Hemodynamics', 'Humans', 'Image Interpretation, Computer-Assisted', 'Linear Models', 'Magnetic Resonance Imaging', 'Models, Neurological'] ['Adult', 'Algorithms', 'Brain', 'Cerebrovascular Circulation', 'Female', 'Hemodynamics', 'Humans', 'Image Interpretation, Computer-Assisted', 'Magnetic Resonance Imaging', 'Male', 'Statistics, Nonparametric', 'Young Adult'] ['Animals', 'Base Composition', 'Birds', 'Genetic Variation', 'Genome', 'Genomics', 'Mammals', 'Molecular Sequence Data', 'Phylogeny', 'Reptiles', 'Retroelements', 'Tandem Repeat Sequences']
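One level of flattening does not depend on the element type, so strings alone should not be the cause. The following is a minimal sketch of a likely explanation, assuming the combined result is a Series whose single cell holds the whole nested list (the output above suggests this but does not confirm it). The names `nested` and `combined` and the shortened term lists are illustrative stand-ins, not the real data.

```python
import itertools

import pandas as pd

# Shortened stand-ins for the MeSH term lists in the question.
nested = [["Algorithms", "Brain"], ["Adult", "Algorithms"], ["Animals"]]

# One level of flattening behaves the same for strings as for numbers.
flat = list(itertools.chain.from_iterable(nested))
print(flat)

# Assumed shape of the combined result: ONE cell holding the nested list.
combined = pd.Series([nested])

# Iterating over the Series yields that single cell, so only the outer
# wrapper is removed and the result is still a list of lists.
still_nested = list(itertools.chain.from_iterable(combined))
print(still_nested == nested)

# Flattening the contents of the cell gives the flat list instead.
flat_from_cell = list(itertools.chain.from_iterable(combined.iloc[0]))
print(flat_from_cell == flat)
```

If that assumption holds, the fix is to flatten the original column (or the contents of the cell) rather than the one-row Series.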
Convert the values to a list, then pass it to the DataFrame or Series constructor:
import pandas as pd
df_mesh = pd.DataFrame({'terms': [['Algorithms','Brain'],['Adult','Algorithms']]})
print (df_mesh)
terms
0 [Algorithms, Brain]
1 [Adult, Algorithms]
df = pd.DataFrame({'new': [df_mesh['terms'].tolist()]})
print (df)
new
0 [[Algorithms, Brain], [Adult, Algorithms]]
s = pd.Series([df_mesh['terms'].tolist()])
print (s)
0 [[Algorithms, Brain], [Adult, Algorithms]]
dtype: object
EDIT:
s1 = pd.Series([[val for sublist in df_mesh['terms'] for val in sublist]])
print (s1)
0 [Algorithms, Brain, Adult, Algorithms]
dtype: object
Or:
import itertools
s1 = pd.Series([list(itertools.chain.from_iterable(df_mesh['terms']))])
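The question calls `dropna()` on the column, which suggests some rows are missing. Here is a hedged sketch of the same flattening with those rows removed first, assuming the missing entries are NaN. The sample frame is hypothetical, and `Series.explode` (available from pandas 0.25) is shown only as an alternative, not as part of the original answer.

```python
import itertools

import pandas as pd

# Hypothetical column mixing lists and a missing value.
df_mesh = pd.DataFrame(
    {"terms": [["Algorithms", "Brain"], float("nan"), ["Adult", "Algorithms"]]}
)

# Drop missing rows first; trying to iterate over a NaN float raises TypeError.
lists_only = df_mesh["terms"].dropna()

# Flatten one level across the remaining rows.
flat = list(itertools.chain.from_iterable(lists_only))
print(flat)

# Alternative: explode puts each list element on its own row.
exploded = lists_only.explode().tolist()
print(exploded == flat)

# Wrap the flat list in an outer list so it occupies a single cell.
s1 = pd.Series([flat])
print(len(s1))
```

Wrapping `flat` in an outer list is what keeps it in one cell; passing `flat` directly to `pd.Series` would create one row per term.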