If the values of a pandas Series are lists, how do I get a sublist of each element?
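I have two pandas Series, series1 and series2, and I would like to build series3 from them.
Each value of series1 is a list, and each value of series2 is the corresponding start index into that list.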
>>> print(series1)
0 [481, 12, 11, 220, 24, 24, 645, 153, 15, 13, 6...
1 [64, 80, 79, 147, 14, 20, 56, 288, 12, 208, 26...
4 [5, 6, 152, 31, 295, 127, 711, 5, 271, 291, 11...
5 [363, 121, 727, 249, 483, 122, 241, 494, 555]
7 [112, 20, 41, 9, 104, 131, 26, 298, 65, 214, 1...
9 [129, 797, 19, 151, 448, 47, 19, 106, 299, 144...
11 [72, 35, 25, 200, 122, 5, 75, 30, 208, 24, 14,...
18 [137, 339, 71, 14, 19, 54, 61, 15, 73, 104, 43...
>>> print(series2)
0 0
1 3
4 1
5 6
7 4
9 5
11 7
18 2
What I expect:
>>> print(series3)
0 [481, 12, 11, 220, 24, 24, 645, 153, 15, 13, 6...
1 [147, 14, 20, 56, 288, 12, 208, 26...
4 [6, 152, 31, 295, 127, 711, 5, 271, 291, 11...
5 [241, 494, 555]
7 [104, 131, 26, 298, 65, 214, 1...
9 [47, 19, 106, 299, 144...
11 [30, 208, 24, 14,...
18 [71, 14, 19, 54, 61, 15, 73, 104, 43...
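My solution 1:
Since series1 and series2 have the same length, I could write a for loop that iterates over series1, computes something like series1.loc[i][series2.loc[i]:], and builds a new series (series3) to hold the results.
My solution 2:
Use df = pd.concat([series1, series2], axis=1) to build a DataFrame df, then create a new column with a row-wise apply, e.g. df['series3'] = df.apply(lambda x: subList(x), axis=1).
However, I don't think either of these is a good way to get what I want. I would appreciate a more concise solution!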
You can basically concatenate the series along a given axis (0 = rows, 1 = columns); ideally they have the same length:
series3=pd.concat([series2, series1], axis=1).reset_index()
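The concatenated frame still holds the full lists, so a further step is needed to slice them. A minimal sketch of one way to do that, assuming the column names `values` and `start` (hypothetical, chosen here via `keys=`) and shortened sample data that is not from the original post:

```python
import pandas as pd

# Shortened, illustrative data; the index labels are non-contiguous as in the question.
series1 = pd.Series([[481, 12, 11, 220], [64, 80, 79, 147, 14], [5, 6, 152]],
                    index=[0, 1, 4])
series2 = pd.Series([0, 3, 1], index=[0, 1, 4])

# axis=1 puts the two series side by side, aligned on their index labels.
# "values" and "start" are assumed column names for this sketch.
df = pd.concat([series1, series2], axis=1, keys=["values", "start"])

# Row-wise apply: slice each list from its start position onward.
series3 = df.apply(lambda row: row["values"][row["start"]:], axis=1)
print(series3)
```

Because `pd.concat` aligns on labels, this variant should also work when the two series list the same labels in a different order.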
If you want to avoid creating an intermediate pd.DataFrame and just want a new pd.Series, you can use the pd.Series constructor on a map object. So given:
In [6]: S1
Out[6]:
0 [481, 12, 11, 220, 24, 24, 645, 153, 15, 13, 6]
1 [64, 80, 79, 147, 14, 20, 56, 288, 12, 208, 26]
2 [5, 6, 152, 31, 295, 127, 711, 5, 271, 291, 11]
3 [363, 121, 727, 249, 483, 122, 241, 494, 555]
4 [112, 20, 41, 9, 104, 131, 26, 298, 65, 214, 1]
5 [129, 797, 19, 151, 448, 47, 19, 106, 299, 144]
6 [72, 35, 25, 200, 122, 5, 75, 30, 208, 24, 14]
7 [137, 339, 71, 14, 19, 54, 61, 15, 73, 104, 43]
dtype: object
In [7]: S2
Out[7]:
0 0
1 3
2 1
3 6
4 4
5 5
6 7
7 2
dtype: int64
you can do this:
In [8]: pd.Series(map(lambda x,y : x[y:], S1, S2), index=S1.index)
Out[8]:
0 [481, 12, 11, 220, 24, 24, 645, 153, 15, 13, 6]
1 [147, 14, 20, 56, 288, 12, 208, 26]
2 [6, 152, 31, 295, 127, 711, 5, 271, 291, 11]
3 [241, 494, 555]
4 [104, 131, 26, 298, 65, 214, 1]
5 [47, 19, 106, 299, 144]
6 [30, 208, 24, 14]
7 [71, 14, 19, 54, 61, 15, 73, 104, 43]
dtype: object
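The same idea can be written as a list comprehension over `zip`, which some readers may find easier to follow than `map` with a two-argument lambda. This is only a sketch with shortened sample data (an assumption, not the data above):

```python
import pandas as pd

# Shortened, illustrative data.
S1 = pd.Series([[481, 12, 11], [64, 80, 79, 147, 14], [5, 6, 152]])
S2 = pd.Series([0, 3, 1])

# zip pairs the two series by position; each list is sliced from its start index.
S3 = pd.Series([lst[start:] for lst, start in zip(S1, S2)], index=S1.index)
print(S3)
```

Passing `index=S1.index` keeps the original labels on the result, as in the `map` version.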
If you want to modify S1 without creating an intermediate container, you can use a for loop:
In [10]: for i, x in enumerate(map(lambda x,y : x[y:], S1, S2)):
    ...:     S1.iloc[i] = x
    ...:
In [11]: S1
Out[11]:
0 [481, 12, 11, 220, 24, 24, 645, 153, 15, 13, 6]
1 [147, 14, 20, 56, 288, 12, 208, 26]
2 [6, 152, 31, 295, 127, 711, 5, 271, 291, 11]
3 [241, 494, 555]
4 [104, 131, 26, 298, 65, 214, 1]
5 [47, 19, 106, 299, 144]
6 [30, 208, 24, 14]
7 [71, 14, 19, 54, 61, 15, 73, 104, 43]
dtype: object
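One caveat worth illustrating: `map` and `zip` pair the two series by position, not by index label. If the start-index series carries the same labels in a different order, it can be aligned first with `reindex`. A small sketch under that assumption (the data and the name `aligned` are made up for illustration):

```python
import pandas as pd

# Illustrative data: S2 has the same labels as S1 but in a different order.
S1 = pd.Series([[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=[0, 4, 18])
S2 = pd.Series([2, 0, 1], index=[18, 0, 4])

# Reorder S2 to follow S1's labels before pairing by position.
aligned = S2.reindex(S1.index)

S3 = pd.Series([lst[k:] for lst, k in zip(S1, aligned)], index=S1.index)
print(S3)
```

When both series already share the same index in the same order, as in the question, this extra step is unnecessary.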