Subset a pandas DataFrame using a concatenation of column index slices
I have a large dataframe that I am trying to subset using only column indices. I am using the following code:
df = df.ix[:, [3,21:28,30:34,36:57,61:64,67:]]
The code is fairly self-explanatory: I am trying to subset df by keeping column 3, columns 21 to 28, and so on. However, I get the following error:
File "<ipython-input-44-3108b602b220>", line 1
df = df.ix[:, [3,21:28,30:34,36:57,61:64,67:]]
^
SyntaxError: invalid syntax
What am I missing?
One option is to build the column positions with np.r_ and pass them to .iloc:
df = df.iloc[:, np.r_[3,21:28,30:34,36:57,61:64,67:df.shape[1]]]
Demo:
In [39]: df = pd.DataFrame(np.random.randint(5, size=(2, 100)))
In [40]: df
Out[40]:
0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
0 3 1 0 3 2 4 1 2 1 3 ... 2 1 4 2 1 2 1 3 3 4
1 0 2 4 1 1 1 0 0 3 4 ... 4 4 0 3 2 3 0 2 0 1
[2 rows x 100 columns]
In [41]: df.iloc[:, np.r_[3,21:28,30:34,36:57,61:64,67:df.shape[1]]]
Out[41]:
3 21 22 23 24 25 26 27 30 31 ... 90 91 92 93 94 95 96 97 98 99
0 3 4 1 2 0 3 0 3 2 2 ... 2 1 4 2 1 2 1 3 3 4
1 1 1 0 2 1 4 4 4 1 3 ... 4 4 0 3 2 3 0 2 0 1
[2 rows x 69 columns]
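For readers who want to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The original attempt fails because slice notation such as 21:28 is only valid directly inside subscript brackets, not inside a list literal; np.r_ accepts that notation in its own brackets and returns one flat integer array. The frame below is a hypothetical stand-in for the real data (the name `cols` is introduced here for illustration), and the last slice is given an explicit stop via df.shape[1], as in the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: 2 rows, 100 integer-labelled columns.
df = pd.DataFrame(np.arange(200).reshape(2, 100))

# np.r_ expands each slice and concatenates everything into one 1-D integer array.
# The final slice gets an explicit stop (the number of columns).
cols = np.r_[3, 21:28, 30:34, 36:57, 61:64, 67:df.shape[1]]

# .iloc selects purely by position, so the array can be passed straight in.
subset = df.iloc[:, cols]

print(cols[:10])      # first few selected positions
print(subset.shape)   # rows, selected columns
```

Note that .ix, used in the question, was deprecated in pandas 0.20 and removed in pandas 1.0, so .iloc is the positional indexer to use on current versions.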