Turning on 'pretty viewing' in a multi-indexed pandas DataFrame
I have a multi-indexed pandas DataFrame that looks like this:
RFI
(Smad3_pS423/425_customer, 0, 1) 0.664263
(Smad3_pS423/425_customer, 0, 2) 0.209911
(Smad3_pS423/425_customer, 0, 3) 0.099809
(Smad3_pS423/425_customer, 5, 1) 0.059652
In the documentation I have seen very pretty 'sparsified' multi-indexed DataFrames. In this case it would look like this:
                                   RFI
Smad3_pS423/425_customer 0 1  0.664263
                           2  0.209911
                           3  0.099809
                         5 1  0.059652
Does anybody know how to turn this option on? I tried pandas.set_option('display.multi_sparse', True), but this did not work.
=============EDIT==================================
To create the multi-index I used:
df.index=df[['Antibody','Time','Repeats']]
df.drop(['Antibody','Time','Repeats'],axis=1,inplace=True)
When I use df.index, I get the following output:
Index([ (u'Smad3_pS423/425_customer', u'0', u'1'),
(u'Smad3_pS423/425_customer', u'0', u'2'),
(u'Smad3_pS423/425_customer', u'0', u'3'),
(u'Smad3_pS423/425_customer', u'5', u'1'),
(u'Smad3_pS423/425_customer', u'5', u'2'),
(u'Smad3_pS423/425_customer', u'5', u'3'),
(u'Smad3_pS423/425_customer', u'10', u'1'),
(u'Smad3_pS423/425_customer', u'10', u'2'),
(u'Smad3_pS423/425_customer', u'10', u'3'),
(u'Smad3_pS423/425_customer', u'20', u'1'),
...
(u'a-Tubulin', u'120', u'3'),
(u'a-Tubulin', u'180', u'1'),
(u'a-Tubulin', u'180', u'2'),
(u'a-Tubulin', u'180', u'3'),
(u'a-Tubulin', u'240', u'1'),
(u'a-Tubulin', u'240', u'2'),
(u'a-Tubulin', u'240', u'3'),
(u'a-Tubulin', u'300', u'1'),
(u'a-Tubulin', u'300', u'2'),
(u'a-Tubulin', u'300', u'3')],
dtype='object', length=216)
You can use MultiIndex.from_tuples, because the index contains tuples rather than being a real MultiIndex:
import pandas as pd

# Sample frame whose index is a flat list of tuples, as in the question
df = pd.DataFrame({'RFI': [0.664263, 0.209911, 0.099809, 0.059652]},
                  index=[('Smad3_pS423/425_customer', 0, 1),
                         ('Smad3_pS423/425_customer', 0, 2),
                         ('Smad3_pS423/425_customer', 0, 3),
                         ('Smad3_pS423/425_customer', 5, 1)])
print(df)
RFI
(Smad3_pS423/425_customer, 0, 1) 0.664263
(Smad3_pS423/425_customer, 0, 2) 0.209911
(Smad3_pS423/425_customer, 0, 3) 0.099809
(Smad3_pS423/425_customer, 5, 1) 0.059652
df.index = pd.MultiIndex.from_tuples(df.index)
print(df)
                                   RFI
Smad3_pS423/425_customer 0 1  0.664263
                           2  0.209911
                           3  0.099809
                         5 1  0.059652
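To confirm that the conversion is what changes the display, it can help to check the index type before and after. The following is only a sketch on the same sample values. The variable names, the explicit `tupleize_cols=False` (used here to force a flat index of tuples) and the level names passed via `names` are illustrative assumptions, not part of the original post.

```python
import pandas as pd

tuples = [('Smad3_pS423/425_customer', 0, 1),
          ('Smad3_pS423/425_customer', 0, 2),
          ('Smad3_pS423/425_customer', 0, 3),
          ('Smad3_pS423/425_customer', 5, 1)]

# Build a flat, object-dtype index of tuples (assumption: this mimics the
# index shown in the question; tupleize_cols=False prevents auto-conversion).
flat_index = pd.Index(tuples, tupleize_cols=False)
df = pd.DataFrame({'RFI': [0.664263, 0.209911, 0.099809, 0.059652]},
                  index=flat_index)
was_multi_before = isinstance(df.index, pd.MultiIndex)

# Convert the tuples into a real MultiIndex; the level names are illustrative.
df.index = pd.MultiIndex.from_tuples(df.index,
                                     names=['Antibody', 'Time', 'Repeats'])
is_multi_after = isinstance(df.index, pd.MultiIndex)

print(was_multi_before, is_multi_after)
print(df)
```

The `display.multi_sparse` option is already `True` by default, and it only affects how a real `MultiIndex` is printed, which is why setting it made no difference for an index of plain tuples.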
EDIT:
It seems you need set_index:
df = df.set_index(['Antibody','Time','Repeats'])
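As a minimal sketch of the `set_index` route, the example below builds a small made-up frame with the column names from the question (the values are illustrative, not the asker's full data). `set_index` moves the listed columns into a `MultiIndex` and removes them from the frame in one step, so the separate `drop` call is not needed.

```python
import pandas as pd

# Hypothetical miniature version of the asker's frame; Time and Repeats are
# strings here because the index shown in the question holds string values.
df = pd.DataFrame({
    'Antibody': ['Smad3_pS423/425_customer'] * 4,
    'Time': ['0', '0', '0', '5'],
    'Repeats': ['1', '2', '3', '1'],
    'RFI': [0.664263, 0.209911, 0.099809, 0.059652],
})

# The columns become index levels and are dropped from the frame by default.
df = df.set_index(['Antibody', 'Time', 'Repeats'])

print(type(df.index).__name__)  # MultiIndex
print(list(df.columns))         # ['RFI']
print(df)
```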