How can I show several Python pandas describe() tables in one statement?
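I am trying to analyse a group of data sets, but I cannot find an efficient way to present the results. I thought groupby() might solve it, but I want to display all the tables at once and I don't know how to do that. Another option would be to compare the sets column by column: the first column, then the second, then the third. This is mainly what I want to achieve, e.g.:

                     Mean  Std  Max  Min
First_Result_Set
Second_Result_Set
Third_Result_Set

This is my other idea (probably not as good):

                                  Mean  Std  Max  Min
First_Result_Set_first_column
Second_Result_Set_first_column
Third_Result_Set_first_column

Any suggestions or solutions would be helpful.

Code: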
import pandas as pd

# Method of the class whose instance is called `file` below.
def analyse_data(self, np_array, raw=45, column=3):
    # Reshape the flat array into `raw` rows by `column` columns, then summarise it.
    df = pd.DataFrame(np_array.reshape(raw, column),
                      columns=("Time", "Random Score", "AI Score"))
    data_result = df.describe()
    print(data_result)
    return data_result

analyse_cache_ab_classes_depth_5 = file.analyse_data(cache_ab_classes_depth_5)
analyse_cache_ab_classes_depth_4 = file.analyse_data(cache_ab_classes_depth_4)
analyse_cache_ab_classes_depth_3 = file.analyse_data(cache_ab_classes_depth_3)

Output:
            Time  Random Score   AI Score
count  45.000000     45.000000  45.000000
mean    1.054444      2.355556  12.488889
std     0.423377      2.496867   7.225656
min     0.400000      0.000000   0.000000
25%     0.850000      0.000000   6.000000
50%     0.960000      2.000000  14.000000
75%     1.180000      4.000000  16.000000
max     2.620000      8.000000  28.000000

            Time  Random Score   AI Score
count  45.000000     45.000000  45.000000
mean    2.021333      5.644444  35.288889
std     0.889095      4.270169  12.764692
min     0.780000      0.000000  12.000000
25%     1.310000      2.000000  28.000000
50%     1.780000      4.000000  34.000000
75%     2.590000      8.000000  42.000000
max     4.220000     18.000000  76.000000

            Time  Random Score   AI Score
count  45.000000     45.000000  45.000000
mean    0.207333      1.822222  15.333333
std     0.077295      2.124413   6.993503
min     0.110000      0.000000   4.000000
25%     0.150000      0.000000  10.000000
50%     0.180000      2.000000  16.000000
75%     0.250000      2.000000  20.000000
max     0.380000     10.000000  30.000000

Consider collecting your DFs into a Panel:
In [149]: p = pd.Panel({'d1':d1, 'd2':d2, 'd3':d3})

In [150]: p.axes
Out[150]:
[Index(['d1', 'd2', 'd3'], dtype='object'),
 Index(['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max'], dtype='object'),
 Index(['Time', 'Random Score', 'AI Score'], dtype='object')]

In [151]: p.loc['d1']
Out[151]:
            Time  Random Score   AI Score
count  45.000000     45.000000  45.000000
mean    1.054444      2.355556  12.488889
std     0.423377      2.496867   7.225656
min     0.400000      0.000000   0.000000
25%     0.850000      0.000000   6.000000
50%     0.960000      2.000000  14.000000
75%     1.180000      4.000000  16.000000
max     2.620000      8.000000  28.000000

In [152]: p.loc[:, 'mean']
Out[152]:
                     d1         d2         d3
Time           1.054444   2.021333   0.207333
Random Score   2.355556   5.644444   1.822222
AI Score      12.488889  35.288889  15.333333

In [153]: p.loc[:, :, 'AI Score']
Out[153]:
              d1         d2         d3
count  45.000000  45.000000  45.000000
mean   12.488889  35.288889  15.333333
std     7.225656  12.764692   6.993503
min     0.000000  12.000000   4.000000
25%     6.000000  28.000000  10.000000
50%    14.000000  34.000000  16.000000
75%    16.000000  42.000000  20.000000
max    28.000000  76.000000  30.000000
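
Note that pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so the session above only runs on older versions. As a rough sketch of the same three lookups on a current pandas, the describe() tables can be stacked with pd.concat instead. The random data, the seed, and the level names "set" and "stat" are assumptions made for illustration, since the original arrays are not shown:

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the three result sets (the real arrays are not shown).
rng = np.random.default_rng(0)
cols = ["Time", "Random Score", "AI Score"]
d1, d2, d3 = (pd.DataFrame(rng.random((45, 3)), columns=cols).describe()
              for _ in range(3))

# Stack the three describe() tables; the dict keys become the outer index level.
stats = pd.concat({"d1": d1, "d2": d2, "d3": d3}, names=["set", "stat"])

one_set = stats.loc["d1"]                     # roughly p.loc['d1']
means = stats.xs("mean", level="stat").T      # roughly p.loc[:, 'mean']
ai_score = stats["AI Score"].unstack("set")   # roughly p.loc[:, :, 'AI Score']

print(means)
```

Here `means` has the original column names as rows and d1, d2, d3 as columns, which is the same orientation as Out[152].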

Or you can build a multi-index DF, similar to the following:
In [154]: p.to_frame()
Out[154]:
                           d1         d2         d3
major minor
count Time          45.000000  45.000000  45.000000
      Random Score  45.000000  45.000000  45.000000
      AI Score      45.000000  45.000000  45.000000
mean  Time           1.054444   2.021333   0.207333
      Random Score   2.355556   5.644444   1.822222
      AI Score      12.488889  35.288889  15.333333
std   Time           0.423377   0.889095   0.077295
      Random Score   2.496867   4.270169   2.124413
      AI Score       7.225656  12.764692   6.993503
min   Time           0.400000   0.780000   0.110000
...                       ...        ...        ...
25%   AI Score       6.000000  28.000000  10.000000
50%   Time           0.960000   1.780000   0.180000
      Random Score   2.000000   4.000000   2.000000
      AI Score      14.000000  34.000000  16.000000
75%   Time           1.180000   2.590000   0.250000
      Random Score   4.000000   8.000000   2.000000
      AI Score      16.000000  42.000000  20.000000
max   Time           2.620000   4.220000   0.380000
      Random Score   8.000000  18.000000  10.000000
      AI Score      28.000000  76.000000  30.000000

[24 rows x 3 columns]
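
A frame with the same (statistic, column) row index and one column per result set can probably also be built without a Panel. The sketch below is one possible way, assuming synthetic data and hypothetical names (`tables`, `long`, and the level names "stat" and "column"); it turns each describe() table into a Series and joins the Series side by side:

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the three describe() results (real data not shown).
rng = np.random.default_rng(0)
cols = ["Time", "Random Score", "AI Score"]
tables = {name: pd.DataFrame(rng.random((45, 3)), columns=cols).describe()
          for name in ("d1", "d2", "d3")}

# t.T.unstack() flattens one table into a Series indexed by (stat, column);
# concatenating along axis=1 puts one such Series per result set side by side.
long = pd.concat({name: t.T.unstack() for name, t in tables.items()}, axis=1)
long.index.names = ["stat", "column"]

print(long)
```

Each row is then one statistic of one original column, and each column is one result set, as in the to_frame() output.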

or
                    count       mean        std    min    25%    50%    75%    max
major minor
d1    Time           45.0   1.054444   0.423377   0.40   0.85   0.96   1.18   2.62
      Random Score   45.0   2.355556   2.496867   0.00   0.00   2.00   4.00   8.00
      AI Score       45.0  12.488889   7.225656   0.00   6.00  14.00  16.00  28.00
d2    Time           45.0   2.021333   0.889095   0.78   1.31   1.78   2.59   4.22
      Random Score   45.0   5.644444   4.270169   0.00   2.00   4.00   8.00  18.00
      AI Score       45.0  35.288889  12.764692  12.00  28.00  34.00  42.00  76.00
d3    Time           45.0   0.207333   0.077295   0.11   0.15   0.18   0.25   0.38
      Random Score   45.0   1.822222   2.124413   0.00   0.00   2.00   2.00  10.00
      AI Score       45.0  15.333333   6.993503   4.00  10.00  16.00  20.00  30.00
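
The answer does not show the statement that produced this last layout. One way to get a table of that shape in a single pd.concat call, sketched here under the assumption of synthetic data and hypothetical names (`frames`, `summary`, `compact`, `time_only`, and the level names "set" and "column"), is to transpose each describe() result before stacking. The same frame can then be trimmed to the Mean/Std/Max/Min columns asked for in the question:

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the three raw result sets (real data not shown).
rng = np.random.default_rng(0)
cols = ["Time", "Random Score", "AI Score"]
frames = {name: pd.DataFrame(rng.random((45, 3)), columns=cols)
          for name in ("d1", "d2", "d3")}

# Describe each set, transpose so the statistics become columns, then stack
# the sets on top of each other; the dict keys become the outer index level.
summary = pd.concat({name: df.describe().T for name, df in frames.items()},
                    names=["set", "column"])

# Keep only the statistics requested in the question.
compact = summary[["mean", "std", "max", "min"]]

# Compare a single column (here "Time") across all result sets.
time_only = compact.xs("Time", level="column")

print(compact)
print(time_only)
```

`compact` corresponds to the first layout in the question (one row per result set and column), and `time_only` to the second (one column compared across the sets).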