Pandas output for "describe" with "groupby" is incomplete due to truncation
I am using Python 3 and Pandas for a data science project, and I am having some trouble with the Pandas syntax.
The code below does something close to what I want:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('breast-cancer-wisconsin.data.txt')
print(df.groupby('class').describe())
I got the breast cancer data from this link. The specific file containing the data I am using is breast-cancer-wisconsin.data.
It returns:
bland_chrom \
count mean std min 25% 50% 75% max
class
2 458.0 2.100437 1.080339 1.0 1.0 2.0 3.0 7.0
4 241.0 5.979253 2.273852 1.0 4.0 7.0 7.0 10.0
clump_thickness ... unif_cel_shape unif_cel_size \
count mean ... 75% max count
class ...
2 458.0 2.956332 ... 1.0 8.0 458.0
4 241.0 7.195021 ... 9.0 10.0 241.0
mean std min 25% 50% 75% max
class
2 1.325328 0.907694 1.0 1.0 1.0 1.0 9.0
4 6.572614 2.719512 1.0 4.0 6.0 10.0 10.0
[2 rows x 72 columns]
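However, this is not the complete output. The three consecutive dots (...) indicate that some columns were hidden by truncation.
How can I get the complete result?
Thanks.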
I am not sure this is the best technical way to solve the problem, but to avoid the truncation I did this:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('breast-cancer-wisconsin.data.txt')
# Raise the column display limit so that no columns are elided with "..."
pd.options.display.max_columns = 999
print(df.groupby('class').describe())
Which returns the correct output:
bland_chrom \
count mean std min 25% 50% 75% max
class
2 458.0 2.100437 1.080339 1.0 1.0 2.0 3.0 7.0
4 241.0 5.979253 2.273852 1.0 4.0 7.0 7.0 10.0
clump_thickness id \
count mean std min 25% 50% 75% max count
class
2 458.0 2.956332 1.674318 1.0 1.0 3.0 4.0 8.0 458.0
4 241.0 7.195021 2.428849 1.0 5.0 8.0 10.0 10.0 241.0
\
mean std min 25% 50% 75%
class
2 1.107591e+06 723431.757966 61634.0 1002614.25 1180170.5 1256870.5
4 1.003505e+06 322232.308608 63375.0 832226.00 1126417.0 1221863.0
marg_adhesion \
max count mean std min 25% 50% 75% max
class
2 13454352.0 458.0 1.364629 0.996830 1.0 1.0 1.0 1.0 10.0
4 1371026.0 241.0 5.547718 3.210465 1.0 3.0 5.0 8.0 10.0
mitoses norm_nucleoli \
count mean std min 25% 50% 75% max count
class
2 458.0 1.063319 0.501995 1.0 1.0 1.0 1.0 8.0 458.0
4 241.0 2.589212 2.557939 1.0 1.0 1.0 3.0 10.0 241.0
single_epith_cell_size \
mean std min 25% 50% 75% max count
class
2 1.290393 1.058856 1.0 1.0 1.0 1.0 9.0 458.0
4 5.863071 3.350672 1.0 3.0 6.0 10.0 10.0 241.0
unif_cel_shape \
mean std min 25% 50% 75% max count mean
class
2 2.120087 0.917130 1.0 2.0 2.0 2.0 10.0 458.0 1.443231
4 5.298755 2.451606 1.0 3.0 5.0 6.0 10.0 241.0 6.560166
unif_cel_size \
std min 25% 50% 75% max count mean std
class
2 0.997836 1.0 1.0 1.0 1.0 8.0 458.0 1.325328 0.907694
4 2.562045 1.0 4.0 6.0 9.0 10.0 241.0 6.572614 2.719512
min 25% 50% 75% max
class
2 1.0 1.0 1.0 1.0 9.0
4 1.0 4.0 6.0 10.0 10.0
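If changing the global option for the whole session is not desirable, the same effect can probably be had in a more local way. The sketch below is only an illustration: it uses made-up random data (the `feature_*` column names are hypothetical stand-ins, not the real breast cancer file) and shows three alternatives: a temporary `pd.option_context`, `DataFrame.to_string()`, and transposing the summary so the statistics run down the rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data (NOT the real dataset): 12 integer feature
# columns plus a 'class' column with the two labels 2 and 4.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.integers(1, 11, size=(20, 12)),
    columns=[f"feature_{i}" for i in range(12)],
)
df["class"] = [2, 4] * 10

# describe() yields 8 statistics per feature, so 12 * 8 = 96 columns.
summary = df.groupby("class").describe()

# Option 1: lift the column limit only inside this block;
# the previous setting is restored when the block exits.
with pd.option_context("display.max_columns", None):
    print(summary)

# Option 2: to_string() renders every column regardless of display options.
full_text = summary.to_string()
print(full_text)

# Option 3: transpose, so the (feature, statistic) pairs become rows.
# Long outputs may then need "display.max_rows" raised instead.
print(summary.T)
```

Setting `display.max_columns` to `None` means "no limit", which avoids picking an arbitrary number such as 999.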