Pandas output of "describe" with "groupby" is incomplete due to truncation

Question

I am using Python 3 and pandas for a data science project, but I am having some trouble with the pandas syntax.

The code below does something close to what I want:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('breast-cancer-wisconsin.data.txt')

print(df.groupby('class').describe())

I got the breast cancer data from link. The specific file containing the data I am using is breast-cancer-wisconsin.data.

It returns:

      bland_chrom                                                \
            count      mean       std  min  25%  50%  75%   max   
class                                                             
2           458.0  2.100437  1.080339  1.0  1.0  2.0  3.0   7.0   
4           241.0  5.979253  2.273852  1.0  4.0  7.0  7.0  10.0   

      clump_thickness            ...  unif_cel_shape       unif_cel_size  \
                count      mean  ...             75%   max         count   
class                            ...                                       
2               458.0  2.956332  ...             1.0   8.0         458.0   
4               241.0  7.195021  ...             9.0  10.0         241.0   


           mean       std  min  25%  50%   75%   max  
class                                                 
2      1.325328  0.907694  1.0  1.0  1.0   1.0   9.0  
4      6.572614  2.719512  1.0  4.0  6.0  10.0  10.0  

[2 rows x 72 columns]

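However, this is not the complete output. The three consecutive dots (...) show that some columns were hidden due to truncation.

How can I get the complete result?

Thanks.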
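A minimal sketch for checking that the dots are only a display effect and that no statistics are actually missing. It assumes a small synthetic DataFrame in place of the real file; the names demo and feature_0 through feature_8 are hypothetical and not part of the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 9 numeric columns plus a 'class' label.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.integers(1, 11, size=(20, 9)),
    columns=[f"feature_{i}" for i in range(9)],
)
demo["class"] = [2, 4] * 10

summary = demo.groupby("class").describe()

# All 9 x 8 = 72 statistics exist in the object itself.
print(summary.shape)  # (2, 72)

# The display setting that limits how many columns are printed.
print(pd.get_option("display.max_columns"))

# Any single block of statistics can be read directly,
# regardless of what the full printout shows.
print(summary["feature_3"])
```

If the shape reports all 72 columns, only the printed representation is shortened, which points to the display options rather than to the groupby or describe call.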

Answer 1

I am not sure this is the best technical approach to the problem, but this is what I did to avoid the truncation:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('breast-cancer-wisconsin.data.txt')

pd.options.display.max_columns = 999

print(df.groupby('class').describe())

Which returns the correct output:

      bland_chrom                                                \
            count      mean       std  min  25%  50%  75%   max   
class                                                             
2           458.0  2.100437  1.080339  1.0  1.0  2.0  3.0   7.0   
4           241.0  5.979253  2.273852  1.0  4.0  7.0  7.0  10.0   

      clump_thickness                                                    id  \
                count      mean       std  min  25%  50%   75%   max  count   
class                                                                         
2               458.0  2.956332  1.674318  1.0  1.0  3.0   4.0   8.0  458.0   
4               241.0  7.195021  2.428849  1.0  5.0  8.0  10.0  10.0  241.0   

                                                                               \
               mean            std      min         25%        50%        75%   
class                                                                           
2      1.107591e+06  723431.757966  61634.0  1002614.25  1180170.5  1256870.5   
4      1.003505e+06  322232.308608  63375.0   832226.00  1126417.0  1221863.0   

                  marg_adhesion                                                \
              max         count      mean       std  min  25%  50%  75%   max   
class                                                                           
2      13454352.0         458.0  1.364629  0.996830  1.0  1.0  1.0  1.0  10.0   
4       1371026.0         241.0  5.547718  3.210465  1.0  3.0  5.0  8.0  10.0   

      mitoses                                               norm_nucleoli  \
        count      mean       std  min  25%  50%  75%   max         count   
class                                                                       
2       458.0  1.063319  0.501995  1.0  1.0  1.0  1.0   8.0         458.0   
4       241.0  2.589212  2.557939  1.0  1.0  1.0  3.0  10.0         241.0   

                                                     single_epith_cell_size  \
           mean       std  min  25%  50%   75%   max                  count   
class                                                                         
2      1.290393  1.058856  1.0  1.0  1.0   1.0   9.0                  458.0   
4      5.863071  3.350672  1.0  3.0  6.0  10.0  10.0                  241.0   

                                                    unif_cel_shape            \
           mean       std  min  25%  50%  75%   max          count      mean   
class                                                                          
2      2.120087  0.917130  1.0  2.0  2.0  2.0  10.0          458.0  1.443231   
4      5.298755  2.451606  1.0  3.0  5.0  6.0  10.0          241.0  6.560166   

                                          unif_cel_size                      \
            std  min  25%  50%  75%   max         count      mean       std   
class                                                                         
2      0.997836  1.0  1.0  1.0  1.0   8.0         458.0  1.325328  0.907694   
4      2.562045  1.0  4.0  6.0  9.0  10.0         241.0  6.572614  2.719512   


       min  25%  50%   75%   max  
class                             
2      1.0  1.0  1.0   1.0   9.0  
4      1.0  4.0  6.0  10.0  10.0
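
As a possible variation on the approach above, the column limit can be lifted only temporarily, or the frame can be rendered with to_string(), which does not depend on the display options. This is a sketch under assumptions: the synthetic DataFrame and the names demo and feature_0 through feature_8 are hypothetical stand-ins for the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 9 numeric columns plus a 'class' label.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.integers(1, 11, size=(20, 9)),
    columns=[f"feature_{i}" for i in range(9)],
)
demo["class"] = [2, 4] * 10

summary = demo.groupby("class").describe()

# Option 1: remove the column limit only inside this block.
# None means "no limit"; the previous setting is restored on exit.
with pd.option_context("display.max_columns", None):
    print(summary)

# Option 2: build the full text explicitly. to_string() renders
# every column, whatever display.max_columns is set to.
full_text = summary.to_string()
print(full_text)
```

Compared with assigning pd.options.display.max_columns globally, option_context leaves the display settings unchanged for the rest of the session, which may be preferable in a longer script or notebook.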

python

truncation

pandas

pandas-groupby