Properly handling missing values and formatting for a pandas DataFrame printed with tabulate

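In the scenario below, I want to print missing values as "-" and, at the same time, apply a specific number format to each column.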

Problem

I only seem to be able to achieve one of these goals. With the code below I get the formatting I want, but missing values are printed as nan:

"""Handling Missing Data in Pandas / Tabulate
"""

import pandas as pd
from tabulate import tabulate
import seaborn as sns
import numpy as np

# Create sample data
iris_data = sns.load_dataset('iris')
# Derive summary table
iris_summary = iris_data.describe(percentiles=[]).transpose()
# Add missing values
iris_summary.iloc[0, 1:6] = None


# Show missing data
print(tabulate(iris_summary, missingval="-",
               floatfmt=(".0f", ".0f", ".3f", ".1f", ".4f", ".1f", ".0f")))

Result

------------  ---  -------  -----  --------  -----  ---
sepal_length  150  nan      nan    nan       nan    nan
sepal_width   150    3.057    0.4    2.0000    3.0    4
petal_length  150    3.758    1.8    1.0000    4.3    7
petal_width   150    1.199    0.8    0.1000    1.3    2
------------  ---  -------  -----  --------  -----  ---

Attempt 1

I tried replacing the missing values:

iris_summary.replace(np.nan, "", inplace=True)

But the result is unsatisfactory, because the number formatting is lost:

------------  ---  ------------------  ------------------  ---  ----  ---
sepal_length  150
sepal_width   150  3.0573333333333337  0.4358662849366982  2.0  3.0   4.4
petal_length  150  3.7580000000000005  1.7652982332594662  1.0  4.35  6.9
petal_width   150  1.1993333333333336  0.7622376689603465  0.1  1.3   2.5
------------  ---  ------------------  ------------------  ---  ----  ---

Desired result

I would like to arrive at a table that looks like the following:

------------  ---  -------  -----  --------  -----  ---
sepal_length  150    -        -      -         -      -
sepal_width   150    3.057    0.4    2.0000    3.0    4
petal_length  150    3.758    1.8    1.0000    4.3    7
petal_width   150    1.199    0.8    0.1000    1.3    2
------------  ---  -------  -----  --------  -----  ---

Notes

Use replace to turn NaN into None before printing:

print(tabulate(iris_summary.replace(np.nan, None), missingval='-',
               floatfmt=(".0f", ".0f", ".3f", ".1f", ".4f", ".1f", ".0f")))

Output:

------------  ---  -----  ---  ------  ---  -
sepal_length  150  -      -    -       -    -
sepal_width   150  3.057  0.4  2.0000  3.0  4
petal_length  150  3.758  1.8  1.0000  4.3  7
petal_width   150  1.199  0.8  0.1000  1.3  2
------------  ---  -----  ---  ------  ---  -

I believe missingval applies only to None values, but pandas converts None to NaN in any column that has a float dtype, so you have to explicitly replace NaN with None to get the expected output.
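
The dtype behaviour described above can be checked on a small frame. This is only a minimal sketch: the two-row `summary` frame and its values are made up for illustration, and the `astype(object).where(...)` idiom is an alternative way of getting real None values into the cells, not something taken from the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row stand-in for the summary table.
# A numeric column built with None ends up as float64, with NaN in its place.
summary = pd.DataFrame(
    {"mean": [None, 3.057], "std": [None, 0.436]},
    index=["sepal_length", "sepal_width"],
)
mean_is_float = summary["mean"].dtype == np.float64
stored = summary.iloc[0, 0]
stored_is_nan = bool(np.isnan(stored))  # the None has become a float NaN

# Cast to object first so the cells are allowed to hold None,
# then keep the non-missing cells and put None everywhere else.
as_object = summary.astype(object).where(summary.notna(), None)
cell_is_none = as_object.iloc[0, 0] is None

print(mean_is_float, stored_is_nan, cell_is_none)
```

Passing `as_object` to tabulate with missingval="-" and the same floatfmt tuple should then behave like the replace call above. It may be the safer choice if you need to support several pandas versions, since the handling of an explicit None value in replace has changed between releases.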