How to add a DataFrame with indexes using python-docx

Question

I know similar questions have already been answered here. However, I hope this one is different.

I have used value_counts() to generate a DataFrame, as follows:

import pandas as pd
import seaborn as sns

df = sns.load_dataset('tips')

object_cols = list(df.select_dtypes(exclude=['int', 'float', 'int64', 'float64', 'int32', 'float32']).columns)

# Value Count & Percentage for object columns
c = df[object_cols].apply(lambda x: x.value_counts()).T.stack().astype(int)
p = (df[object_cols].apply(lambda x: x.value_counts(normalize=True)).T.stack() * 100).round(2)
cp = pd.concat([c,p], axis=1, keys=['Count', 'Percentage %'])

cp

The DataFrame looks like this:

                 Count  Percentage %
sex    Female       87         35.66
       Male        157         64.34
smoker No          151         61.89
       Yes          93         38.11
day    Fri          19          7.79
       Sat          87         35.66
       Sun          76         31.15
       Thur         62         25.41
time   Dinner      176         72.13
       Lunch        68         27.87

I am trying to add the above DataFrame to a document as a table using python-docx:

from docx import Document

doc = Document()
doc.add_paragraph("Value Counts: ")

t = doc.add_table(cp.shape[0]+1, cp.shape[1])

# Set table style
t.style = 'Colorful List Accent 1'

# add the header rows.
for j in range(cp.shape[-1]):
    t.cell(0,j).text = cp.columns[j]

# add the rest of the data frame
for i in range(cp.shape[0]):
    for j in range(cp.shape[-1]):
        t.cell(i+1,j).text = str(cp.values[i,j])
        
filename = "output/ValueCOunts_Report.docx"
# save the docx
doc.save(filename)

I am able to add the table, but it comes out without the index:

Count   Percentage %
87      35.66
157     64.34
151     61.89
.....
.....
.....

How can I add the complete DataFrame, including its indexes, to the document as a table?

Answer 1

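This is a slightly hacky solution, because it moves the index into regular columns and then manipulates those columns so that they look like an index: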

Reset the index, then use Series.duplicated and np.where to replace the repeated values in the column with blanks:

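import numpy as np

# Name the two index levels and turn them into ordinary columns
cp = cp.rename_axis(['Attr','Val']).reset_index()
# Blank out every repeat of an attribute name, keeping only its first occurrence
cp['Attr'] = np.where(cp['Attr'].duplicated(),'',cp['Attr'])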
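To see what those two lines do in isolation, here is a minimal sketch on a small stand-in for cp. It assumes a two-level index like the one in the question and reuses the first four rows of the question's table; the variable name flat is only illustrative.

```python
import numpy as np
import pandas as pd

# Small stand-in for the `cp` frame from the question (first four rows only).
index = pd.MultiIndex.from_tuples(
    [("sex", "Female"), ("sex", "Male"), ("smoker", "No"), ("smoker", "Yes")]
)
cp = pd.DataFrame(
    {"Count": [87, 157, 151, 93], "Percentage %": [35.66, 64.34, 61.89, 38.11]},
    index=index,
)

# Name the two index levels, then move them into ordinary columns.
flat = cp.rename_axis(["Attr", "Val"]).reset_index()

# Keep the first occurrence of each attribute and blank out the repeats,
# which mimics the way pandas displays a MultiIndex.
flat["Attr"] = np.where(flat["Attr"].duplicated(), "", flat["Attr"])

print(flat.to_string(index=False))
```

Note that duplicated() looks at the whole column, so this only works as intended when equal attribute names sit next to each other, as they do here.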

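Running your code after that produces a table whose first two columns hold the former index values (Attr and Val), followed by Count and Percentage %.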
