Transform a dataframe using pivot

Question

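I am trying to transform a dataframe using pivot. Since the column contains duplicate entries, I tried adding a count column, following the suggestion under Question 10 of this answer.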

import pandas as pd
from pprint import pprint


if __name__ == '__main__':
    d = {
        't': [0, 1, 2, 0, 1, 2, 0, 2, 0, 1],
        'input': [2, 2, 2, 2, 2, 2, 4, 4, 4, 4],
        'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'B', 'B'],
        'value': [0.1, 0.2, 0.3, 1, 2, 3, 1, 2, 1, 1],
    }
    df = pd.DataFrame(d)
    df = df.drop('t', axis=1)
    df.insert(0, 'count', df.groupby('input').cumcount())
    pd.pivot(df, index='count', columns='type', values='value')

But I still get the same error:

raise ValueError("Index contains duplicate entries, cannot reshape")
ValueError: Index contains duplicate entries, cannot reshape

Can anyone suggest how to resolve this error?

Answer 1

As long as more than one value is associated with the same index entry for 'A' or 'B', you have to aggregate those values in some way.
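To see where the collisions come from, it may help to list the rows that share a (count, type) pair in the frame built in the question. The sketch below assumes the same data and the same `groupby('input').cumcount()` counter; the name `clashes` is only illustrative. Because the counter restarts for each `input` but `input` is not part of the pivot index, the same counter value shows up once per `input` for a given type.

```python
import pandas as pd

d = {
    't': [0, 1, 2, 0, 1, 2, 0, 2, 0, 1],
    'input': [2, 2, 2, 2, 2, 2, 4, 4, 4, 4],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'B', 'B'],
    'value': [0.1, 0.2, 0.3, 1, 2, 3, 1, 2, 1, 1],
}

# Rebuild the frame exactly as in the question.
df = pd.DataFrame(d).drop('t', axis=1)
df.insert(0, 'count', df.groupby('input').cumcount())

# keep=False flags every row whose (count, type) pair occurs more than once;
# these are the rows that pivot cannot place into a single cell.
clashes = df[df.duplicated(['count', 'type'], keep=False)]
print(clashes)
```

Any row printed here competes with at least one other row for the same cell, which is why `pivot` refuses to reshape.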

So, if I understood your question correctly, a possible solution is the following:

#pip install pandas

import pandas as pd

d = {
        't': [0, 1, 2, 0, 1, 2, 0, 2, 0, 1],
        'input': [2, 2, 2, 2, 2, 2, 4, 4, 4, 4],
        'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'B', 'B'],
        'value': [0.1, 0.2, 0.3, 1, 2, 3, 1, 2, 1, 1],
    }

df = pd.DataFrame(d)
df

# aggfunc='sum' is used here only as an example; the default is 'mean'
pd.pivot_table(df, index='t', columns='type', values='value', aggfunc='sum')

Returns

type    A    B
t
0     1.1  2.0
1     0.2  3.0
2     2.3  3.0
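If the goal is to keep every value rather than aggregate, one possible alternative is to make the pivot index unique. This is a sketch, not part of the original answer: it assumes the counter should restart for each (input, type) pair, and that `input` should be kept in the index alongside it. Passing a list as `index` to `pivot` assumes pandas 1.1 or later; `wide` is an illustrative name.

```python
import pandas as pd

d = {
    't': [0, 1, 2, 0, 1, 2, 0, 2, 0, 1],
    'input': [2, 2, 2, 2, 2, 2, 4, 4, 4, 4],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'B', 'B'],
    'value': [0.1, 0.2, 0.3, 1, 2, 3, 1, 2, 1, 1],
}

df = pd.DataFrame(d).drop('t', axis=1)

# Number the rows within each (input, type) group, so that
# (input, count, type) identifies exactly one row.
df['count'] = df.groupby(['input', 'type']).cumcount()

# With a unique (input, count) index per type, pivot needs no aggregation.
wide = df.pivot(index=['input', 'count'], columns='type', values='value')
print(wide)
```

Here each original value lands in its own cell, so nothing is summed or averaged. Whether this layout or the aggregated `pivot_table` result is preferable depends on what the reshaped table is meant to show.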
