Is there a way to update pandas.pivot_table without reconstructing it?

Question

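I have a table of students with student_id, course_id and exam_time (10k rows). I pivot on student_id and exam_time to get the number of exams each student has in a single slot or on a single day. I am building a timetabling heuristic that changes the time of one exam at a time, so I need to update this pivot table many times. A change to one course's exam time affects 50 rows of the original dataframe on average. Is there a way to update the resulting pivot table without recalculating the whole thing in pandas, or should I keep track of the changes on the pivot table myself (i.e. by adding and subtracting 1 in the changed slots)?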

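Edit: Here is how I construct the pivot table. I added a column of ones so that np.sum counts the rows in each cell. I couldn't find another function that runs faster.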

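import numpy as np
import pandas as pd

# Helper column of ones: summing it counts the rows in each (student, slot) cell.
sLength = len(df["student_id"])
df["ones"] = pd.Series(np.ones(sLength), index=df.index)
# Older pandas releases spelled these keywords rows= and cols=;
# current releases use index= and columns=.
pivot_table = pd.pivot_table(df, index="student_id", columns="exam_time", values="ones", aggfunc=np.sum)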
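As a side note on the "column of ones" trick: a count table can also be built without the helper column. The following is only a minimal sketch on made-up toy data (the values are hypothetical, only the column names come from the question), and it makes no claim about being faster on the real 10k-row table.

```python
import pandas as pd

# Hypothetical toy data with the same column names as in the question.
df = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 3],
    "course_id":  [10, 11, 10, 12, 11],
    "exam_time":  [0, 1, 0, 1, 1],
})

# Count rows per (student_id, exam_time) directly; no "ones" column needed.
counts = pd.crosstab(df["student_id"], df["exam_time"])

# Same counts via groupby: size() counts rows, unstack() turns exam_time into columns.
counts_gb = df.groupby(["student_id", "exam_time"]).size().unstack(fill_value=0)

print(counts)
```

Both variants fill empty cells with 0 rather than NaN, which matters if the table is later updated with += 1 and -= 1.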

For a change in exam time, I wrote this (assuming changed_course moves from old_slot to new_slot):

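# Students enrolled in the course whose exam is being moved.
affected_students = df.loc[df["course_id"] == changed_course, "student_id"].to_numpy()
# Use a single .loc call; chained indexing (pivot_table[col][rows]) may modify a copy.
pivot_table.loc[affected_students, old_slot] -= 1
pivot_table.loc[affected_students, new_slot] += 1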
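To make the "track the changes yourself" idea concrete, here is a small self-contained sketch that applies the plus/minus 1 update in place and compares it with a full rebuild. The toy data, the `slots` list and the `move_exam` helper are all hypothetical. It assumes every possible slot is known up front and that empty cells hold 0; a pivot table built without `fill_value` holds NaN in empty cells, and NaN + 1 stays NaN.

```python
import pandas as pd

# Hypothetical toy data: each course has exactly one exam slot.
df = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 3],
    "course_id":  [10, 11, 10, 12, 11],
    "exam_time":  [0, 1, 0, 1, 1],
})
slots = [0, 1, 2]  # assumption: all possible exam slots are known in advance

# Build the count table once, with a column for every slot (even empty ones).
counts = pd.crosstab(df["student_id"], df["exam_time"]).reindex(columns=slots, fill_value=0)

def move_exam(df, counts, course, new_slot):
    """Move `course` to `new_slot`, updating both df and counts in place."""
    mask = df["course_id"] == course
    old_slot = df.loc[mask, "exam_time"].iloc[0]
    # Assumes a student appears at most once per course, so labels are unique.
    students = df.loc[mask, "student_id"].to_numpy()
    counts.loc[students, old_slot] -= 1
    counts.loc[students, new_slot] += 1
    df.loc[mask, "exam_time"] = new_slot

move_exam(df, counts, course=11, new_slot=2)

# Sanity check against a full rebuild from the modified dataframe.
rebuilt = pd.crosstab(df["student_id"], df["exam_time"]).reindex(columns=slots, fill_value=0)
print((counts.values == rebuilt.values).all())
```

Only the rows of the affected students and two columns are touched per move, so the cost scales with the roughly 50 affected rows rather than with the whole table.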

Answer 1

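Here is some sample code. The idea is to update the total pivot table by subtracting the pivot table of the old rows and adding the pivot table of the new rows.

So every time the data changes, you call pivot_table() twice, sub() once and add() once: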

import numpy as np
import pandas as pd

### create random data
N = 1000
a = np.random.randint(0, 100, N)
b = np.random.randint(0, 30, N)
c = np.random.randint(0, 10, N)

df = pd.DataFrame({"a":a, "b":b, "c":c})

### calculate pivot sum
res = df.pivot_table(values="c", index="a", columns="b", aggfunc="sum", fill_value=0)

### create random rows to change
M = 100
row_index = np.unique(np.random.randint(0, N, M))
old_rows = df.iloc[row_index]
M = old_rows.shape[0]
new_rows = pd.DataFrame({"a":np.random.randint(0, 100, M), 
                         "b":np.random.randint(0, 30, M),
                         "c":np.random.randint(0, 10, M)})

### update pivot table
sub_df = old_rows.pivot_table(values="c", index="a", columns="b", aggfunc="sum", fill_value=0)
add_df = new_rows.pivot_table(values="c", index="a", columns="b", aggfunc="sum", fill_value=0)
new_res = res.sub(sub_df, fill_value=0).add(add_df, fill_value=0)

### check result
df.iloc[row_index] = new_rows.values
res2 = df.pivot_table(values="c", index="a", columns="b", aggfunc="sum", fill_value=0)
# print is a function in Python 3
print(new_res.astype(int).equals(res2))
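The same subtract/add idea can be applied to the exam counts from the question. This is a rough sketch under assumptions: `count_pivot` and `apply_delta` are hypothetical helper names, the toy data is made up, and `pd.crosstab` stands in for the pivot of ones. Note that `sub()` and `add()` with `fill_value=0` still produce NaN where a cell is missing on both sides, so the sketch fills those with 0 before casting to int.

```python
import pandas as pd

def count_pivot(frame):
    # Number of rows per (student_id, exam_time) cell.
    return pd.crosstab(frame["student_id"], frame["exam_time"])

def apply_delta(total, old_rows, new_rows):
    """Return total - pivot(old_rows) + pivot(new_rows) as integer counts."""
    updated = (total.sub(count_pivot(old_rows), fill_value=0)
                    .add(count_pivot(new_rows), fill_value=0))
    # Cells absent from both operands come out as NaN; treat them as 0.
    return updated.fillna(0).astype(int)

# Hypothetical toy data: each course has exactly one exam slot.
df = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 3],
    "course_id":  [10, 11, 10, 12, 11],
    "exam_time":  [0, 1, 0, 1, 1],
})
total = count_pivot(df)

# Move course 11 from slot 1 to slot 2.
mask = df["course_id"] == 11
old_rows = df[mask]
new_rows = old_rows.assign(exam_time=2)
total = apply_delta(total, old_rows, new_rows)
df.loc[mask, "exam_time"] = 2

# Compare with a full rebuild.
rebuilt = count_pivot(df)
print((total.values == rebuilt.values).all())
```

One difference from a rebuild: the incremental table keeps rows or columns whose counts have dropped to zero, whereas a fresh pivot would omit them, so an exact `equals()` comparison may need a `reindex` first.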

python

pivot-table

pandas