Creating a new column based on the difference of another column

Question

I have the following DataFrame:

d = {'col1': ["A", "A", "A", "B", "B", "B"], 'col2': [2015, 2016, 2017, 2015, 2016, 2017], 'col3': [10, 20, 25, 10, 12, 14]}

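I want to get the difference in col3 by col1 and col2. Here col1 is the company, col2 is the year, and col3 is the stock price. I am trying to get the year-over-year difference in stock price for each company, so the output should be: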

d2 = {'col4': ['nan', 10, 5, 'nan', 2, 2]}

Thanks in advance for your suggestions.

Note: we cannot reindex the DataFrame, and it has many other columns (col5, col6, col7, and so on).

Answer 1

groupby + diff:

df.groupby('col1').col3.transform('diff')

0     NaN
1    10.0
2     5.0
3     NaN
4     2.0
5     2.0
Name: col3, dtype: float64
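
To store the result as a new column, something like the following sketch should work. It assumes the dictionary `d` from the question is loaded into a DataFrame named `df`, and that the new column is called `col4` as in the expected output. The `sort_values` step is an addition not in the answer above: `diff` works on row order, so sorting by company and year guards against rows that are not already in chronological order. The result keeps the original index labels, so the assignment aligns back to `df` without reindexing it and leaves any other columns untouched.

```python
import pandas as pd

d = {
    'col1': ["A", "A", "A", "B", "B", "B"],
    'col2': [2015, 2016, 2017, 2015, 2016, 2017],
    'col3': [10, 20, 25, 10, 12, 14],
}
df = pd.DataFrame(d)

# Sort a temporary copy by company and year, then take the difference
# between consecutive rows within each company. The first row of each
# company has no previous year, so it becomes NaN.
df['col4'] = (
    df.sort_values(['col1', 'col2'])
      .groupby('col1')['col3']
      .diff()
)

print(df)
```

If the rows are already ordered by year within each company, `df.groupby('col1')['col3'].diff()` alone gives the same values.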
