Create a column based on diff values from another column

I have a dataframe like this:

|    | Vowel   |   Number |
|---:|:--------|---------:|
|  0 | a       |        2 |
|  1 | b       |        3 |
|  2 | c       |        4 |
|  3 | a       |        4 |
|  4 | a       |        8 |
|  5 | b       |        2 |
|  6 | c       |        5 |
|  7 | c       |        9 |

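I want to create a column holding the difference between each row's `Number` and the previous `Number` for the same `Vowel`. This is the output I want: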

|    | Vowel   |   Number |   Diff |
|---:|:--------|---------:|-------:|
|  0 | a       |        2 |    nan |
|  1 | b       |        3 |    nan |
|  2 | c       |        4 |    nan |
|  3 | a       |        4 |      2 |
|  4 | a       |        8 |      4 |
|  5 | b       |        2 |     -1 |
|  6 | c       |        5 |      1 |
|  7 | c       |        9 |      4 |

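So, looking at the value 'a' in the `Vowel` column: the first 'a' gets nan, because there is no earlier 'a' row to take a `Number` from. The second 'a' gets 2, because 4 - 2 = 2 (values from the `Number` column).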

I'm doing something like this:

for i in list(set(df['Vowel'])):
    one_vowel = df[df['Vowel'] == i]
    for n in one_vowel['Number'].diff():
        print(f'{i} and {n}')

Result:

b and nan
b and -1.0
a and nan
a and 2.0
a and 4.0
c and nan
c and 1.0
c and 4.0

But I want the values in the original row order of the dataframe, so they can go into a column.

Can anyone help me?

Try this:

df['Diff'] = df.groupby('Vowel')['Number'].diff()

Output of `df['Diff']`:

0    NaN
1    NaN
2    NaN
3    2.0
4    4.0
5   -1.0
6    1.0
7    4.0
Name: Diff, dtype: float64
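
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the dataframe from the question and applies the same `groupby(...).diff()` call. The dataframe construction is my own reconstruction of the table above, not code from the question. The grouped `diff()` returns a Series that keeps the original index, which is why the assignment lands each value on the right row without any manual reordering.

```python
import pandas as pd

# Rebuild the example dataframe from the question (reconstructed from the table)
df = pd.DataFrame({
    "Vowel": ["a", "b", "c", "a", "a", "b", "c", "c"],
    "Number": [2, 3, 4, 4, 8, 2, 5, 9],
})

# Difference from the previous Number within each Vowel group.
# The result is indexed like df, so row order is preserved on assignment.
df["Diff"] = df.groupby("Vowel")["Number"].diff()

print(df)
```

The `Diff` column comes out as `float64` because the first row of each group has no predecessor and is filled with `NaN`.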