Groupby and rank based on the strings in one column

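I'm working with a dataframe that contains more than 70 actions, and I have a column that groups those actions. I want to create a new column that holds the rank of the strings in an existing column. Example dataframe below: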

import pandas as pd

DF = pd.DataFrame()
DF['template'] = ['Attk','Attk','Attk','Attk','Attk','Attk','Def','Def','Def','Def','Def','Def','Accuracy','Accuracy','Accuracy','Accuracy','Accuracy','Accuracy']
DF['Stats'] = ['Goal','xG','xA','Goal','xG','xA','Block','interception','tackles','Block','interception','tackles','Acc.passes','Acc.actions','Acc.crosses','Acc.passes','Acc.actions','Acc.crosses']
# Sort by template, then alphabetically by Stats within each template
DF = DF.sort_values(['template','Stats'])

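The new column should group by 'template' and rank the values of 'Stats' alphabetically within each group.

The expected dataframe looks like this:

I have 10 to 15 stats under each template.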

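Use GroupBy.transform with a lambda function and factorize. factorize numbers the values in order of first appearance, so this relies on DF already being sorted by template and Stats; 1 is added because the codes start at 0: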

# factorize returns (codes, uniques); keep only the integer codes
f = lambda x: pd.factorize(x)[0]
DF['Order'] = DF.groupby('template')['Stats'].transform(f) + 1
print(DF)
    template         Stats  Order
13  Accuracy   Acc.actions      1
16  Accuracy   Acc.actions      1
14  Accuracy   Acc.crosses      2
17  Accuracy   Acc.crosses      2
12  Accuracy    Acc.passes      3
15  Accuracy    Acc.passes      3
0       Attk          Goal      1
3       Attk          Goal      1
2       Attk            xA      2
5       Attk            xA      2
1       Attk            xG      3
4       Attk            xG      3
6        Def         Block      1
9        Def         Block      1
7        Def  interception      2
10       Def  interception      2
8        Def       tackles      3
11       Def       tackles      3
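
If the dataframe cannot be sorted beforehand, a possible variant is to pass sort=True to factorize, so the codes follow the sorted order of the unique values instead of the order of appearance. The following is only a minimal sketch: the small unsorted dataframe and the helper name rank_sorted are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical, deliberately unsorted sample (not the data from the question)
df = pd.DataFrame({
    'template': ['Def', 'Attk', 'Def', 'Attk', 'Attk', 'Def'],
    'Stats': ['tackles', 'xG', 'Block', 'Goal', 'xA', 'Block'],
})

def rank_sorted(s):
    # sort=True numbers the unique values in sorted order,
    # so the result does not depend on the row order; +1 makes it 1-based
    return pd.factorize(s, sort=True)[0] + 1

df['Order'] = df.groupby('template')['Stats'].transform(rank_sorted)
print(df)
```

Note that the ordering of plain strings is case-sensitive, so uppercase names such as 'Block' come before lowercase ones such as 'interception'.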