R: select rows by group and split them into new ranked sets

My dataset looks like this:

Interest    Age     Gender  Scored.Probabilities
AL008       18-24   male    0.211
AL024       25-34   male    0.022
AL008       35-44   female  0.102
AL008       25-34   female  0.002
AL024       13-17   male    0.102
AL035       35-44   female  0.027
AL024       35-44   female  0.051
AL024       55-64   male    0.025
AL024       35-44   male    0.016
AL034       45-54   male    0.021
AL036       35-44   male    0.082

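I want to group the rows that share the same 'Interest' value into sets and create a new dataset in which each set is ranked by 'Scored.Probabilities' in descending order:
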
Set         Interest    Age     Gender  Scored.Probabilities    rank
1           AL008       18-24   male    0.211                    1
1           AL008       35-44   female  0.102                    2
1           AL008       25-34   female  0.002                    3
2           AL024       13-17   male    0.102                    1
2           AL024       35-44   female  0.051                    2
2           AL024       55-64   male    0.025                    3
2           AL024       25-34   male    0.022                    4
2           AL024       35-44   male    0.016                    5
3           AL034       45-54   male    0.021                    1
4           AL035       35-44   female  0.027                    1
5           AL036       35-44   male    0.082                    1

Try this using dplyr:

library("dplyr")
df <- read.table(text = "Interest    Age     Gender  Scored.Probabilities
AL008       18-24   male    0.211
AL024       25-34   male    0.022
AL008       35-44   female  0.102
AL008       25-34   female  0.002
AL024       13-17   male    0.102
AL035       35-44   female  0.027
AL024       35-44   female  0.051
AL024       55-64   male    0.025
AL024       35-44   male    0.016
AL034       45-54   male    0.021
AL036       35-44   male    0.082" , header = TRUE)

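df %>%
  # sort by Interest, then by descending probability within each Interest
  arrange(Interest , desc(Scored.Probabilities)) %>%
  group_by(Interest) %>%
  # rank = position of the row within its Interest group
  mutate(rank = row_number())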
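
The pipeline above adds 'rank' but not the 'Set' column shown in the desired output. One possible way to add it, as a minimal sketch assuming dplyr 1.0.0 or later (where cur_group_id() is available), is shown here. The name 'ranked' is only illustrative, and the final select() is just there to move 'Set' to the front.

```r
library(dplyr)

df <- read.table(text = "Interest    Age     Gender  Scored.Probabilities
AL008       18-24   male    0.211
AL024       25-34   male    0.022
AL008       35-44   female  0.102
AL008       25-34   female  0.002
AL024       13-17   male    0.102
AL035       35-44   female  0.027
AL024       35-44   female  0.051
AL024       55-64   male    0.025
AL024       35-44   male    0.016
AL034       45-54   male    0.021
AL036       35-44   male    0.082", header = TRUE)

ranked <- df %>%
  arrange(Interest, desc(Scored.Probabilities)) %>%
  group_by(Interest) %>%
  mutate(Set  = cur_group_id(),   # index of the current Interest group
         rank = row_number()) %>% # position within the group after sorting
  ungroup() %>%
  select(Set, everything())       # put Set first, keep the other columns

print(ranked)
```

Note that row_number() breaks ties arbitrarily by row order; if tied probabilities should share a rank, min_rank(desc(Scored.Probabilities)) may be a better fit.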

You can try data.table:

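 library(data.table)
 # df1 is the data frame from the question.
 # Step 1: within each Interest, number rows by descending probability (rank).
 # Step 2: number the Interest groups in sorted order (Set).
 # Step 3: return the rows ordered by Interest and rank.
 setDT(df1)[order(-Scored.Probabilities), rank := seq_len(.N), Interest][
            order(Interest), Set := .GRP, Interest][order(Interest, rank)]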
 #     Interest   Age Gender Scored.Probabilities rank Set
 #1:    AL008 18-24   male                0.211    1   1
 #2:    AL008 35-44 female                0.102    2   1
 #3:    AL008 25-34 female                0.002    3   1
 #4:    AL024 13-17   male                0.102    1   2
 #5:    AL024 35-44 female                0.051    2   2
 #6:    AL024 55-64   male                0.025    3   2
 #7:    AL024 25-34   male                0.022    4   2
 #8:    AL024 35-44   male                0.016    5   2
 #9:    AL034 45-54   male                0.021    1   3
#10:    AL035 35-44 female                0.027    1   4
#11:    AL036 35-44   male                0.082    1   5
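
For comparison, the same result can probably be reached without any packages. This is only a sketch in base R, not part of either answer above. The name 'out' is illustrative, and it assumes that sorting 'Interest' alphabetically gives the desired set order.

```r
df <- read.table(text = "Interest    Age     Gender  Scored.Probabilities
AL008       18-24   male    0.211
AL024       25-34   male    0.022
AL008       35-44   female  0.102
AL008       25-34   female  0.002
AL024       13-17   male    0.102
AL035       35-44   female  0.027
AL024       35-44   female  0.051
AL024       55-64   male    0.025
AL024       35-44   male    0.016
AL034       45-54   male    0.021
AL036       35-44   male    0.082", header = TRUE, stringsAsFactors = FALSE)

# Sort by Interest, then by descending probability
out <- df[order(df$Interest, -df$Scored.Probabilities), ]

# Set: integer code of each Interest value (factor levels are sorted)
out$Set <- as.integer(factor(out$Interest))

# rank: running position of each row within its Interest group
out$rank <- ave(out$Scored.Probabilities, out$Interest, FUN = seq_along)

# Reset row names and put the columns in the order shown in the question
rownames(out) <- NULL
out <- out[, c("Set", "Interest", "Age", "Gender", "Scored.Probabilities", "rank")]

print(out)
```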