Rank by multiple columns in R
I am trying to create a rank indicator based on two columns, in this case Account and DATE.
For example:
df <- data.frame(
Account = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3),
DATE = c(201901, 201902, 201903, 201904, 201902, 201903, 201904, 201905, 201906, 201907, 201904, 201905))
> df
Account DATE
1 201901
1 201902
1 201903
1 201904
2 201902
2 201903
2 201904
2 201905
2 201906
2 201907
3 201904
3 201905
I have tried rank and order, as well as rank(rank()) and order(order()), without success:
df <- df %>%
mutate("rank" = rank(Account, DATE))
Account DATE rank
1 201901 2.5
1 201902 2.5
1 201903 2.5
1 201904 2.5
2 201902 7.5
2 201903 7.5
2 201904 7.5
2 201905 7.5
2 201906 7.5
2 201907 7.5
3 201904 11.5
3 201905 11.5
What I want instead is to rank the dates in descending order within each account, so the result should look like this:
Account DATE RANK
1 201901 4
1 201902 3
1 201903 2
1 201904 1
2 201902 6
2 201903 5
2 201904 4
2 201905 3
2 201906 2
2 201907 1
3 201904 2
3 201905 1
library("dplyr")
df %>%
group_by(Account) %>%
mutate("rank" = rank(DATE))
#> # A tibble: 12 x 3
#> # Groups: Account [3]
#> Account DATE rank
#> <dbl> <dbl> <dbl>
#> 1 1 201901 1
#> 2 1 201902 2
#> 3 1 201903 3
#> 4 1 201904 4
#> 5 2 201902 1
#> 6 2 201903 2
#> 7 2 201904 3
#> 8 2 201905 4
#> 9 2 201906 5
#> 10 2 201907 6
#> 11 3 201904 1
#> 12 3 201905 2
Created on 2020-03-09 by the reprex package (v0.3.0.9001)
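The output above ranks DATE in ascending order within each account, whereas the question asks for the latest date to get rank 1. A minimal sketch of the descending variant, assuming dplyr's desc() helper is wrapped around DATE (the RANK column name and the `ranked` variable are only illustrative):

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  Account = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3),
  DATE = c(201901, 201902, 201903, 201904, 201902, 201903,
           201904, 201905, 201906, 201907, 201904, 201905)
)

ranked <- df %>%
  group_by(Account) %>%
  # desc() reverses the sort direction, so the latest DATE gets rank 1
  mutate(RANK = rank(desc(DATE))) %>%
  ungroup()

ranked$RANK
```

With the sample data this should reproduce the RANK column requested in the question. If ties are possible, rank() averages them by default; `ties.method = "min"` is one option if whole-number ranks are preferred.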
We can use order with decreasing = TRUE to create the rank:
library(dplyr)
df %>%
group_by(Account) %>%
mutate("rank" = order(DATE, decreasing = TRUE))
Output:
# A tibble: 12 x 3
# Groups: Account [3]
Account DATE rank
<dbl> <dbl> <int>
1 1 201901 4
2 1 201902 3
3 1 201903 2
4 1 201904 1
5 2 201902 6
6 2 201903 5
7 2 201904 4
8 2 201905 3
9 2 201906 2
10 2 201907 1
11 3 201904 2
12 3 201905 1
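One caveat about the order() approach: order() returns the positions that would sort the vector, not the rank of each element. The two coincide above only because DATE is already sorted ascending within each account. A small sketch with made-up, unsorted dates (the vector `x` is hypothetical) shows the difference, with rank(-x) as an alternative that does not depend on row order:

```r
# Hypothetical dates for a single account, deliberately not sorted
x <- c(201903, 201901, 201904, 201902)

# order() answers "which element comes first, second, ...?"
pos <- order(x, decreasing = TRUE)
pos  # 3 1 4 2

# rank() on the negated values answers "what is each element's rank?"
rk <- rank(-x)
rk   # 2 4 1 3
```

Here 201904 (third element) is the latest date, so its rank is 1, while order() puts 3 in the first slot because that is where the latest date lives.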
Here you go (DATE is negated so that the latest date gets rank 1):
df <- df %>% group_by(Account) %>% mutate(ranking = rank(-DATE))
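In base R:
# Row indices per account; split() matches values exactly, unlike grep()
sortdata <- split(seq_len(nrow(df)), df$Account)
for (i in sortdata) {
  # rank(-x) assigns 1 to the latest DATE within the account
  df[i, "RANK"] <- rank(-df[i, "DATE"])
}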
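For a loop-free base R version, ave() can apply rank() within each account. A minimal sketch, assuming the df from the question and negating DATE so the latest date ranks first (the RANK column name is illustrative):

```r
# Same data as in the question
df <- data.frame(
  Account = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3),
  DATE = c(201901, 201902, 201903, 201904, 201902, 201903,
           201904, 201905, 201906, 201907, 201904, 201905)
)

# ave() splits -DATE by Account, applies rank() to each piece,
# and returns the results in the original row order
df$RANK <- ave(-df$DATE, df$Account, FUN = rank)

df
```

Because ave() works on values rather than row positions, this does not require the rows to be sorted by DATE beforehand.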