Conditional term after the `~` in a `case_when` function
I want to put a conditional term after the `~` in a `case_when` function.
My example:
df:
df <- structure(list(x = c("a", "a", "a", "b", "b", "b", "c", "c",
"c", "a", "a", "a"), y = 1:12), class = "data.frame", row.names = c(NA,
-12L))
Code that does not work:
library(dplyr)
df %>%
group_by(x) %>%
mutate(y = case_when(x=="b" ~ cumsum(y),
TRUE ~ y)) %>%
mutate(y = case_when(x=="a" ~ "what I want: last value of group "b" in column y",
TRUE ~ y))
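In other words:
- group_by x
- calculate the cumsum for group b in column y
- take the last value (=15) of this group (=b) and
- put this value (=15) into column y for the rows where the group is a
Desired output: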
x y
<chr> <dbl>
1 a 15
2 a 15
3 a 15
4 b 4
5 b 9
6 b 15
7 c 7
8 c 8
9 c 9
10 a 15
11 a 15
12 a 15
Many thanks!!!
Just add ungroup() before the second mutate, then use last() with a condition to get the last y where x == "b":
library(dplyr)
df %>%
  group_by(x) %>%
  mutate(y = case_when(x == "b" ~ cumsum(y),
                       TRUE ~ y)) %>%
  # add the ungroup here
  ungroup() %>%
  # and then the value is like this
  mutate(y = case_when(x == "a" ~ last(y[x == "b"]),
                       TRUE ~ y))
#> # A tibble: 12 x 2
#> x y
#> <chr> <int>
#> 1 a 15
#> 2 a 15
#> 3 a 15
#> 4 b 4
#> 5 b 9
#> 6 b 15
#> 7 c 7
#> 8 c 8
#> 9 c 9
#> 10 a 15
#> 11 a 15
#> 12 a 15
Created on 2021-04-22 by the reprex package (v2.0.0)
In this case, group_by() is not necessary (although it may help with readability etc.):
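To see why the ungroup() matters, here is a minimal sketch that runs the second mutate both ways. The names step1, grouped and ungrouped are made up for illustration, and the data frame is rebuilt with rep() so the block runs on its own. While the data is still grouped by x, the subset y[x == "b"] is empty inside group a, so last() has nothing to pick and should fall back to a missing value.

```r
library(dplyr)

# Same data as in the question, rebuilt so this block is self-contained
df <- data.frame(
  x = rep(c("a", "b", "c", "a"), each = 3),
  y = 1:12
)

# First step from the answer: cumulative sum within group "b" only
step1 <- df %>%
  group_by(x) %>%
  mutate(y = case_when(x == "b" ~ cumsum(y),
                       TRUE ~ y))

# Still grouped: inside group "a" there are no rows with x == "b",
# so y[x == "b"] is empty and last() yields NA for those rows
grouped <- step1 %>%
  mutate(y = case_when(x == "a" ~ last(y[x == "b"]),
                       TRUE ~ y))

# Ungrouped: y[x == "b"] now looks at the whole column
ungrouped <- step1 %>%
  ungroup() %>%
  mutate(y = case_when(x == "a" ~ last(y[x == "b"]),
                       TRUE ~ y))

grouped$y[grouped$x == "a"]      # all missing
ungrouped$y[ungrouped$x == "a"]  # all 15
```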
df %>%
  mutate(y = case_when(x == "b" ~ cumsum(y * (x == "b")),
                       x == "a" ~ max(cumsum(y[x == "b"])),
                       TRUE ~ y))
x y
1 a 15
2 a 15
3 a 15
4 b 4
5 b 9
6 b 15
7 c 7
8 c 8
9 c 9
10 a 15
11 a 15
12 a 15
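One caveat about this version, shown as a small sketch with made-up values (vals and running are hypothetical names, not from the question): max(cumsum(...)) matches the last cumulative value here only because every y in group b is positive. If the group could contain negative numbers, taking the final element is probably the safer choice.

```r
# Hypothetical values with a negative entry, not taken from the question
vals <- c(4, 5, -6)

running <- cumsum(vals)               # running totals: 4, 9, 3

peak  <- max(running)                 # largest running total
final <- running[length(running)]     # last running total, same as sum(vals)

peak   # 9
final  # 3
```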