R - Create new column differentially based on another column
I have the following dataset:
ID year start_year
a 1 1
a 2 1
a 3 1
b 1 2
b 2 2
b 3 2
c 1 3
c 2 3
c 3 3
I want to create a new dummy column present that, for each ID, is 1-1-1 if start_year is 1, 0-1-1 if start_year is 2, and 0-0-1 if start_year is 3.
My goal is to obtain the following table:
ID year start_year present
a 1 1 1
a 2 1 1
a 3 1 1
b 1 2 0
b 2 2 1
b 3 2 1
c 1 3 0
c 2 3 0
c 3 3 1
I guess this should be easy for most of you, but I am really stuck.
Thank you very much for your help!
An easier option is to create a key/value list, then subset the list by the first element of 'start_year' for each 'ID' (assuming there are only 3 elements per group)
library(dplyr)
lst1 <- list(`1` = c(1, 1, 1), `2` = c(0, 1, 1), `3` = c(0, 0, 1))
df1 %>%
group_by(ID) %>%
mutate(present = lst1[[as.character(first(start_year))]]) %>%
ungroup()
-output
# A tibble: 9 × 4
ID year start_year present
<chr> <int> <int> <dbl>
1 a 1 1 1
2 a 2 1 1
3 a 3 1 1
4 b 1 2 0
5 b 2 2 1
6 b 3 2 1
7 c 1 3 0
8 c 2 3 0
9 c 3 3 1
data
df1 <- structure(list(ID = c("a", "a", "a", "b", "b", "b", "c", "c",
"c"), year = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), start_year = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L)), class = "data.frame", row.names = c(NA,
-9L))
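For readers who want to see what the grouped lookup does without dplyr, here is a minimal base R sketch of the same idea. It uses ave() in place of group_by() and mutate(), which is a substitution on my part and not part of the answer above. It rests on the same assumption: every ID has exactly 3 rows, ordered by year.

```r
# Lookup table keyed by start_year (as a character string), as in the answer
lst1 <- list(`1` = c(1, 1, 1), `2` = c(0, 1, 1), `3` = c(0, 0, 1))

# Same data as df1 above, rebuilt compactly
df1 <- data.frame(
  ID = rep(c("a", "b", "c"), each = 3),
  year = rep(1:3, times = 3),
  start_year = rep(1:3, each = 3)
)

# For each ID, take the first start_year, turn it into a key,
# and return the matching 3-element vector from the list.
# Assumption: exactly 3 rows per ID, already sorted by year.
df1$present <- ave(
  df1$start_year, df1$ID,
  FUN = function(sy) lst1[[as.character(sy[1])]]
)

print(df1)
```

If a group had more or fewer than 3 rows, the vector pulled from the list would no longer match the group length, so this approach is best kept for panels of fixed size.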
A possible approach:
library(tidyverse)
df <- tribble(
~ID, ~year, ~start_year,
"a", 1, 1,
"a", 2, 1,
"a", 3, 1,
"b", 1, 2,
"b", 2, 2,
"b", 3, 2,
"c", 1, 3,
"c", 2, 3,
"c", 3, 3
)
df |> mutate(present = if_else(start_year <= year, 1, 0))
#> # A tibble: 9 × 4
#> ID year start_year present
#> <chr> <dbl> <dbl> <dbl>
#> 1 a 1 1 1
#> 2 a 2 1 1
#> 3 a 3 1 1
#> 4 b 1 2 0
#> 5 b 2 2 1
#> 6 b 3 2 1
#> 7 c 1 3 0
#> 8 c 2 3 0
#> 9 c 3 3 1
Created on 2022-05-27 by the reprex package (v2.0.1)
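The comparison start_year <= year can also be written without any packages. The sketch below is a base R equivalent, not part of the answer above. It additionally assumes that an integer 0/1 column is acceptable in place of the double column produced by if_else(..., 1, 0).

```r
# Same data as in the answer, built with base R only
df <- data.frame(
  ID = rep(c("a", "b", "c"), each = 3),
  year = rep(1:3, times = 3),
  start_year = rep(1:3, each = 3)
)

# A row is "present" once year has reached start_year.
# The logical comparison is converted to 0/1 with as.integer().
df$present <- as.integer(df$start_year <= df$year)

print(df)
```

Because the comparison is row-wise, it does not depend on the number of rows per ID or on the row order, unlike the list-based lookup.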