How to conditionally merge two rows with multiple values and mutate in R?
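Fish are caught using different fishing methods.
I would like to merge rows based on Species (when they are the same fish species): if a species was caught by both the Bottom fishing and Trolling methods, the two rows should collapse into one row, with the Method value changed to Both.
For example, Caranx ignobilis would get a new Method value of Both. The Bait, Released, and Kept columns should also have their values on that same row.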
Species Method Bait Released Kept
4 Caranx ignobilis Both NA 1 1
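It looks simple, but I have been scratching my head for hours, playing around with case_when from the tidyverse package.
The tibble is the result of earlier subsetting of the data using group_by and pivot_wider.
The sample looks like this: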
# A tibble: 10 x 5
# Groups: Species [9]
Species Method Bait Released Kept
<chr> <fct> <int> <int> <int>
1 Aethaloperca rogaa Bottom fishing NA NA 2
2 Aprion virescens Bottom fishing NA NA 1
3 Balistidae spp. Bottom fishing NA NA 1
4 Caranx ignobilis Trolling NA NA 1
5 Caranx ignobilis Bottom fishing NA 1 NA
6 Epinephelus fasciatus Bottom fishing NA 3 NA
7 Epinephelus multinotatus Bottom fishing NA NA 5
8 Other species Bottom fishing NA 1 NA
9 Thunnus albacares Trolling NA NA 1
10 Variola louti Bottom fishing NA NA 1
Data:
fish_catch <- structure(list(Species = c("Aethaloperca rogaa", "Aprion virescens","Balistidae spp.", "Caranx ignobilis", "Caranx ignobilis", "Epinephelus fasciatus","Epinephelus multinotatus", "Other species", "Thunnus albacares","Variola louti"),
Method = structure(c(1L, 1L, 1L, 2L, 1L, 1L,1L, 1L, 2L, 1L), .Label = c("Bottom fishing", "Trolling"), class = "factor"),Bait = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,NA_integer_),
Released = c(NA, NA, NA, NA, 1L, 3L, NA, 1L,NA, NA),
Kept = c(2L, 1L, 1L, 1L, NA, NA, 5L, NA, 1L, 1L)), class = c("grouped_df","tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), groups = structure(list(Species = c("Aethaloperca rogaa", "Aprion virescens",
"Balistidae spp.","Caranx ignobilis", "Epinephelus fasciatus", "Epinephelus multinotatus","Other species", "Thunnus albacares", "Variola louti"), .rows = list(1L, 2L, 3L, 4:5, 6L, 7L, 8L, 9L, 10L)), row.names = c(NA,-9L), class = c("tbl_df", "tbl", "data.frame"), .drop = FALSE))
This is the route I was going down, but then I realized it does not take Species or the other columns into account:
mutate(Method = case_when(Method == "Bottom fishing" & Method == "Trolling" ~ "Both",
                          Method == "Bottom fishing" ~ "Bottom fishing",
                          Method == "Trolling" ~ "Trolling",
                          TRUE ~ as.character(Method)))
This should help you get started. You can add the other columns to the summarize function.
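library(tidyverse)
fish_catch %>%
  select(-Bait, -Released, -Kept) %>%
  group_by(Species) %>%
  # Concatenate the methods of each species in row order
  summarize(Method = paste0(Method, collapse = "")) %>%
  # The concatenated label depends on row order: Trolling comes first for Caranx ignobilis
  mutate(Method = fct_recode(Method, "both" = "TrollingBottom fishing"))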
# A tibble: 9 x 2
Species Method
<chr> <fct>
1 Aethaloperca rogaa Bottom fishing
2 Aprion virescens Bottom fishing
3 Balistidae spp. Bottom fishing
4 Caranx ignobilis both
5 Epinephelus fasciatus Bottom fishing
6 Epinephelus multinotatus Bottom fishing
7 Other species Bottom fishing
8 Thunnus albacares Trolling
9 Variola louti Bottom fishing
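The suggestion above to add the other columns to the summarize call could look roughly like the following. This is only a sketch under some assumptions: the small tibble is a cut-down copy of the sample data, first_non_na is a hypothetical helper (not part of any package), and across() requires dplyr 1.0.0 or later. It also checks for both methods with %in% instead of matching a pasted string, so the result does not depend on row order.

```r
library(dplyr)

# Cut-down copy of the sample data (assumption: same column types as fish_catch)
fish_small <- tibble(
  Species  = c("Aprion virescens", "Caranx ignobilis",
               "Caranx ignobilis", "Thunnus albacares"),
  Method   = factor(c("Bottom fishing", "Trolling",
                      "Bottom fishing", "Trolling")),
  Bait     = rep(NA_integer_, 4),
  Released = c(NA, NA, 1L, NA),
  Kept     = c(1L, 1L, NA, 1L)
)

# Hypothetical helper: first non-missing value, or NA if everything is missing
first_non_na <- function(x) x[!is.na(x)][1]

merged <- fish_small %>%
  group_by(Species) %>%
  summarise(
    # "Both" only when both methods occur for the species
    Method = if (all(c("Bottom fishing", "Trolling") %in% Method)) {
      "Both"
    } else {
      as.character(first(Method))
    },
    # Collapse each count column to its first known value
    across(c(Bait, Released, Kept), first_non_na),
    .groups = "drop"
  )

print(merged)
```

With this input, Caranx ignobilis ends up on a single row with Method Both, Released 1 and Kept 1, while Bait stays NA.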
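Here is one approach using the tidyverse. You can group_by(Species) and set Method to "Both" if both Bottom fishing and Trolling are among the methods for that species. Then you can group_by both Species and Method, and use fill to replace NA with known values. Finally, use slice to keep one row per Species/Method. This assumes you have one row per Species/Method; let me know if that is not the case.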
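library(tidyverse)
fish_catch %>%
  group_by(Species) %>%
  # Flag a species as "Both" when both methods occur within its group
  mutate(Method = ifelse(all(c("Bottom fishing", "Trolling") %in% Method), "Both", as.character(Method))) %>%
  group_by(Species, Method) %>%
  # Copy known values into the NA cells of each Species/Method group
  fill(c(Bait, Released, Kept), .direction = "updown") %>%
  # Keep a single row per Species/Method
  slice(1)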
Output
# A tibble: 9 x 5
# Groups: Species, Method [9]
Species Method Bait Released Kept
<chr> <chr> <int> <int> <int>
1 Aethaloperca rogaa Bottom fishing NA NA 2
2 Aprion virescens Bottom fishing NA NA 1
3 Balistidae spp. Bottom fishing NA NA 1
4 Caranx ignobilis Both NA 1 1
5 Epinephelus fasciatus Bottom fishing NA 3 NA
6 Epinephelus multinotatus Bottom fishing NA NA 5
7 Other species Bottom fishing NA 1 NA
8 Thunnus albacares Trolling NA NA 1
9 Variola louti Bottom fishing NA NA 1
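If the one-row-per-Species/Method assumption does not hold, fill followed by slice(1) would keep only the first known value and silently drop the rest. One possible alternative, sketched here on hypothetical data, is to total the counts per species instead. Whether summing is the right way to combine repeated rows is an assumption about the data, and sum_or_na is an illustrative helper, not a library function.

```r
library(dplyr)

# Hypothetical data: two Bottom fishing rows for the same species
catch <- tibble(
  Species  = c("Caranx ignobilis", "Caranx ignobilis",
               "Caranx ignobilis", "Variola louti"),
  Method   = c("Bottom fishing", "Bottom fishing",
               "Trolling", "Bottom fishing"),
  Released = c(1L, 2L, NA, NA),
  Kept     = c(NA, NA, 1L, 1L)
)

# Illustrative helper: sum the known values, but stay NA when all are missing
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)
}

totals <- catch %>%
  group_by(Species) %>%
  summarise(
    # "Both" only when both methods occur for the species
    Method = if (all(c("Bottom fishing", "Trolling") %in% Method)) {
      "Both"
    } else {
      as.character(first(Method))
    },
    # Add up repeated rows instead of keeping only the first one
    across(c(Released, Kept), sum_or_na),
    .groups = "drop"
  )

print(totals)
```

Here the two Bottom fishing rows for Caranx ignobilis contribute a combined Released count of 3, whereas a plain sum(na.rm = TRUE) would also have turned the all-NA Released value for Variola louti into 0.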