Filter for non-repeated values based on Unique ID
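I have the following table:
S/N | Unique ID | Code |
---|---|---|
1 | 111 | YES |
2 | 111 | YES |
3 | 111 | NO |
4 | 111 | YES |
5 | 222 | YES |
6 | 222 | YES |
7 | 222 | YES |
8 | 222 | YES |
9 | 333 | NO |
10 | 333 | NO |
11 | 333 | YES |
12 | 333 | YES |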
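How can I derive the table below based on the following conditions? For each Unique ID, if YES is repeated, keep only the first YES. If a NO appears, keep the YES that follows it. I tried using mutate, but it gave me all sorts of errors.
S/N | Unique ID | Code |
---|---|---|
1 | 111 | YES |
4 | 111 | YES |
5 | 222 | YES |
11 | 333 | YES |
Thanks!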
base R
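# Keep a YES only when it starts a run of YES within its Unique ID,
# i.e. it is the group's first row or the previous row is not YES.
ind <- ave(dat$Code == "YES", dat$`Unique ID`,
           FUN = function(z) z & c(TRUE, !z[-length(z)]))
dat[ind,]
#    S/N Unique ID Code
# 1    1       111  YES
# 4    4       111  YES
# 5    5       222  YES
# 11  11       333  YES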
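To see what the function passed to ave is doing, it can help to walk through a single group by hand. The following is only a sketch on the Code values of Unique ID 111; the names code, is_yes, prev_is_yes and keep are made up for this illustration and do not appear in the answer.

```r
# Code values for Unique ID 111, in row order
code <- c("YES", "YES", "NO", "YES")

# TRUE where the row is a YES
is_yes <- code == "YES"

# Whether the previous row was a YES; the first row has no previous row,
# so it is treated as "not preceded by YES"
prev_is_yes <- c(FALSE, is_yes[-length(is_yes)])

# A row is kept when it is a YES that does not directly follow another YES
keep <- is_yes & !prev_is_yes
print(keep)
# [1]  TRUE FALSE FALSE  TRUE
```

Positions 1 and 4 are kept, which matches S/N 1 and 4 for that ID in the output above.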
dplyr
library(dplyr)
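# Within each Unique ID, keep a YES whose previous row is NO
# (default = TRUE makes the group's first row count as "preceded by NO").
dat %>%
  group_by(`Unique ID`) %>%
  filter(Code == "YES" & lag(Code == "NO", default = TRUE)) %>%
  ungroup()
# # A tibble: 4 x 3
#   `S/N` `Unique ID` Code
#   <int>       <int> <chr>
# 1     1         111 YES
# 2     4         111 YES
# 3     5         222 YES
# 4    11         333 YES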
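Since the question mentions trying mutate, here is one way the same condition might be written as an explicit flag column first and filtered afterwards. This is a sketch, not part of the answer above: the column name keep and the objects flagged and result are hypothetical, and dat is rebuilt here only so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt so this snippet is self-contained
dat <- data.frame(
  `S/N` = 1:12,
  `Unique ID` = rep(c(111L, 222L, 333L), each = 4),
  Code = c("YES", "YES", "NO", "YES", "YES", "YES",
           "YES", "YES", "NO", "NO", "YES", "YES"),
  check.names = FALSE
)

# Step 1: add a logical flag per group (hypothetical column name `keep`)
flagged <- dat %>%
  group_by(`Unique ID`) %>%
  mutate(keep = Code == "YES" & lag(Code == "NO", default = TRUE)) %>%
  ungroup()

# Step 2: keep only the flagged rows and drop the helper column
result <- flagged %>%
  filter(keep) %>%
  select(-keep)

print(result)
```

Inspecting flagged before filtering shows which rows the rule marks, which may be easier to debug than a single filter call.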
data.table
library(data.table)
# Same condition applied per group through .SD; shift() plays the role of lag().
as.data.table(dat)[, .SD[Code == "YES" & shift(Code == "NO", fill = TRUE),], by = `Unique ID`]
#    Unique ID   S/N   Code
#        <int> <int> <char>
# 1:       111     1    YES
# 2:       111     4    YES
# 3:       222     5    YES
# 4:       333    11    YES
Data
dat <- structure(list("S/N" = 1:12, "Unique ID" = c(111L, 111L, 111L, 111L, 222L, 222L, 222L, 222L, 333L, 333L, 333L, 333L), Code = c("YES", "YES", "NO", "YES", "YES", "YES", "YES", "YES", "NO", "NO", "YES", "YES")), class = "data.frame", row.names = c(NA, -12L))