Remove rows in df using multiple conditions in R
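Is it possible to remove rows of data by referencing specific character strings or factor levels in 2 or more columns? For small datasets this is easy, because I can scroll through the data frame and remove the rows I want. For larger datasets, how can I do this without constantly scrolling to see which rows meet my criteria?
Dummy data: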
df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                  month = rep(c("March", "October"), each = 1),
                  site = rep(c("1", "2", "3", "4", "5"), each = 2),
                  common_name = rep(c("Tuna", "shark"), each = 1),
                  num = sample(x = 0:2, size = 20, replace = TRUE))
For example: how can I remove only site "1" for March 2019 in one line of code, without looking up which row it is in?
You can use subset():
df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                  month = rep(c("March", "October"), each = 1),
                  site = rep(c("1", "2", "3", "4", "5"), each = 2),
                  common_name = rep(c("Tuna", "shark"), each = 1),
                  num = sample(x = 0:2, size = 20, replace = TRUE))
subset(df1, !(site == "1" & year == 2019 & month == "March"))
#> year month site common_name num
#> 2 2019 October 1 shark 0
#> 3 2019 March 2 Tuna 1
#> 4 2019 October 2 shark 0
#> 5 2019 March 3 Tuna 0
#> 6 2019 October 3 shark 0
#> 7 2019 March 4 Tuna 2
#> 8 2019 October 4 shark 2
#> 9 2019 March 5 Tuna 0
#> 10 2019 October 5 shark 2
#> 11 2020 March 1 Tuna 1
#> 12 2020 October 1 shark 1
#> 13 2020 March 2 Tuna 2
#> 14 2020 October 2 shark 2
#> 15 2020 March 3 Tuna 1
#> 16 2020 October 3 shark 0
#> 17 2020 March 4 Tuna 1
#> 18 2020 October 4 shark 0
#> 19 2020 March 5 Tuna 0
#> 20 2020 October 5 shark 2
Created on 2022-05-31 by the reprex package (v2.0.1)
We can also use paste() to combine the columns into a single key and compare it against the combination to drop:
subset(df1, paste(year, month, site) != '2019 March 1')
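Output (the num values differ from the previous answer because sample() was called without a fixed seed):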
year month site common_name num
2 2019 October 1 shark 1
3 2019 March 2 Tuna 1
4 2019 October 2 shark 2
5 2019 March 3 Tuna 0
6 2019 October 3 shark 0
7 2019 March 4 Tuna 2
8 2019 October 4 shark 1
9 2019 March 5 Tuna 1
10 2019 October 5 shark 1
11 2020 March 1 Tuna 1
12 2020 October 1 shark 1
13 2020 March 2 Tuna 1
14 2020 October 2 shark 2
15 2020 March 3 Tuna 1
16 2020 October 3 shark 0
17 2020 March 4 Tuna 1
18 2020 October 4 shark 1
19 2020 March 5 Tuna 1
20 2020 October 5 shark 2
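Because the example data is generated with sample() and no seed, the two printed outputs above cannot be compared directly. The following is a minimal sketch that fixes a seed and checks that the logical-condition approach and the paste() approach drop the same row. The seed value and the names by_condition and by_paste are illustrative choices, not part of the original answers.

```r
# Rebuild the example data with a fixed seed so the run is reproducible.
# The seed value (1) is arbitrary and not from the original post.
set.seed(1)
df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                  month = rep(c("March", "October"), each = 1),
                  site = rep(c("1", "2", "3", "4", "5"), each = 2),
                  common_name = rep(c("Tuna", "shark"), each = 1),
                  num = sample(x = 0:2, size = 20, replace = TRUE))

# Approach 1: negate a combined logical condition.
by_condition <- subset(df1, !(site == "1" & year == 2019 & month == "March"))

# Approach 2: build one key per row with paste() and compare it.
by_paste <- subset(df1, paste(year, month, site) != "2019 March 1")

# Both approaches should return the same data frame.
identical(by_condition, by_paste)

# Number of rows removed from the original data.
nrow(df1) - nrow(by_condition)
```

The paste() version relies on the default separator (a single space) and on the exact text form of each value, so the logical-condition version is usually the safer choice when columns are numeric or may contain spaces.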
A one-liner using R bracket notation, as an alternative to subset() or dplyr::filter():
df2 <- df1[!(df1$site=="1" & df1$year==2019 & df1$month=="March"),]
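The dplyr alternative is mentioned in this answer but not shown. A minimal sketch of what it might look like, assuming the dplyr package is installed; the name df3 is an illustrative choice, not from the original post.

```r
library(dplyr)  # assumption: dplyr is installed

df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                  month = rep(c("March", "October"), each = 1),
                  site = rep(c("1", "2", "3", "4", "5"), each = 2),
                  common_name = rep(c("Tuna", "shark"), each = 1),
                  num = sample(x = 0:2, size = 20, replace = TRUE))

# Keep every row except the one matching all three conditions.
df3 <- filter(df1, !(site == "1" & year == 2019 & month == "March"))
```

One difference worth knowing: if any of the compared columns contains NA, subset() and filter() drop the rows where the condition evaluates to NA, whereas bracket indexing returns those positions as rows filled with NA.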