How to remove rows from a data.frame when a factor takes particular values in R
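I am working on a cars dataset in R. It has a column named fuel, which is of class factor. The cars are distributed across 5 fuel types, and I want to remove 3 of those types from the column. For example: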
fuel:
CNG : 40
Diesel :2133
Electric: 1
LPG : 23
Petrol :2120
How can I remove the factor levels CNG, Electric, and LPG with a single command?
I tried the following, which works, but I think there must be a better way, such as a one-line command.
1.
car <- car[!car$fuel == "CNG", ]
car <- car[!car$fuel == "Electric", ]
car <- car[!car$fuel == "LPG", ]
I also tried the following, but it did not work. Why doesn't this command work?
2.
car <- car[!car$fuel == "CNG"||"Electric"||"LPG", ]
A common solution is:
car[!(car$fuel %in% c("CNG", "Electric", "LPG")), ]
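To make your second attempt work, you first need to use | rather than ||, because you are working with vectors. Second, you need to write out each logical test in full so that R can evaluate it: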
car[!(car$fuel == "CNG" | car$fuel == "Electric" | car$fuel == "LPG"), ]
Simplified using De Morgan's laws:
car[car$fuel != "CNG" & car$fuel != "Electric" & car$fuel != "LPG", ]
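As a quick check of the three forms above, here is a minimal sketch on a small made-up data frame. The contents of `car` and the helper names `drop`, `keep_in`, `keep_or`, and `keep_and` are hypothetical and chosen only for illustration; they are not the asker's data.

```r
# Hypothetical toy data standing in for the asker's `car` data frame
car <- data.frame(
  fuel = factor(c("CNG", "Diesel", "Electric", "LPG", "Petrol", "Diesel")),
  id   = 1:6
)
drop <- c("CNG", "Electric", "LPG")  # fuel types to remove

# The three equivalent row filters from the answer above
keep_in  <- !(car$fuel %in% drop)
keep_or  <- !(car$fuel == "CNG" | car$fuel == "Electric" | car$fuel == "LPG")
keep_and <- car$fuel != "CNG" & car$fuel != "Electric" & car$fuel != "LPG"

# All three select the same rows on this data
stopifnot(all(keep_in == keep_or), all(keep_or == keep_and))

car[keep_in, ]
```

One difference worth keeping in mind: if fuel contained missing values, %in% would return FALSE for them (so those rows would be kept), while == and != would return NA, which produces rows of NA when used as a row index. The toy data above has no missing values, so the three forms agree.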
To add to the solution above, you can also use subset like this:
# simulate data
set.seed(2)
n <- 12
car <- data.frame(
  fuel = factor(
    sample.int(5, size = n, replace = TRUE),
    labels = c("CNG", "Electric", "LPG", "Gas", "Unknown")),
  id = 1:n)
# show alternative solution
subset(car, fuel != "CNG" & fuel != "Electric" & fuel != "LPG")
#R> fuel id
#R> 1 Unknown 1
#R> 3 Unknown 3
#R> 5 Gas 5
#R> 6 Unknown 6
subset(car, !fuel %in% c("CNG", "Electric", "LPG"))
#R> fuel id
#R> 1 Unknown 1
#R> 3 Unknown 3
#R> 5 Gas 5
#R> 6 Unknown 6
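Since the question asks about removing factor levels, note that filtering rows does not by itself remove the now-unused levels from the factor. A minimal sketch, assuming you also want them gone, using base R's droplevels(). The toy `car` data and the names `kept` and `levels_before` are hypothetical.

```r
# Hypothetical toy data standing in for the asker's `car` data frame
car <- data.frame(
  fuel = factor(c("CNG", "Diesel", "Electric", "LPG", "Petrol", "Diesel")),
  id   = 1:6
)

# Remove the rows for the three unwanted fuel types
kept <- subset(car, !fuel %in% c("CNG", "Electric", "LPG"))

# The factor still carries all five levels, three of them with zero rows
levels_before <- levels(kept$fuel)

# droplevels() discards levels that no longer occur in the data
kept <- droplevels(kept)
levels(kept$fuel)
```

Without this step, summary(kept$fuel) or table(kept$fuel) would still list CNG, Electric, and LPG with a count of 0.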
Your second version fails because you used || instead of |. See help("Logic", package = "base"), in particular:
"& and && indicate logical AND and | and || indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector."
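To make the difference concrete, here is a small sketch on a made-up character vector (the names `fuel`, `elementwise`, `bad`, and `scalar` are hypothetical). It also shows the second problem with the original attempt: a bare string such as "Electric" is not a logical test, so each comparison has to be written out in full.

```r
# Hypothetical toy vector of fuel types
fuel <- c("CNG", "Diesel", "Petrol")

# `|` works elementwise and returns one logical value per element
elementwise <- fuel == "CNG" | fuel == "Petrol"
elementwise
length(elementwise)

# A bare string is not a logical value, so this raises an error;
# tryCatch() turns the error into a marker string for illustration
bad <- tryCatch(fuel == "CNG" | "Electric", error = function(e) "error")
bad

# `||` is meant for single (length-one) conditions, e.g. inside if()
scalar <- fuel[1] == "CNG" || fuel[1] == "Petrol"
scalar
```

The help text quoted above describes || as looking only at the first element of each vector. As far as I know, newer R releases (4.3.0 and later) instead raise an error when || or && is given a vector longer than one, so the exact way the original command misbehaves depends on your R version.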