Subset in R with specific values for specific columns identified by their index number
If I have a data frame like this:
df = data.frame(A = sample(1:5, 10, replace=T), B = sample(1:5, 10, replace=T), C = sample(1:5, 10, replace=T), D = sample(1:5, 10, replace=T), E = sample(1:5, 10, replace=T))
which gives me this:
   A B C D E
1  1 5 1 4 3
2  2 3 5 4 3
3  4 2 2 4 4
4  2 1 2 5 2
5  3 3 4 4 5
6  3 2 3 1 5
7  1 5 4 2 3
8  1 3 5 5 1
9  3 1 1 3 5
10 5 3 1 2 4
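How can I get a subset containing all the rows where at least one of certain columns (B and D, say) has a value equal to 1, with those columns identified by their index numbers (2 and 4) rather than by their names? In this case: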
  A B C D E
4 2 1 2 5 2
6 3 2 3 1 5
9 3 1 1 3 5
df[rowSums(df[c(2,4)] == 1) > 0,]
#   A B C D E
# 4 2 1 2 5 2
# 6 3 2 3 1 5
# 9 3 1 1 3 5
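- You said to compare values by column index, so use df[c(2,4)] (or df[,c(2,4)]).
- df[c(2,4)] == 1 returns a logical matrix indicating whether each cell's value equals 1.
- rowSums(.) > 0 finds the rows that contain at least one 1.
- df[rowSums(.)>0,] selects just those rows.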
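The steps above can be traced one at a time. This is only a sketch: the intermediate names (cols, picked, is_one, n_hits, keep, result) are introduced here for illustration and are not part of the original answer, and the data frame is rebuilt with data.frame() so the snippet runs on its own.

```r
# Same values as the answer's data, rebuilt so the snippet is self-contained.
df <- data.frame(
  A = c(1L, 2L, 4L, 2L, 3L, 3L, 1L, 1L, 3L, 5L),
  B = c(5L, 3L, 2L, 1L, 3L, 2L, 5L, 3L, 1L, 3L),
  C = c(1L, 5L, 2L, 2L, 4L, 3L, 4L, 5L, 1L, 1L),
  D = c(4L, 4L, 4L, 5L, 4L, 1L, 2L, 5L, 3L, 2L),
  E = c(3L, 3L, 4L, 2L, 5L, 5L, 3L, 1L, 5L, 4L)
)

cols <- c(2, 4)            # column positions to test (B and D here)

picked <- df[cols]         # step 1: data frame with only columns 2 and 4
is_one <- picked == 1      # step 2: logical matrix, TRUE where a cell equals 1
n_hits <- rowSums(is_one)  # step 3: count of matching cells in each row
keep   <- n_hits > 0       #         TRUE for rows with at least one 1
result <- df[keep, ]       # step 4: keep only those rows

rownames(result)           # "4" "6" "9"
```

Changing the test to n_hits == length(cols) would instead require every selected column to equal 1; with this data no row satisfies that.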
Data
df <- structure(list(A = c(1L, 2L, 4L, 2L, 3L, 3L, 1L, 1L, 3L, 5L), B = c(5L, 3L, 2L, 1L, 3L, 2L, 5L, 3L, 1L, 3L), C = c(1L, 5L, 2L, 2L, 4L, 3L, 4L, 5L, 1L, 1L), D = c(4L, 4L, 4L, 5L, 4L, 1L, 2L, 5L, 3L, 2L), E = c(3L, 3L, 4L, 2L, 5L, 5L, 3L, 1L, 5L, 4L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
tidyverse
df <-
  structure(
    list(
      A = c(1L, 2L, 4L, 2L, 3L, 3L, 1L, 1L, 3L, 5L),
      B = c(5L, 3L, 2L, 1L, 3L, 2L, 5L, 3L, 1L, 3L),
      C = c(1L, 5L, 2L, 2L, 4L, 3L, 4L, 5L, 1L, 1L),
      D = c(4L, 4L, 4L, 5L, 4L, 1L, 2L, 5L, 3L, 2L),
      E = c(3L, 3L, 4L, 2L, 5L, 5L, 3L, 1L, 5L, 4L)
    ),
    class = "data.frame",
    row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10")
  )
library(tidyverse)
df %>%
  filter(B == 1 | D == 1)
#>   A B C D E
#> 4 2 1 2 5 2
#> 6 3 2 3 1 5
#> 9 3 1 1 3 5
Created on 2022-01-23 by the reprex package (v2.0.1)
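The filter above refers to the columns by name, whereas the question asks for selection by position. One possible index-based variant is sketched below; it assumes dplyr 1.0.4 or later, where if_any() is available, and the name res is introduced here only for illustration.

```r
library(dplyr)

# Same values as the data used in this thread.
df <- data.frame(
  A = c(1L, 2L, 4L, 2L, 3L, 3L, 1L, 1L, 3L, 5L),
  B = c(5L, 3L, 2L, 1L, 3L, 2L, 5L, 3L, 1L, 3L),
  C = c(1L, 5L, 2L, 2L, 4L, 3L, 4L, 5L, 1L, 1L),
  D = c(4L, 4L, 4L, 5L, 4L, 1L, 2L, 5L, 3L, 2L),
  E = c(3L, 3L, 4L, 2L, 5L, 5L, 3L, 1L, 5L, 4L)
)

# if_any() evaluates the predicate on each selected column and keeps a row
# when it is TRUE for at least one of them; c(2, 4) selects by position.
res <- df %>%
  filter(if_any(c(2, 4), ~ .x == 1))

nrow(res)  # 3
```

Swapping if_any() for if_all() would keep only rows where every selected column equals 1.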
data.table
library(data.table)
setDT(df)[B == 1 | D == 1, ]
#>    A B C D E
#> 1: 2 1 2 5 2
#> 2: 3 2 3 1 5
#> 3: 3 1 1 3 5
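As with the tidyverse answer, this data.table call names the columns directly. A possible position-based version is sketched below; it builds a fresh data.table called dt rather than converting df by reference, and the names dt, cols, keep and res are illustrative assumptions, not part of the original answer.

```r
library(data.table)

# Same values as the data used in this thread.
dt <- data.table(
  A = c(1L, 2L, 4L, 2L, 3L, 3L, 1L, 1L, 3L, 5L),
  B = c(5L, 3L, 2L, 1L, 3L, 2L, 5L, 3L, 1L, 3L),
  C = c(1L, 5L, 2L, 2L, 4L, 3L, 4L, 5L, 1L, 1L),
  D = c(4L, 4L, 4L, 5L, 4L, 1L, 2L, 5L, 3L, 2L),
  E = c(3L, 3L, 4L, 2L, 5L, 5L, 3L, 1L, 5L, 4L)
)

cols <- c(2, 4)  # column positions to test

# .SD holds only the columns listed in .SDcols (given here by position).
# lapply() compares each of them to 1, and Reduce(`|`, ...) combines the
# resulting logical vectors element-wise with OR, one value per row.
keep <- dt[, Reduce(`|`, lapply(.SD, `==`, 1)), .SDcols = cols]

res <- dt[keep]  # rows where at least one selected column equals 1

nrow(res)        # 3
```

Using `&` in place of `|` inside Reduce() would require all selected columns to equal 1.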