Subsetting columns where all values are the same character value in R
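I am trying to identify the columns of a data frame in which every value is the single character value "tree".
Here is an example dataset.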
df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))
> df
id var.1 var.2 var.3
1 1 5 tree 4
2 2 6 tree 5
3 3 7 tree 8
4 4 tree tree 9
5 5 4 tree 1
I would flag the var.2 variable, because all of its values are "tree".
flagged
[1] "var.2"
Any ideas?
Thanks!
For each column, check whether all elements are equal to the first element.
df <- data.frame(id = c(1,2,3,4,5),
                 var.1 = c(5,6,7,"tree",4),
                 var.2 = c("tree","tree","tree","tree","tree"),
                 var.3 = c(4,5,8,9,1))
names(df)[sapply(df, function(x) all(x == x[1]))]
#> [1] "var.2"
Created on 2022-02-17 by the reprex package (v2.0.1)
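Note that comparing against the first element flags any constant column, whatever its value. If only columns made up entirely of "tree" should be flagged, a minimal sketch along these lines might work. The helper name flag_constant_cols is hypothetical (it is not from the answer above), and treating a column that contains NA as "not all tree" is an assumption.

```r
# Example data from the question; c() coerces var.1 to character
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var.1 = c(5, 6, 7, "tree", 4),
                 var.2 = c("tree", "tree", "tree", "tree", "tree"),
                 var.3 = c(4, 5, 8, 9, 1))

# Hypothetical helper: names of the columns whose every value equals `value`.
# Assumption: a column with any NA, or with no rows, is never flagged.
flag_constant_cols <- function(data, value) {
  is_match <- vapply(
    data,
    function(x) length(x) > 0 && !anyNA(x) && all(x == value),
    logical(1)
  )
  names(data)[is_match]
}

flagged <- flag_constant_cols(df, "tree")
flagged
#> [1] "var.2"
```

vapply() is used instead of sapply() so that the result is always a logical vector, even for a data frame with no columns.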
Using dplyr, you can do:
library(dplyr)
flagged <- df %>%
  select(where(~n_distinct(.x) == 1 && unique(.x) == "tree")) %>%
  names()
This selects every column that has exactly one distinct value and whose value is "tree", then extracts the column names.
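One possible caveat: if a column consists entirely of NA, the predicate above may evaluate to NA rather than TRUE or FALSE, which where() does not accept. A hedged variant, assuming columns with missing values should simply not be flagged, wraps the test in isTRUE(). The columns var.4 and var.5 below are hypothetical and were added only to illustrate the NA cases.

```r
library(dplyr)

# Small illustrative data set; var.4 and var.5 are hypothetical additions
df_na <- data.frame(id = c(1, 2, 3),
                    var.2 = c("tree", "tree", "tree"),
                    var.4 = c("tree", NA, "tree"),
                    var.5 = c(NA_character_, NA_character_, NA_character_))

# all() returns NA when a comparison is NA and no value is FALSE;
# isTRUE() turns that NA into FALSE so where() always gets TRUE or FALSE
flagged_na_safe <- df_na %>%
  select(where(~ isTRUE(all(.x == "tree")))) %>%
  names()
flagged_na_safe
#> [1] "var.2"
```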