How to check if a factor variable has only the levels I want?

What is the simplest way to verify that a factor variable has only the levels I want?

# I want to make sure a factor variable has 'FP', 'TP' but nothing else

a <- factor(c('TP','TP','FP')) # should return TRUE
b <- factor(c('TP')) # should return TRUE
c <- factor(c('TP', '1234')) # should return FALSE

We can use all() together with %in%:

all(levels(a) %in% c("FP", "TP"))
#[1] TRUE
all(levels(b) %in% c("FP", "TP"))
#[1] TRUE
all(levels(c) %in% c("FP", "TP"))
#[1] FALSE
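
To see why this works, it may help to look at the intermediate result. The sketch below is only an illustration: the names allowed, x, is_allowed and unexpected are made up here, and x holds the same data as c in the question.

```r
allowed <- c("FP", "TP")          # hypothetical variable name
x <- factor(c("TP", "1234"))      # same data as c in the question

# %in% tests each level separately, giving one TRUE/FALSE per level
is_allowed <- levels(x) %in% allowed

# all() then collapses those results into a single answer
all(is_allowed)

# The same logical vector also picks out the offending levels
unexpected <- levels(x)[!is_allowed]
unexpected
```

Here all(is_allowed) is FALSE because "1234" is not in allowed, and unexpected holds that one level.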

To avoid repeating code, or in case we need to check more levels later, store the allowed levels in a variable:

checkFactor <- c("FP", "TP")
all(levels(a) %in% checkFactor)
#[1] TRUE
all(levels(b) %in% checkFactor)
#[1] TRUE
all(levels(c) %in% checkFactor)
#[1] FALSE
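
If there are several factors to check, the same expression could be wrapped in a small function and applied to all of them at once. This is a minimal sketch; has_only_levels is a hypothetical helper name, not something from base R.

```r
# Allowed levels, as in the example above
checkFactor <- c("FP", "TP")

a <- factor(c("TP", "TP", "FP"))
b <- factor(c("TP"))
c <- factor(c("TP", "1234"))

# has_only_levels is a hypothetical helper, not a base R function
has_only_levels <- function(f, allowed) {
  all(levels(f) %in% allowed)
}

# Apply the check to a named list of factors in one call;
# vapply guarantees one logical value per factor
results <- vapply(list(a = a, b = b, c = c), has_only_levels,
                  logical(1), allowed = checkFactor)
print(results)
```

The result is a named logical vector, so results[["c"]] gives the answer for c alone.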

Here is another way, using setdiff():

check.factor.levels <- function(expected, actual) {
  length(setdiff(actual, expected)) == 0  
}

expected <- c('FP', 'TP')
check.factor.levels(expected, levels(a))
#[1] TRUE
check.factor.levels(expected, levels(b))
#[1] TRUE
check.factor.levels(expected, levels(c))
#[1] FALSE
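
One thing to keep in mind with either approach: levels() returns the declared levels, including any that never occur in the data. Whether an unused level should count as a failure depends on what you want to verify. The sketch below assumes a hypothetical factor d with such an unused level and uses droplevels() to check only the levels that are actually observed.

```r
expected <- c("FP", "TP")

# d is a hypothetical factor whose declared levels include "1234",
# although that value never occurs in the data
d <- factor(c("TP", "FP"), levels = c("FP", "TP", "1234"))

# Checking the declared levels flags the unused level
declared_extra <- setdiff(levels(d), expected)
length(declared_extra) == 0

# Dropping unused levels first checks only the observed values
observed_extra <- setdiff(levels(droplevels(d)), expected)
length(observed_extra) == 0
```

The first check fails because "1234" is still a declared level; the second passes because no observation actually uses it.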