Checking if all factors within a factor are unique, then if so, returning that factor. If not, returning a third value. R

Question

First time posting here! I've been struggling with this for about two days. I have a data frame that looks like this:

code.1 <- factor(c(rep("x",3), rep("y",2), rep("z",3)))
type.1 <- factor(c(rep("small", 2), rep("medium", 2), rep("large", 4)))
df <- cbind.data.frame(type.1, code.1)
df

I'm trying to get it to return this:

code.2 <- factor(c("x", "y", "z"))
type.2 <- factor(c("multiple", "multiple", "large"))
df2 <- cbind.data.frame(type.2, code.2)
df2

I've tried all kinds of if/else approaches, and functions applied by group on "code", to return these results, but I'm stuck. Any help is appreciated!

Answer 1

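You can do this with dplyr: group by code.1, then summarize type.1 with an if/else. If the group has only one distinct value, return it; otherwise return "multiple".

For practical reasons, the code is slightly more complicated: type.1 has to be converted to character, and the TRUE branch of the vectorized if_else has to evaluate to a single value even when the condition is FALSE, hence first(type.1):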


library(dplyr)

df %>%
  group_by(code.1) %>%
  summarize(type.2 = if_else(n_distinct(type.1) == 1,
                             as.character(first(type.1)),
                             "multiple"),
            type.2 = as.factor(type.2))
# A tibble: 3 x 2
#   code.1 type.2  
#   <fct>  <fct>   
# 1 x      multiple
# 2 y      multiple
# 3 z      large
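If you would rather not depend on dplyr, the same idea (one distinct value per group, otherwise "multiple") can be sketched in base R. This is only an illustration under the question's data; `collapse_type` is a helper name made up for this example, not something from dplyr or base R.

```r
# Rebuild the example data from the question
code.1 <- factor(c(rep("x", 3), rep("y", 2), rep("z", 3)))
type.1 <- factor(c(rep("small", 2), rep("medium", 2), rep("large", 4)))
df <- data.frame(type.1, code.1)

# Hypothetical helper: for one group's values, return the single
# distinct value, or "multiple" if there is more than one
collapse_type <- function(x) {
  vals <- unique(as.character(x))
  if (length(vals) == 1) vals else "multiple"
}

# Split type.1 by code.1 and collapse each group to one string
res <- vapply(split(df$type.1, df$code.1), collapse_type, character(1))

# Assemble the result with the column names used in the question
df2 <- data.frame(type.2 = factor(unname(res)), code.2 = factor(names(res)))
df2
# x and y each contain two different types, so they become "multiple";
# z only contains "large", so it keeps that value
```

`vapply` with `character(1)` is used so that a helper returning anything other than a single string fails loudly instead of silently producing a list.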

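Edit: here is a different formulation of the same approach. It does not need the conversion to character, may be better suited to large problems, and may give a different perspective on the same problem: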

# default value when multiple
iffalse <- as.factor("multiple")

df %>%
  group_by(code.1) %>%
  mutate(type.1 = factor(type.1, levels = c(levels(type.1), levels(iffalse)))) %>% # add possible level to type.1
  summarize(type.2 = if_else(n_distinct(type.1) == 1,
                             first(type.1),
                             iffalse))
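As background on why the snippet above adds "multiple" to the levels of type.1 first: in base R, a factor can only hold values that are among its levels. The small sketch below shows that behaviour in isolation; the variable names `f`, `bad` and `ok` are just for illustration. How strict if_else() itself is about mismatched levels may depend on your dplyr version, so treat this as an explanation of the idea rather than of dplyr internals.

```r
f <- factor(c("small", "medium", "large"))

# Assigning a value that is not a level yields NA (base R also warns)
bad <- f
suppressWarnings(bad[1] <- "multiple")
is.na(bad[1])

# Extending the levels first makes the same assignment valid
ok <- factor(f, levels = c(levels(f), "multiple"))
ok[1] <- "multiple"
as.character(ok)
levels(ok)
```

The existing values are untouched by the extra level; it only widens the set of values the factor is allowed to take.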

if-statement

r

factors

dplyr