Make only some features dummyVars

library(dplyr)    # mutate() and %>%
library(ggplot2)  # diamonds dataset

my_diamonds <- diamonds %>% mutate(cut = as.character(cut),
                                   color = as.character(color),
                                   clarity = as.character(clarity))

I would like to create a new data frame with only cut and color as dummyVars.

However, I cannot get the first block in the code below to work:

# make cut and color dummy vars
dummy <- caret::dummyVars("cut + color",
                            data = my_diamonds, fullRank = F, sep = ".")

# now create the dummy vars as new dataframe training data
training_data <- predict(dummy, my_diamonds) %>% as.data.frame()

This piece:

# make cut and color dummy vars
dummy <- caret::dummyVars("cut + color",
                            data = my_diamonds, fullRank = F, sep = ".")

Gives: Error in eval(parse(text = x, keep.source = FALSE)[[1L]]) : object 'color' not found

Also tried:

dummy <- caret::dummyVars(~ "cut + color",
                            data = my_diamonds, fullRank = F, sep = ".")

Gives: Error in terms.formula(formula, data = data) : invalid model formula in ExtractVars

How can I create a new data frame based on my_diamonds where cut and color are dummy variables?

A minor fix: ~ "cut + color" should be either "~ cut + color" (with the tilde inside the string) or just the unquoted formula ~ cut + color:

dummy <- caret::dummyVars(~ cut + color,
                          data = my_diamonds, fullRank = FALSE, sep = ".")
training_data <- predict(dummy, my_diamonds) %>% as.data.frame()
head(training_data)
#   cutFair cutGood cutIdeal cutPremium cutVery Good colorD colorE colorF colorG colorH colorI colorJ
# 1       0       0        1          0            0      0      1      0      0      0      0      0
# 2       0       0        0          1            0      0      1      0      0      0      0      0
# 3       0       1        0          0            0      0      1      0      0      0      0      0
# 4       0       0        0          1            0      0      0      0      0      0      1      0
# 5       0       1        0          0            0      0      0      0      0      0      0      1
# 6       0       0        0          0            1      0      0      0      0      0      0      1
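
To illustrate why the two failed attempts behave differently, here is a minimal base R sketch (no caret required). The names dummy_cols, f_built, f_string and f_quoted are hypothetical and introduced only for illustration; reformulate(), as.formula() and all.vars() are standard base/stats functions.

```r
# Hypothetical vector naming the columns to encode (not from the question)
dummy_cols <- c("cut", "color")

# reformulate() builds a one-sided formula from character names
f_built <- reformulate(dummy_cols)

# A string converts to the same kind of formula only if it contains the tilde
f_string <- as.formula("~ cut + color")

# Quoting the right-hand side makes it a string constant, not variables,
# so the formula contains no variable names at all
f_quoted <- ~ "cut + color"

print(all.vars(f_built))   # variables found in the built formula
print(all.vars(f_string))  # variables found in the parsed string
print(all.vars(f_quoted))  # variables found in the quoted formula
```

Assuming my_diamonds from the question, f_built could then be passed along as caret::dummyVars(f_built, data = my_diamonds, fullRank = FALSE), which may be convenient when the column names are held in a variable rather than typed out.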