Recode categorical variable as new variable in R
How can I add a new categorical column to this data in R, based on the values in the first column? Like this:
> head(df)
common_name
1 Sailfin molly
2 Hardhead silverside
3 Blue crab
if common_name = "Sailfin molly", "Hardhead silverside", put "Fish"
otherwise, put "Crab"
> head(df)
common_name category
1 Sailfin molly Fish
2 Hardhead silverside Fish
3 Blue crab Crab
I found this answer here (https://rstudio-pubs-static.s3.amazonaws.com/116317_e6922e81e72e4e3f83995485ce686c14.html#/9):
library(dplyr)  # mutate() comes from dplyr
df <- mutate(df, cat = ifelse(grepl("Sailfin molly", common_name), "Fish",
                       ifelse(grepl("Hardhead silverside", common_name), "Fish", "Crab")))
Provide a sample of your data using dput() rather than just pasting the printed output, since printing hides important details:
df <- structure(list(common_name = c("Sailfin molly", "Hardhead silverside",
"Blue crab")), class = "data.frame", row.names = c(NA, -3L))
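As a rough illustration of why dput() is preferred, the sketch below rebuilds the same sample data, shows one detail that plain printing does not reveal (the column type), and checks that the dput() text can be turned back into an identical object. The name df_copy is introduced here only for illustration.

```r
# Same sample data as above
df <- structure(list(common_name = c("Sailfin molly", "Hardhead silverside",
"Blue crab")), class = "data.frame", row.names = c(NA, -3L))

# str() shows whether the column is character or factor;
# the printed data frame looks the same either way
str(df)

# dput() writes an R expression that recreates the object
dput(df)

# Round trip: capture that expression as text, evaluate it, and compare
txt <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = txt))
identical(df, df_copy)
```

Anyone answering can paste the dput() text straight into their session and work with the same object you have.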
Now we need a list of the common names:
Names <- unique(df$common_name)
Names
# [1] "Sailfin molly" "Hardhead silverside" "Blue crab"
Fish <- unique(df$common_name)[1:2]
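The first two names are fish. Your full data will have more names, but you will have to create a variable that lists the fish. Then add the new column: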
df$category <- ifelse(df$common_name %in% Fish, "Fish", "Crab")
df
common_name category
1 Sailfin molly Fish
2 Hardhead silverside Fish
3 Blue crab Crab
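With real data the names will repeat and will not arrive in a convenient order, so it may be safer to list the fish by name rather than by position. A minimal sketch, assuming a made-up data frame df_long (hypothetical, not from the question):

```r
# Hypothetical data in which names repeat, as they would in a full data set
df_long <- data.frame(
  common_name = c("Sailfin molly", "Blue crab", "Hardhead silverside",
                  "Blue crab", "Sailfin molly"),
  stringsAsFactors = FALSE
)

# List the fish explicitly by name instead of relying on positions 1:2
Fish <- c("Sailfin molly", "Hardhead silverside")

# %in% tests exact membership and returns one TRUE/FALSE per row
df_long$category <- ifelse(df_long$common_name %in% Fish, "Fish", "Crab")
df_long
```

Unlike grepl(), which matches a pattern anywhere inside the string, %in% only matches whole values, so a name that merely contains "Sailfin molly" would not be counted as a fish here.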
If you have more than two categories, it is easier to create a 2-column data frame containing each common_name and its category, and then use merge().
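df2 <- df[, 1, drop=FALSE]  # keep only the common_name column
# lookup table: one row per unique name, with its category
lookup <- data.frame(common_name=Names, category=c("Fish", "Fish", "Crab"))
merge(df2, lookup)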
# common_name category
# 1 Blue crab Crab
# 2 Hardhead silverside Fish
# 3 Sailfin molly Fish
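Note that merge() sorts the result by the key column, so the rows come back in a different order from the input. If the original row order matters, one alternative is to index the lookup table with match(). This is a sketch rather than part of the answer above; the names catch and lookup, the "Shrimp" category, and the sample rows are all made up for illustration.

```r
# Hypothetical data with repeated names and one name missing from the lookup
catch <- data.frame(
  common_name = c("Sailfin molly", "Blue crab", "Grass shrimp",
                  "Hardhead silverside", "Blue crab", "Unknown sp."),
  stringsAsFactors = FALSE
)

# Lookup table: one row per unique common_name (category values assumed)
lookup <- data.frame(
  common_name = c("Sailfin molly", "Hardhead silverside",
                  "Blue crab", "Grass shrimp"),
  category    = c("Fish", "Fish", "Crab", "Shrimp"),
  stringsAsFactors = FALSE
)

# match() gives, for each row of catch, the row of lookup holding its name
# (NA when the name is not in the lookup table)
idx <- match(catch$common_name, lookup$common_name)
catch$category <- lookup$category[idx]
catch
```

Rows keep their original order, and any name missing from the lookup table gets NA, which makes unclassified species easy to spot.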