Convert categorical column to multiple binary columns
I want to convert this column into one binary column per breed (1 if the dog is that breed, 0 if it is not).
One way is to use unique() and a for loop:
Breed = c(
"Sheetland Sheepdog Mix",
"Pit Bull Mix",
"Lhasa Aposo/Miniature",
"Cairn Terrier/Chihuahua Mix",
"American Pitbull",
"Cairn Terrier",
"Pit Bull Mix"
)
df=data.frame(Breed)
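# R is case-sensitive: the column is df$Breed, and df$breed would be NULL
for (i in unique(df$Breed)) {
  # Add one column named after the breed: 1 where the row matches, 0 otherwise
  df[, i] = ifelse(df$Breed == i, 1, 0)
}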
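A loop-free variation on the same idea is sketched below. It assumes the same example data; the names `breeds`, `indicators`, and `result` are illustrative and not part of the original answer.

```r
Breed <- c(
  "Sheetland Sheepdog Mix",
  "Pit Bull Mix",
  "Lhasa Aposo/Miniature",
  "Cairn Terrier/Chihuahua Mix",
  "American Pitbull",
  "Cairn Terrier",
  "Pit Bull Mix"
)
df <- data.frame(Breed)

# Distinct breeds, as plain strings (as.character guards against factor columns)
breeds <- unique(as.character(df$Breed))

# Build a rows-by-breeds matrix: 1 where the row's breed matches, 0 otherwise
indicators <- sapply(breeds, function(b) as.integer(df$Breed == b))
colnames(indicators) <- breeds

# Bind the indicator columns next to the original column
result <- cbind(df, indicators)
```

Compared with the loop, this builds all indicator columns in one step and leaves `df` itself unchanged.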
Use model.matrix() to convert the categorical variable into binary variables.
Breed = c(
"Sheetland Sheepdog Mix",
"Pit Bull Mix",
"Lhasa Aposo/Miniature",
"Cairn Terrier/Chihuahua Mix",
"American Pitbull",
"Cairn Terrier",
"Pit Bull Mix"
)
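# Store Breed as a factor so levels() works (since R 4.0, strings are not converted automatically)
df = data.frame(Breed = factor(Breed))
# "- 1" drops the intercept, so every level gets its own 0/1 column
dfcat = data.frame(model.matrix(~ Breed - 1, data = df))
# Restore the readable breed names (data.frame() mangles spaces and slashes)
names(dfcat) = levels(df$Breed)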
dfcat now contains your binary variables:
dfcat
#American Pitbull Cairn Terrier Cairn Terrier/Chihuahua Mix Lhasa Aposo/Miniature Pit Bull Mix Sheetland Sheepdog Mix
# 0 0 0 0 0 1
# 0 0 0 0 1 0
# 0 0 0 1 0 0
# 0 0 1 0 0 0
# 1 0 0 0 0 0
# 0 1 0 0 0 0
# 0 0 0 0 1 0
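As a quick sanity check on the result above, the sketch below rebuilds the indicator matrix and verifies that each dog is assigned to exactly one breed. It assumes the same example data; the names `mm` and `df_wide` are illustrative and not part of the original answer.

```r
Breed <- c(
  "Sheetland Sheepdog Mix",
  "Pit Bull Mix",
  "Lhasa Aposo/Miniature",
  "Cairn Terrier/Chihuahua Mix",
  "American Pitbull",
  "Cairn Terrier",
  "Pit Bull Mix"
)
df <- data.frame(Breed = factor(Breed))

# One 0/1 column per factor level, in the order of levels(df$Breed)
mm <- model.matrix(~ Breed - 1, data = df)
colnames(mm) <- levels(df$Breed)

# Each dog has exactly one breed, so every row should sum to 1
all(rowSums(mm) == 1)

# Column sums give the number of dogs per breed
colSums(mm)

# Attach the indicator columns to the original data frame
df_wide <- cbind(df, mm)
```

Keeping the result as a matrix until the final `cbind()` avoids the column-name mangling that `data.frame()` applies to names containing spaces or slashes.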