R: rbind string to factor column
I am trying to rbind a row containing a string onto a data frame, but it does not work.
Code:
df1 = data.frame(id=1:5, name=c("peter","kate","lisa","daniel","paul"))
df2 = data.frame(id=5:1, age=c(11,24,25,67,2))
df3 = merge(df1,df2)
df3 = rbind(df3, c(6, "hannah", 30))
df3
str(df3)
Result:
> df1 = data.frame(id=1:5, name=c("peter","kate","lisa","daniel","paul"))
> df2 = data.frame(id=5:1, age=c(11,24,25,67,2))
> df3 = merge(df1,df2)
> df3 = rbind(df3, c(6, "hannah", 30))
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = "hannah") :
ungültiges Faktorniveau, NA erzeugt
> df3
id name age
1 1 peter 2
2 2 kate 67
3 3 lisa 25
4 4 daniel 24
5 5 paul 11
6 6 <NA> 30
>
> str(df3)
'data.frame': 6 obs. of 3 variables:
$ id : chr "1" "2" "3" "4" ...
$ name: Factor w/ 5 levels "daniel","kate",..: 5 2 3 1 4 NA
$ age : chr "2" "67" "25" "24" ...
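It looks to me as if R made the name column a factor column, which is why it does not accept the new string value. How can I solve this? Which is preferable: converting the whole column to a character column (if such a thing exists), or converting the new string to a factor? And how do I do that?
Thanks!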
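A good solution is to use the dplyr package and create tibbles instead of data frames (a tibble is a modern kind of data frame that keeps strings as character columns by default instead of converting them to factors).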
library(dplyr)
df1 <- tibble(id=1:5, name=c("peter","kate","lisa","daniel","paul"))
df2 <- tibble(id=5:1, age=c(11,24,25,67,2))
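df3 <- left_join(df1, df2, by = "id") # or merge(df1, df2) if you prefer
# bind a one-row tibble instead of c(6, "hannah", 30), which would coerce every value to character
df3 <- bind_rows(df3, tibble(id = 6L, name = "hannah", age = 30))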
df3
str(df3)
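If you would rather stay in base R, both options from the question can be sketched as below. This is a minimal sketch, not the only way to do it. Two things are worth noting: `c(6, "hannah", 30)` is a character vector, which is why id and age turned into chr columns, and the warning in the question (in an English locale: "invalid factor level, NA generated") appears because binding a plain vector does not add new factor levels, whereas binding a one-row data frame does. The names `by_character` and `by_factor` are made up for this illustration, and `stringsAsFactors = TRUE` is set explicitly only to reproduce the factor column on R 4.0.0 and later, where the default is FALSE.

```r
# Rebuild the question's data; force the factor column so the behaviour
# is the same regardless of the R version's stringsAsFactors default.
df1 <- data.frame(id = 1:5,
                  name = c("peter", "kate", "lisa", "daniel", "paul"),
                  stringsAsFactors = TRUE)
df2 <- data.frame(id = 5:1, age = c(11, 24, 25, 67, 2))
df3 <- merge(df1, df2)

# Why id and age became chr: c() coerces mixed values to one common type.
class(c(6, "hannah", 30))

# Option 1: turn the factor column into a character column, then bind a
# one-row data frame so that id and age keep their numeric types.
by_character <- df3
by_character$name <- as.character(by_character$name)
by_character <- rbind(by_character,
                      data.frame(id = 6L, name = "hannah", age = 30,
                                 stringsAsFactors = FALSE))

# Option 2: keep the factor. rbind() on two data frames expands the
# factor levels as needed, so "hannah" becomes a sixth level.
by_factor <- rbind(df3,
                   data.frame(id = 6L, name = "hannah", age = 30,
                              stringsAsFactors = TRUE))

str(by_character)
str(by_factor)
```

Which one is preferable depends on the column: for free-form values such as names, a character column is usually the simpler choice, while a factor makes sense when the set of allowed values is fixed and known.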