Aggregating factors from data frames of different lengths
I have several data frames, for example:
Var1 "Bananas" "Apples" "Oranges"
Freq "2" "2" "1"
Var2 "Bananas" "Carrots" "Strawberries" "Apples"
Freq "3" "2" "3" "4"
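As output, I would like a data frame, table, or something similar that gives the number of occurrences for each input data frame, including the zero counts, in one clear overview. Something like: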
Var "Bananas" "Apples" "Oranges" "Carrots" "Strawberries"
Sample1 "2" "2" "1" "0" "0"
Sample2 "3" "4" "0" "2" "3"
I cannot think of a solution, especially since data.frames do not allow columns of different lengths.
You should take a look at ?merge:
set.seed(1234)
dat1 <- data.frame(var1 = LETTERS[1:5], freq = sample(1:100, 5))
dat2 <- data.frame(var2 = LETTERS[3:7], freq = sample(1:100, 5))
res <- merge(dat1, dat2, by.x = "var1", by.y = "var2", all = TRUE)
res[is.na(res)] <- 0
res
# var1 freq.x freq.y
# 1 A 12 0
# 2 B 62 0
# 3 C 60 65
# 4 D 61 1
# 5 E 83 23
# 6 F 0 100
# 7 G 0 50
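The same merge idea can be extended to any number of data frames and reshaped into the layout the question asks for (one row per sample, one column per item). The following is a minimal base R sketch, not a definitive solution. The names `sample1`, `sample2`, `samples`, and the shared column name `Var` are assumptions made for illustration; the question's own frames use `Var1` and `Var2`.

```r
# Hypothetical inputs mirroring the example in the question
sample1 <- data.frame(Var = c("Bananas", "Apples", "Oranges"),
                      Freq = c(2, 2, 1))
sample2 <- data.frame(Var = c("Bananas", "Carrots", "Strawberries", "Apples"),
                      Freq = c(3, 2, 3, 4))
samples <- list(Sample1 = sample1, Sample2 = sample2)

# Rename each Freq column after its sample so the merged columns stay distinct
renamed <- Map(function(d, nm) setNames(d, c("Var", nm)),
               samples, names(samples))

# Full outer join across all data frames, two at a time
wide <- Reduce(function(x, y) merge(x, y, by = "Var", all = TRUE), renamed)

# Items missing from a sample come back as NA; treat them as a count of 0
wide[is.na(wide)] <- 0

# Transpose so that each sample is a row and each item is a column
counts <- t(as.matrix(wide[, -1, drop = FALSE]))
colnames(counts) <- as.character(wide$Var)
counts
```

Because `Reduce` folds `merge` over the list, adding a third or fourth sample only means adding another named element to `samples`.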
Note that NA and 0 have very different meanings.
Take a look at the help file ?dplyr::join:
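library(dplyr)
# Both data frames share the key column Var1
df1 <- data.frame(Var1 = c("Bananas", "Apples", "Oranges"),
                  Freq = c(2, 2, 1))
df2 <- data.frame(Var1 = c("Bananas", "Carrots", "Strawberries", "Apples"),
                  Freq = c(3, 2, 3, 4))
# Keep every Var1 value from either input; missing counts come back as NA
full_join(df1, df2, by = "Var1")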
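To get the zero counts the question asks for, the NA values left by `full_join` can be replaced afterwards. The following is a sketch under two assumptions: dplyr 1.0.0 or later (for `across`), and that a missing item really does mean a count of 0 in your data. The suffixes `.Sample1` and `.Sample2` are illustrative names, not part of the original answer.

```r
library(dplyr)

df1 <- data.frame(Var1 = c("Bananas", "Apples", "Oranges"),
                  Freq = c(2, 2, 1))
df2 <- data.frame(Var1 = c("Bananas", "Carrots", "Strawberries", "Apples"),
                  Freq = c(3, 2, 3, 4))

# Label the two Freq columns by sample instead of the default .x / .y
joined <- full_join(df1, df2, by = "Var1",
                    suffix = c(".Sample1", ".Sample2"))

# Replace NA with 0 in the count columns only, leaving Var1 untouched
joined <- mutate(joined, across(starts_with("Freq"), ~ coalesce(.x, 0)))
joined
```

Restricting the replacement to the `Freq` columns avoids silently turning a genuinely missing key into a 0.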