Create a stacked bar chart using a frequency table

I am working with two frequency tables, named identified_modification_table and unidentified_modifications_table.

The tables are structured like this:

identified_modification_table

Modifications   | Frequency
MOD:42123       | 12
MOD:1234        | 7
MOD:7618        | 36
MOD:411232      | 51

unidentified_modifications_table

Modifications   | Frequency
MOD:42123       | 12  
MOD:12          | 20
MOD:7618        | 36
MOD:411232      | 51

I want to merge these tables into the output below, so that I can create a stacked bar chart like the one in this example.

Modifications   | Frequency.1 | Frequency.2 
MOD:42123       | 12          | 12
MOD:1234        | 7           | NA
MOD:12          | NA          | 20
MOD:7618        | 36          | 36
MOD:411232      | 51          | 51

I tried to merge the tables with this code, filling in NA wherever a value is missing.

df_final <- cbind.data.frame(df1, df2[match(df1$modifications, df2$modifications), ]);

But this does not work properly, and I don't know why.

After that, I think I should just use melt and ggplot2 to draw the stacked bars:

df_barplot <- melt(df,measure.vars = names(df))

ggplot((df_barplot), aes(x = value, fill = variable)) + 
    geom_bar(stat = "count", position = "dodge") + 
    theme(axis.text.x = element_text(angle = 20, hjust = 0.5, vjust = -0.1)) + 
    guides(fill=FALSE)+
    labs("Barplot") + 
    xlab("Values")+
    ylab("Frequency")+
    theme(text = element_text(size=18), axis.text.x = element_text(angle = 90, hjust = 1, size = 15), axis.text.y=element_text(size = 15))

Does anyone know how I can do this?

Here is a reproducible example:

df1 <- data.frame(modifications=c("MOD:214", "MOD:3","MOD:24","MOD:44","MOD:123", "MOD:123", "MOD:212"), Frequency=c(1,41,616,727,828,8993,383))


df2 <- data.frame(modifications=c("MOD:214", "MOD:3","MOD:24","MOD:445","MOD:12", "MOD:123", "MOD:212"), Frequency=c(1,43,64,77,88,893,38))

Thanks

Here is a tidyverse approach:

library(tidyverse)
merged_df <- full_join(df1, df2, by = "modifications")
merged_df <- gather(merged_df, key = Category, value = Frequency, -modifications)

The plot:

ggplot(merged_df, aes(x = modifications, y = Frequency, fill = Category)) + 
  geom_col(position = "dodge")
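Since the question asks for a stacked chart, here is a minimal sketch of the same idea with `position = "stack"`, using the two small tables from the question. It is only one possible variant: the names `identified`, `unidentified`, `merged`, `long` and `p` are made up for illustration, `pivot_longer()` stands in for `gather()`, and replacing NA with 0 is an assumption about how missing modifications should be drawn.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# The two frequency tables from the question (names are illustrative)
identified <- data.frame(
  modifications = c("MOD:42123", "MOD:1234", "MOD:7618", "MOD:411232"),
  Frequency = c(12, 7, 36, 51)
)
unidentified <- data.frame(
  modifications = c("MOD:42123", "MOD:12", "MOD:7618", "MOD:411232"),
  Frequency = c(12, 20, 36, 51)
)

# Full join keeps modifications that occur in either table;
# the suffixes give the Frequency.1 / Frequency.2 columns asked for
merged <- full_join(identified, unidentified,
                    by = "modifications", suffix = c(".1", ".2"))

# Long format: one row per modification and source table
long <- pivot_longer(merged, cols = -modifications,
                     names_to = "Category", values_to = "Frequency")

# Assumption: a modification missing from one table counts as 0 there
long <- mutate(long, Frequency = replace_na(Frequency, 0))

# Stacked instead of dodged bars
p <- ggplot(long, aes(x = modifications, y = Frequency, fill = Category)) +
  geom_col(position = "stack")
print(p)
```

If the NA values are left in place, ggplot2 should simply drop those rows with a warning, so the replacement step is optional.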

I think this is what you want:

df3<-merge(df1,df2, by = "modifications",all = T)

library(reshape2)
df3<- melt(df3)
df3$variable<-factor(df3$variable,labels = c("modifications1","modifications2"))

library(ggplot2)
ggplot(df3, aes(x = modifications, y = value, fill = variable)) + 
  geom_bar(stat = "identity",position = "dodge")

Edit: added all = T to keep all the frequencies that appear in either table.
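As for why the `cbind`/`match` attempt in the question falls short, a likely explanation is that `match()` returns exactly one position per row of `df1`, so modifications that only occur in `df2` can never show up. The sketch below compares the two approaches on the reproducible example. The names `matched`, `df1_sum` and `df3_sum` are made up for illustration, and summing the duplicated `MOD:123` rows is an assumption about what the data should mean, not something stated in the question.

```r
df1 <- data.frame(
  modifications = c("MOD:214", "MOD:3", "MOD:24", "MOD:44",
                    "MOD:123", "MOD:123", "MOD:212"),
  Frequency = c(1, 41, 616, 727, 828, 8993, 383)
)
df2 <- data.frame(
  modifications = c("MOD:214", "MOD:3", "MOD:24", "MOD:445",
                    "MOD:12", "MOD:123", "MOD:212"),
  Frequency = c(1, 43, 64, 77, 88, 893, 38)
)

# Attempt from the question: one output row per row of df1
matched <- cbind.data.frame(
  df1, df2[match(df1$modifications, df2$modifications), ]
)
nrow(matched)                                  # same number of rows as df1
setdiff(df2$modifications, matched[[1]])       # keys lost from df2

# Full outer join: keys from both tables, NA where one side is missing
df3 <- merge(df1, df2, by = "modifications", all = TRUE,
             suffixes = c(".1", ".2"))
sum(is.na(df3$Frequency.1))                    # rows only present in df2
sum(is.na(df3$Frequency.2))                    # rows only present in df1

# df1 lists MOD:123 twice, so the join repeats the df2 value for it.
# Assumption: duplicated modifications should be summed before merging.
df1_sum <- aggregate(Frequency ~ modifications, data = df1, FUN = sum)
df3_sum <- merge(df1_sum, df2, by = "modifications", all = TRUE,
                 suffixes = c(".1", ".2"))
```

Whether to sum, average, or keep the duplicated rows depends on how the frequency tables were produced, so that step should be adapted to the real data.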