Using R for a 3-variable stacked barplot

Currently, I have this code:

### Uses the library ggplot2###
library("ggplot2")
library("reshape2")

### Reads in the CSV file to be plotted ###
plot <- read.csv("C:/Users/dam203/Desktop/Ongoing_Projects/output-1.csv")

### Makes R recognize that the X-axis has been pre-sorted so that GGPLOT2 does not sort alphabetically. ###
plot$Date <- factor(plot$Date, levels = plot$Date)

### Plots the Graph ###
ggplot(plot[which(plot$F.Sym.Onset > 0), ], aes(x = Date, y = F.Sym.Onset)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Epidemic Pertussis Case Curve")

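Here is a small sample of the data in the CSV file. The actual CSV file has many more columns and rows. Date, C.Sym.Onset, F.Sym.Onset, and D.Sym.Onset are the only columns I am interested in right now: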

Date    C.Sym.Onset   F.Sym.Onset   D.Sym.Onset     Temp
6-Jan        2              1                        47
7-Jan        1              3            2           57
8-Jan                                    1           54
9-Jan                                                58
10-Jan       1                                       59

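As written, the code above ignores dates that have no F.Sym.Onset value and plots the number of cases for each date in the order the dates appear in the file, rather than in ggplot2's default alphabetical order.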
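For readers unfamiliar with the ordering trick, here is a minimal sketch of why the factor() call matters. It uses only the five date labels from the sample above; the name ordered_dates is made up for illustration.

```r
# Date labels in the order they appear in the sample data
dates <- c("6-Jan", "7-Jan", "8-Jan", "9-Jan", "10-Jan")

# By default, factor() sorts its levels, so "10-Jan" would come first
default_levels <- levels(factor(dates))

# Passing the levels explicitly keeps the order from the file;
# unique() guards against an error if a date label is repeated
ordered_dates <- factor(dates, levels = unique(dates))
file_order_levels <- levels(ordered_dates)

print(default_levels)
print(file_order_levels)
```

ggplot2 lays out a discrete x-axis in level order, which is why setting the levels this way keeps the bars in file order.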

My question is: how can I plot C.Sym.Onset, F.Sym.Onset, and D.Sym.Onset together on a vertically stacked bar chart?

Here is a copy of the chart that my script currently produces from the CSV file containing the full data:

RPlot

Thanks for your help!

library(ggplot2)
library(tidyr)
library(dplyr)

df <- data.frame(Date = seq.Date(Sys.Date()-5, Sys.Date(), by = 'days'),
                 x = sample(c(1:3, NA), 6, replace = TRUE),
                 y = sample(c(1:3, NA), 6, replace = TRUE),
                 z = sample(c(1:3, NA), 6, replace = TRUE))
longdf <- gather(df, var, value, -Date)

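# Map Date to x and the counts to y so each date gets one stacked bar
ggplot(longdf, aes(x = Date, y = value, fill = var)) +
  geom_bar(stat = "identity")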

You need to gather the columns into a single variable. Then you can plot them as a stacked bar chart by specifying the fill with var.
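To make this concrete with the question's own columns, here is a minimal sketch. It assumes the five sample rows shown in the question (typed in by hand, with blank cells as NA) and uses tidyr's pivot_longer(), the newer equivalent of gather(). The names cases, long_cases, and p are made up for illustration.

```r
library(ggplot2)
library(tidyr)

# Sample rows copied from the question; blank cells are entered as NA
cases <- data.frame(
  Date        = c("6-Jan", "7-Jan", "8-Jan", "9-Jan", "10-Jan"),
  C.Sym.Onset = c(2, 1, NA, NA, 1),
  F.Sym.Onset = c(1, 3, NA, NA, NA),
  D.Sym.Onset = c(NA, 2, 1, NA, NA)
)

# Keep the file order on the x-axis instead of alphabetical order
cases$Date <- factor(cases$Date, levels = unique(cases$Date))

# One row per (Date, onset type) pair; missing counts are dropped
long_cases <- pivot_longer(
  cases,
  cols = c(C.Sym.Onset, F.Sym.Onset, D.Sym.Onset),
  names_to = "var",
  values_to = "value",
  values_drop_na = TRUE
)

# geom_col() is shorthand for geom_bar(stat = "identity")
p <- ggplot(long_cases, aes(x = Date, y = value, fill = var)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(p)
```

With real data, replace the hand-typed data frame with the result of read.csv(); the reshaping and plotting steps stay the same.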

I rewrote the code and was able to create a stacked bar chart with melt, using the following:

### Uses the library ggplot2###
library("ggplot2")
library("reshape2")

### Reads in the CSV file to be plotted ###
ModelOutput <- read.csv("C:/Users/dam203/Desktop/Ongoing_Projects/output-3.csv")

### Creates a new dataframe using only variables needed to plot the PECurve. ###
PECurve <- data.frame("Date" = ModelOutput$Date, 
                "C.Sym.Onset" = ModelOutput$C.Sym.Onset,
                "F.Sym.Onset" = ModelOutput$F.Sym.Onset,
                "D.Sym.Onset" = ModelOutput$D.Sym.Onset)

### Removes rows where C.Sym.Onset, F.Sym.Onset, and D.Sym.Onset are all 'NA'. ###
PECurve <- PECurve[!with(PECurve, is.na(C.Sym.Onset) & is.na(F.Sym.Onset) & is.na(D.Sym.Onset)), ]

### Makes R recognize that the X-axis has been pre-sorted so that GGPLOT2 does not sort alphabetically. ###
PECurve$Date <- factor(PECurve$Date, levels = PECurve$Date)

### Plots according to Day of Symptom Onset ###
df <- melt(PECurve, id.vars = "Date",
           measure.vars = c("C.Sym.Onset", "F.Sym.Onset", "D.Sym.Onset"))
ggplot(df, aes(x=Date, y=value, fill=variable)) +
  geom_bar(stat="identity") + theme(axis.text.x=element_text(angle=90, hjust=1))

An example of the final plot from the CSV looks like this:

GGPlot2