Color label over stacked histogram (in R)

Question

I would like to reproduce the stacked histogram below (from Armstrong A., Microbiome, 2018).

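The plot itself is not a problem: I can order my relative abundances by PCoA coordinates. My problem is that I cannot find a way to label the top of each column/stack with some clinical data (in this case sexual orientation and HIV status). Is this done in ggplot or with something else? How can I draw the rows on top? Thanks!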

[Edit] Some data to try:

Histogram (already ordered by PCoA)

Pt  Streptococcus   Staphylococcus  Lactobacillus   Acinetobacter   Pseudomonas Bacillus
patient1    61.20    5.65    7.45    1.65    0.30    0.60
patient6    43.00    2.10   18.10    0.40    0.60    0.60
patient5    41.95    4.10   24.55    0.75    0.90    0.00
patient8    41.15   25.95    3.50    0.20    7.45    0.30
patient4    26.45   55.10    2.55    3.40    0.05    2.85
patient7    18.20   26.40    0.95   20.25    0.50    0.05
patient3    18.00   18.70   38.55    0.10   56.55    0.00
patient2     0.35    0.05    2.10    0.20    0.40   94.75

Metadata to label on top

Pt Time
patient1    T1
patient2    T3
patient3    T4
patient4    T2
patient5    T2
patient6    T1
patient7    T1
patient8    T2

Answer 1

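OK, here is a fairly quick answer. I had some trouble preparing the data, so please ignore the preliminary data-prep code, as it is a bit messy. Basically, I would recommend the excellent patchwork library for creating combined plots. The syntax is very simple, yet it gives you fine-grained control over the output.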

Load the packages:

library(tidyverse)
library(patchwork)

Read the data:

df <- read.csv2(text = "cPt,  Streptococcus,   Staphylococcus,  Lactobacillus,   Acinetobacter,   Pseudomonas, Bacillus,
patient1,    61.20,    5.65,    7.45,    1.65,    0.30,    0.60,
patient6,    43.00,    2.10,   18.10,    0.40,    0.60,    0.60,
patient5,    41.95,    4.10,   24.55,    0.75,    0.90,    0.00,
patient8,    41.15,   25.95,    3.50,    0.20,    7.45,    0.30,
patient4,    26.45,   55.10,    2.55,    3.40,    0.05,    2.85,
patient7,    18.20,   26.40,    0.95,   20.25,    0.50,    0.05,
patient3,    18.00,   18.70,   38.55,    0.10,   5.655,    0.00,
patient2,     0.35,    0.05,    2.10,    0.20,    0.40,   94.75", 
header = TRUE, 
sep = ",", 
stringsAsFactors = FALSE
)

df2 <- read.csv2(text = "cPt, Time
patient1,    T1
patient2,    T3
patient3,    T4
patient4,    T2
patient5,    T2
patient6,    T1
patient7,    T1
patient8,    T2", 
header = TRUE, sep = ","
)
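As a side note on the messy prep: `read.csv2()` defaults to `dec = ","`, which is most likely why the abundance columns arrive as text and need `as.numeric()` later, and the trailing commas are what create the empty `X` column. A minimal sketch of a cleaner read, assuming you are free to drop the trailing commas (the three-row excerpt and the names `txt` and `abund` are only for illustration; the rest of this answer keeps using `df` as read above):

```r
# Excerpt of the abundance table (first three rows, three genera),
# written without trailing commas. Illustration only.
txt <- "Pt,Streptococcus,Staphylococcus,Lactobacillus
patient1, 61.20, 5.65, 7.45
patient6, 43.00, 2.10, 18.10
patient5, 41.95, 4.10, 24.55"

# read.csv() uses sep = "," and dec = ".", so the numbers parse as numeric;
# strip.white = TRUE removes the padding around each field.
abund <- read.csv(text = txt, strip.white = TRUE, stringsAsFactors = FALSE)

str(abund)
```

With numeric columns from the start, the `as.numeric()` calls in the later steps would no longer be needed.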

Wrangle the data and compute the "other" bacteria:

df %>% 
    select(-X) %>%                                    # drop the empty column created by the trailing commas
    mutate(across(-cPt, as.numeric)) %>%              # the abundances were read in as text
    mutate(other = 100 - rowSums(across(-cPt))) %>%   # remainder up to 100 %
    pivot_longer(-cPt) %>% 
    left_join(df2, by = "cPt") -> df.complete
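One thing to watch: ggplot2 sorts a character x variable alphabetically, so the PCoA order of the patients would be lost in the plots. A minimal sketch of how the order could be pinned with factor levels (the vector `cPt` below is a toy stand-in for the `cPt` column, and `pcoa_order` is a name made up for this example):

```r
# Sample order as it appears in the PCoA-sorted table from the question
pcoa_order <- c("patient1", "patient6", "patient5", "patient8",
                "patient4", "patient7", "patient3", "patient2")

# Toy stand-in for the cPt column of the long table (two rows per patient),
# deliberately in alphabetical order
cPt <- rep(sort(pcoa_order), each = 2)

# A factor with explicit levels is drawn by ggplot2 in level order,
# not alphabetically
cPt_ordered <- factor(cPt, levels = pcoa_order)

levels(cPt_ordered)
```

In the pipeline this would probably amount to adding `mutate(cPt = factor(cPt, levels = df$cPt))` after the join, so that both plots share the same x order.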

Create the bottom plot:

df.complete %>%
    ggplot(aes(x = cPt, y = as.numeric(value), fill = name)) + 
    geom_col(width = 1) + 
    theme_minimal() + 
    theme(legend.position = "bottom")  + 
    scale_fill_brewer(palette = "Set3") -> plot1

Create the top plot:

df.complete %>%
    ggplot(aes(x = cPt, fill = Time)) + 
    geom_bar(width = 1) + 
    theme_void() + 
    theme(legend.position = "top") + 
    scale_fill_brewer(palette = "Set1") -> plot2

Patch the plots together:

plot2 + plot1 + plot_layout(ncol = 1, heights = c(1, 10))
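The question mentions two clinical variables, so here is a rough sketch of how the same idea might extend to several annotation rows, using `geom_tile()` with one row per variable. Everything in it is a toy: the `Group` variable and its A/B values are invented, the "other" values are just 100 minus Streptococcus, and the names `meta`, `abund`, `strip`, `bars` and `combined` are only for this example.

```r
library(ggplot2)
library(patchwork)

# First four patients, in the PCoA order from the question
samples <- c("patient1", "patient6", "patient5", "patient8")

# Hypothetical metadata in long form: one row per sample and variable.
# "Group" and its A/B values are invented for illustration.
meta <- data.frame(
  sample   = factor(rep(samples, times = 2), levels = samples),
  variable = rep(c("Time", "Group"), each = length(samples)),
  value    = c("T1", "T1", "T2", "T2", "A", "B", "A", "B")
)

# Toy abundances: Streptococcus from the question, the rest lumped as "other"
abund <- data.frame(
  sample = factor(rep(samples, each = 2), levels = samples),
  taxon  = rep(c("Streptococcus", "other"), times = length(samples)),
  value  = c(61.20, 38.80, 43.00, 57.00, 41.95, 58.05, 41.15, 58.85)
)

# Annotation strip: one tile per sample and metadata variable
strip <- ggplot(meta, aes(x = sample, y = variable, fill = value)) +
  geom_tile(colour = "white") +
  theme_minimal() +
  theme(axis.text.x = element_blank(),
        axis.title = element_blank(),
        panel.grid = element_blank(),
        legend.position = "top")

# Stacked bars, sharing the same factor levels on x
bars <- ggplot(abund, aes(x = sample, y = value, fill = taxon)) +
  geom_col(width = 1) +
  theme_minimal() +
  theme(legend.position = "bottom")

# Stack the strip over the bars; the strip gets about a tenth of the height
combined <- strip / bars + plot_layout(heights = c(1, 10))
combined
```

Because both panels use the same factor levels on the x axis, the tiles should line up with the bars. With real data each clinical variable would probably want its own colour scale, which this sketch does not handle.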
