Formatting a data frame for a ggalluvial plot between 2 categorical variables?

I have a data frame with three categorical variables.

The data frame has many rows, one per individual. Here are the first 20 rows:

classification1 <- c(4, 3, 1, 2, 3, 1, 2, 2, 2, 2, 1, 1, 4, 2, 2, 1, 2, 1, 3, 2)
classification2 <- c("Medium", "Medium", "Low", "High", "High", "Low", "Medium", "Medium", "High", "Low", "Low", "Low", "High", "High", "Medium", "Low", "Medium", "Low", "Medium", "Medium")
survival <- c(2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 2, 1, 2, 2, 1, 2, 1)
df <- data.frame(classification1, classification2, survival)

I would like to use ggalluvial and ggplot2 to build an alluvial plot from this data, but I don't know how!

The code below returns an error (Error in FUN(X[[i]], ...) : objet 'Freq' introuvable, which is French for "object 'Freq' not found") because I don't know what "Freq" is supposed to be:

ggplot(data = df, aes(axis1 = classification1, axis2 = classification2, y = Freq)) +
  scale_x_discrete(limits = c("classification1", "classification2"), expand = c(.2, .05)) +
  geom_alluvium(aes(fill = survival)) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)))

Any clue on how to format my data frame to fit ggalluvial?

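The y aesthetic needs a column holding the number of individuals in each combination of categories, and your data frame has no such column. You can add a frequency column using, for example, dplyr::count: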

library(ggalluvial)  # also attaches ggplot2
library(dplyr)

# Collapse to one row per observed combination; the count goes in "Freq"
df <- df %>% 
  count(classification1, classification2, survival, name = "Freq")

ggplot(data = df, aes(axis1 = classification1, axis2 = classification2, y = Freq)) +
  scale_x_discrete(limits = c("classification1", "classification2"), expand = c(.2, .05)) +
  # survival is numeric, so convert it to a factor to get a discrete fill
  geom_alluvium(aes(fill = factor(survival))) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)))
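
If you would rather not depend on dplyr, base R can build the same kind of frequency table. The following is a minimal sketch, assuming you start from the original one-row-per-individual data frame in the question; `freq_df` is only an illustrative name. Calling `as.data.frame()` on a `table` names its count column `Freq`, which is probably where the `Freq` in many ggalluvial examples comes from.

```r
# Rebuild the example data from the question (one row per individual).
classification1 <- c(4, 3, 1, 2, 3, 1, 2, 2, 2, 2, 1, 1, 4, 2, 2, 1, 2, 1, 3, 2)
classification2 <- c("Medium", "Medium", "Low", "High", "High", "Low", "Medium",
                     "Medium", "High", "Low", "Low", "Low", "High", "High",
                     "Medium", "Low", "Medium", "Low", "Medium", "Medium")
survival <- c(2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 2, 1, 2, 2, 1, 2, 1)
df <- data.frame(classification1, classification2, survival)

# Cross-tabulate the three variables. as.data.frame() on a table returns
# one row per combination, with the count in a column named "Freq".
freq_df <- as.data.frame(table(df))

# Keep only the combinations that actually occur in the data.
freq_df <- subset(freq_df, Freq > 0)

head(freq_df)
```

`freq_df` can then be passed to the same `ggplot()` call as above. Note that all three variables come back as factors here, so the `factor()` wrapper around `survival` becomes optional.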