How to add compact letter display to ggboxplot()?

Question

I am trying to add a compact letter display to the boxplots I created. Is there a way to combine the cldList() function with ggboxplot()?

Here is my sample data:

library(FSA)
library(multcompView)
library(rcompanion)
library(ggplot2)
library(ggpubr)
library(tidyr)

df_list <- list(
  `1.3.A` = 
    tibble::tribble(
      ~Person, ~Height, ~Weight,
      "Alex",    175,     75,
      "Gerard",    110,     85,
      "Clyde",    120,     79
    ),
  `2.2.A` = 
    tibble::tribble(
      ~Person, ~Height, ~Weight,
      "Missy",    162,     55,
      "Britany",    111,     56,
      "Sussie",    192,     85
    ), 
  `1.1.B` = 
    tibble::tribble(
      ~Person, ~Height, ~Weight,
      "Luke",    177,     66,
      "Alex",    169,     69,
      "Haley",    145,     54
    )
)


lapply(df_list, function(i) ggboxplot(i, x = "Person", y = c("Height", "Weight"), combine = TRUE))
lapply(df_list, function(k) dunnTest(Weight ~ as.factor(Person), method = "bh", data = k))
lapply(df_list, function(i) cldList(P.adj ~ Comparison, threshold = 0.05))

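I am trying to add significance letters for each Person. In my original data I have 30 groups to compare, and adding a compact letter display to the boxplots would make the data much easier to interpret.

I also have multiple data frames in a list, and I would like to know whether cldList() can be used inside an lapply() call.

I hope someone can help.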

Answer 1

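I am the author of the example you already mentioned in a comment. One of the commenters has already correctly pointed out where the .group column comes from. However, while going through the general differences between your code and the code in the example, I noticed the following: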

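Data

1a. Your data has one observation per factor level (= person).

1b. My data has multiple observations per factor level (= group).

Mean comparison

2a. You use FSA::dunnTest() to fit the model and compare the factor levels in a single step.

2b. I use lm() to fit the model and then emmeans::emmeans() to compare the means per factor level.

Compact letter display

3a. You use rcompanion::cldList() to get the letters.

3b. I use multcomp::cld() to get the letters.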
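For reference, the route described in 2b and 3b can be sketched roughly as follows on the built-in PlantGrowth data. This is a minimal sketch, not a verbatim copy of the linked example: the object names (model, model_means, model_cld) and the choice of a Tukey adjustment at alpha = 0.05 are assumptions made for illustration.

```r
library(emmeans)
library(multcomp)      # provides the cld() generic
library(multcompView)  # needed by cld() to compute the letters

# Fit a linear model with several observations per group
model <- lm(weight ~ group, data = PlantGrowth)

# Estimated marginal means per factor level
model_means <- emmeans::emmeans(model, specs = "group")

# Compact letter display; the letters end up in the .group column
model_cld <- multcomp::cld(model_means,
                           adjust  = "Tukey",  # assumed adjustment
                           Letters = letters,
                           alpha   = 0.05)
model_cld
```

The .group column of the resulting table is the one the commenter referred to.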

I think points 2 and 3 are perfectly fine: they are just different functions leading to the same goal. I tried your approach on my data and it works:

dunnTest_out <- FSA::dunnTest(weight ~ as.factor(group), method = "bh", data = PlantGrowth)
rcompanion::cldList(P.adj ~ Comparison, data = dunnTest_out$res, threshold = 0.05)
#>   Group Letter MonoLetter
#> 1  ctrl     ab         ab
#> 2  trt1      a         a 
#> 3  trt2      b          b
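Once the letters exist, one way to put them on a ggboxplot() is to add a text layer. The following is a sketch under assumptions rather than a definitive recipe: placing each letter slightly above the group maximum, the 0.2 offset, and the names y_max and label_df are all illustrative choices.

```r
library(FSA)
library(rcompanion)
library(ggplot2)
library(ggpubr)

# Letters from Dunn's test, as in the call shown above
dunnTest_out <- FSA::dunnTest(weight ~ group, method = "bh", data = PlantGrowth)
cld_df <- rcompanion::cldList(P.adj ~ Comparison,
                              data = dunnTest_out$res, threshold = 0.05)

# One y position per group: the group maximum (assumed placement rule)
y_max <- aggregate(weight ~ group, data = PlantGrowth, FUN = max)
names(y_max) <- c("Group", "y")
y_max$Group <- as.character(y_max$Group)

# Join letters and positions on the group name
label_df <- merge(cld_df, y_max, by = "Group")
label_df$y <- label_df$y + 0.2  # small offset so the letter clears the box

# Boxplot plus one letter per group
p <- ggboxplot(PlantGrowth, x = "group", y = "weight") +
  geom_text(data = label_df,
            aes(x = Group, y = y, label = Letter),
            inherit.aes = FALSE)
p
```

Note that cldList() edits group names by default (for example via remove.space and remove.zero), so the join on Group only works as written when the names come through unchanged, as they do here.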

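However, I think your data is the problem. If your "means" are not actually means but single values, you should not be able to compare "means" with each other, or even to perform a test (whose results could then be presented via a compact letter display).

I reduced your example to one of the datasets: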

dat_1.1.B <- 
  tibble::tribble(
    ~Person, ~Height, ~Weight,
    "Luke",    177,     66,
    "Alex",    169,     69,
    "Haley",    145,    54
  )

dunnTest_out <- FSA::dunnTest(Weight ~ as.factor(Person), method = "bh", data = dat_1.1.B)
dunnTest_out
#> Dunn (1964) Kruskal-Wallis multiple comparison
#>   p-values adjusted with the Benjamini-Hochberg method.
#>     Comparison          Z   P.unadj     P.adj
#> 1 Alex - Haley  1.4142136 0.1572992 0.4718976
#> 2  Alex - Luke  0.7071068 0.4795001 0.4795001
#> 3 Haley - Luke -0.7071068 0.4795001 0.7192502

rcompanion::cldList(P.adj ~ Comparison, data = dunnTest_out$res, threshold = 0.05)
#> Error: No significant differences.

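Note that when I change one of the Weight values to a much larger number, the p-values do not change at all. Dunn's test is rank-based, so with a single value per person it only sees the ordering of those values, and it is clear that the test cannot do anything meaningful here.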

dat_1.1.B <- 
  tibble::tribble(
    ~Person, ~Height, ~Weight,
    "Luke",    177,     66,
    "Alex",    169,     100000069,
    "Haley",    145,     54
  )

dunnTest_out <- FSA::dunnTest(Weight ~ as.factor(Person), method = "bh", data = dat_1.1.B)
dunnTest_out
#> Dunn (1964) Kruskal-Wallis multiple comparison
#>   p-values adjusted with the Benjamini-Hochberg method.
#>     Comparison          Z   P.unadj     P.adj
#> 1 Alex - Haley  1.4142136 0.1572992 0.4718976
#> 2  Alex - Luke  0.7071068 0.4795001 0.4795001
#> 3 Haley - Luke -0.7071068 0.4795001 0.7192502

rcompanion::cldList(P.adj ~ Comparison, data = dunnTest_out$res, threshold = 0.05)
#> Error: No significant differences.
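A likely reason the output is identical in both runs is that Dunn's test works on the ranks of the response, not on the raw values. The short check below illustrates this with the two Weight vectors used above; the vector names w_original and w_changed are made up for this sketch.

```r
# Weight values from the two versions of dat_1.1.B
w_original <- c(Luke = 66, Alex = 69,        Haley = 54)
w_changed  <- c(Luke = 66, Alex = 100000069, Haley = 54)

# Ranks are the only information a rank-based test uses
rank(w_original)
rank(w_changed)

# Alex has the largest value in both versions, so the ranks agree
identical(rank(w_original), rank(w_changed))  # TRUE
```

With one observation per person, each "group" is just a single rank, so there is no within-group variation for the test to work with.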

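So yes, I think it is the data. Note that the error "No significant differences" also seems odd to me, because, assuming the test was done correctly, no significant differences would simply mean that all groups share the same letter.

tl;dr: The data is the problem. If your "means" are just single values per group, you cannot compare means via a test. If you have the raw data that was used to obtain the single value per group, you should feed that into the model instead, as is done in my example and in the examples of ?FSA::dunnTest and ?rcompanion::cldList.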
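As for the lapply() part of the question: once every data frame in the list holds several observations per group, the whole pipeline can be wrapped in a function and applied to the list. The sketch below is only an illustration under assumptions. It uses two built-in datasets (PlantGrowth and chickwts) as stand-ins for the real raw data, the column names group and value are invented, and plot_with_cld() is a hypothetical helper, not a function from any package.

```r
library(FSA)
library(rcompanion)
library(ggplot2)
library(ggpubr)

# Stand-in data (assumption): replicated observations per group,
# renamed to the shared column names "group" and "value"
df_list <- list(
  PlantGrowth = data.frame(group = PlantGrowth$group, value = PlantGrowth$weight),
  chickwts    = data.frame(group = chickwts$feed,     value = chickwts$weight)
)

# Hypothetical helper: Dunn's test -> letters -> labelled boxplot
plot_with_cld <- function(df) {
  dunn_out <- FSA::dunnTest(value ~ group, data = df, method = "bh")

  # cldList() stops with an error when nothing is significant,
  # so fall back to a plot without letters in that case
  cld_df <- tryCatch(
    rcompanion::cldList(P.adj ~ Comparison, data = dunn_out$res, threshold = 0.05),
    error = function(e) NULL
  )

  p <- ggboxplot(df, x = "group", y = "value")
  if (is.null(cld_df)) return(p)

  # Place each letter a little above its group maximum (assumed rule)
  y_max <- aggregate(value ~ group, data = df, FUN = max)
  names(y_max) <- c("Group", "y")
  y_max$Group <- as.character(y_max$Group)
  label_df <- merge(cld_df, y_max, by = "Group")
  label_df$y <- label_df$y + 0.05 * diff(range(df$value))

  p + geom_text(data = label_df,
                aes(x = Group, y = y, label = Letter),
                inherit.aes = FALSE)
}

# One labelled boxplot per data frame in the list
plots <- lapply(df_list, plot_with_cld)
```

The tryCatch() fallback is a design choice made here because of the "No significant differences" error discussed above; whether a plot without letters is the right fallback depends on the use case.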
