如何在不改变p值调整方法的情况下按组分离multcomp中的紧凑型字母显示(CLD)?

How to separate compact letter display (CLD) in multcomp by group without changing the p-value adjustment method?

问题

我想绘制三因素因子实验的估计边际均值,其中字母表示显着不同的均值,并针对多重比较进行了调整。我目前的工作流程是用 lmer() 拟合模型,用 emmeans() 计算估计的边际均值,然后用 cld().

实现紧凑的字母显示算法

我的问题是,当您在同一图上绘制所有三向交互时,图表太忙了。所以我想拆分情节并为每个子情节生成不同的字母集,从“a”开始。问题是,当我在 cld 中使用 by 参数将其拆分时,它会对每个 by 组内的多重比较进行单独更正。因为现在每个组中的测试较少,所以这会导致不那么保守的修正。但是,如果我尝试在没有 by 组的情况下手动拆分 cld() 的输出,我将不得不为每个子图手动重新实现字母算法。我想我可以这样做,但它看起来很麻烦。我正在尝试与客户共享此代码供他稍后修改,因此该解决方案可能太复杂了。有没有人有一个简单的方法:

  1. 获取 cld() 的输出以对所有 by 组使用一个组合校正。
  2. 使用相对简单的方法,将每个子组的紧凑字母显示减少到最少的必要字母数。

可重现的例子

加载包和数据。

library(lme4)
library(emmeans)
library(multcomp)

dat <- structure(list(y = c(2933.928571, 930.3571429, 210.7142857, 255.3571429, 
                            2112.5, 1835.714286, 1358.928571, 1560.714286, 9192.857143, 3519.642857, 
                            2771.428571, 7433.928571, 4444.642857, 3025, 3225, 2103.571429, 
                            3876.785714, 925, 1714.285714, 3225, 1783.928571, 2223.214286, 
                            2537.5, 2251.785714, 7326.785714, 5130.357143, 2539.285714, 6116.071429, 
                            5808.928571, 3341.071429, 2212.5, 7562.5, 3907.142857, 3241.071429, 
                            1294.642857, 4325, 4487.5, 2551.785714, 5648.214286, 3198.214286, 
                            1075, 335.7142857, 394.6428571, 1605.357143, 658.9285714, 805.3571429, 
                            1580.357143, 1575, 2037.5, 1721.428571, 1014.285714, 2994.642857, 
                            2116.071429, 800, 2925, 3955.357143, 9075, 3917.857143, 2666.071429, 
                            6141.071429, 3925, 1626.785714, 2864.285714, 7271.428571, 3432.142857, 
                            1826.785714, 514.2857143, 1319.642857, 1782.142857, 2637.5, 1355.357143, 
                            3328.571429, 1914.285714, 817.8571429, 1896.428571, 2121.428571, 
                            521.4285714, 360.7142857, 1114.285714, 1139.285714, 7042.857143, 
                            2371.428571, 2287.5, 4967.857143, 2180.357143, 1944.642857, 2408.928571, 
                            5289.285714, 7028.571429, 3080.357143, 5394.642857, 5973.214286, 
                            7323.214286, 1419.642857, 1455.357143, 4657.142857, 7069.642857, 
                            2451.785714, 4319.642857, 5562.5, 3953.571429, 1182.142857, 1957.142857, 
                            3796.428571, 1773.214286, 400, 871.4285714, 842.8571429, 657.1428571, 
                            1360.714286, 1853.571429, 1826.785714, 3405.357143, 2605.357143, 
                            5983.928571, 4935.714286, 4105.357143, 7666.071429, 3619.642857, 
                            5085.714286, 1592.857143, 1751.785714, 5992.857143, 2987.5, 794.6428571, 
                            3187.5, 825, 3244.642857), f1 = structure(c(4L, 4L, 4L, 4L, 4L, 
                                                                        4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
                                                                        3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
                                                                        2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 
                                                                        3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 
                                                                        3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 
                                                                        2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
                                                                        2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 
                                                                        1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", 
                                                                                                                                "B", "C", "D"), class = "factor"), f2 = structure(c(2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
                                                                                                                                                                                    1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
                                                                                                                                                                                    1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("foo", 
                                                                                                                                                                                                                                                    "bar"), class = "factor"), f3 = structure(c(4L, 3L, 2L, 1L, 3L, 
                                                                                                                                                                                                                                                                                                4L, 1L, 2L, 4L, 2L, 1L, 3L, 3L, 2L, 4L, 1L, 3L, 1L, 4L, 2L, 2L, 
                                                                                                                                                                                                                                                                                                4L, 3L, 1L, 2L, 4L, 1L, 3L, 2L, 3L, 1L, 4L, 3L, 4L, 1L, 2L, 3L, 
                                                                                                                                                                                                                                                                                                2L, 4L, 1L, 2L, 1L, 3L, 4L, 1L, 2L, 4L, 3L, 2L, 1L, 3L, 4L, 3L, 
                                                                                                                                                                                                                                                                                                1L, 4L, 2L, 4L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 3L, 4L, 1L, 2L, 1L, 
                                                                                                                                                                                                                                                                                                4L, 3L, 2L, 3L, 1L, 4L, 2L, 1L, 3L, 4L, 2L, 4L, 3L, 1L, 2L, 1L, 
                                                                                                                                                                                                                                                                                                3L, 4L, 2L, 3L, 1L, 4L, 2L, 4L, 1L, 3L, 2L, 2L, 3L, 4L, 1L, 4L, 
                                                                                                                                                                                                                                                                                                1L, 2L, 3L, 4L, 1L, 3L, 2L, 1L, 2L, 4L, 3L, 1L, 2L, 4L, 3L, 1L, 
                                                                                                                                                                                                                                                                                                4L, 2L, 3L, 1L, 3L, 4L, 2L, 1L, 3L, 2L, 4L), .Label = c("L1", 
                                                                                                                                                                                                                                                                                                                                                        "L2", "L3", "L4"), class = "factor"), block = structure(c(1L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("1", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          "2", "3", "4"), class = "factor")), row.names = c(NA, -128L), class = "data.frame")

拟合模型并获得估计的边际均值。

fit <- lmer(log10(y) ~ f1 * f2 * f3 + (1 | block), data = dat)
emm <- emmeans(fit, ~ f1 + f2 + f3, mode = 'Kenward-Roger', type = 'response')

版本 1

在这个版本中,我将CLD作为一个整体,正确地使用了Sidak调整进行了496次测试。但是,假设我只想绘制 f2 == 'bar' 的那些行。这些字母不再正确,因为有些是多余的(需要少于 8 个)。有没有什么函数可以把字母减下来?

cldisplay1 <- cld(emm, adjust = 'sidak', Letters = letters)
subset(as.data.frame(cldisplay1), f2 == 'bar') # correct comparisons but contains redundant letters

输出

   f1  f2 f3  response        SE df  lower.CL   upper.CL    .group
8   D bar L1  365.6732   76.1231 96  185.9699   719.0244  a       
24  D bar L3  582.8573  121.3349 96  296.4229  1146.0742  ab      
16  D bar L2  682.9238  142.1659 96  347.3136  1342.8353  ab      
7   C bar L1  898.1560  186.9714 96  456.7740  1766.0470  abcd    
6   B bar L1 1627.7069  338.8438 96  827.8006  3200.5652   bcdefg 
15  C bar L2 1635.4393  340.4534 96  831.7330  3215.7694   bcdefg 
32  D bar L4 1746.6052  363.5951 96  888.2685  3434.3552   bcdefg 
31  C bar L4 2348.6629  488.9270 96 1194.4562  4618.1832    cdefgh
21  A bar L3 2499.6772  520.3640 96 1271.2573  4915.1230    cdefgh
5   A bar L1 2545.4594  529.8946 96 1294.5407  5005.1448    cdefgh
23  C bar L3 2561.0138  533.1326 96 1302.4512  5035.7294    cdefgh
30  B bar L4 3158.6969  657.5538 96 1606.4140  6210.9556      efgh
22  B bar L3 3364.9438  700.4887 96 1711.3047  6616.4994      efgh
14  B bar L2 3411.4009  710.1598 96 1734.9313  6707.8482      efgh
13  A bar L2 3769.4223  784.6900 96 1917.0098  7411.8269      efgh
29  A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551         h

版本 2

在这个版本中,我使用 cld()by 参数来拆分 f2。这减少了每个组内的字母,但 Sidak 调整现在不那么保守了。例如,第 8 行和第 16 行在调整后的 alpha 水平上与上面的比较没有显着差异,但现在它们 不同。但我不想更改使用的测试,只是为了仅绘制数据的一个子集。有没有一种方法可以指定我作为一个整体调整的测试数量,即使 cld 被分成了 by 个组?

cldisplay2 <- cld(emm, adjust = 'sidak', by = 'f2', Letters = letters)
subset(as.data.frame(cldisplay2), f2 == 'bar')

输出

   f1  f2 f3  response        SE df  lower.CL   upper.CL  .group
8   D bar L1  365.6732   76.1231 96  185.9699   719.0244  a     
24  D bar L3  582.8573  121.3349 96  296.4229  1146.0742  ab    
16  D bar L2  682.9238  142.1659 96  347.3136  1342.8353  abc   
7   C bar L1  898.1560  186.9714 96  456.7740  1766.0470  abcd  
6   B bar L1 1627.7069  338.8438 96  827.8006  3200.5652   bcde 
15  C bar L2 1635.4393  340.4534 96  831.7330  3215.7694   bcde 
32  D bar L4 1746.6052  363.5951 96  888.2685  3434.3552    cde 
31  C bar L4 2348.6629  488.9270 96 1194.4562  4618.1832     de 
21  A bar L3 2499.6772  520.3640 96 1271.2573  4915.1230     def
5   A bar L1 2545.4594  529.8946 96 1294.5407  5005.1448     def
23  C bar L3 2561.0138  533.1326 96 1302.4512  5035.7294     def
30  B bar L4 3158.6969  657.5538 96 1606.4140  6210.9556      ef
22  B bar L3 3364.9438  700.4887 96 1711.3047  6616.4994      ef
14  B bar L2 3411.4009  710.1598 96 1734.9313  6707.8482      ef
13  A bar L2 3769.4223  784.6900 96 1917.0098  7411.8269      ef
29  A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551       f

对于两个单独的 table(或绘图?),您总共显示了 90 + 90 = 180 次比较。如果您想要对所有这 180 次比较进行整体多重性调整,则需要比 496 次比较少得多的保守。但是,可以指定 level 的不同值,以便 Sidak 调整正确进行。例如,如果您希望整体 alpha 为 0.05,请使用

cld(emm, adjust = 'sidak', by = 'f2', Letters = letters, 
    alpha = 1 - sqrt(0.95))

由此,您指定 alpha = 0.02532。注意如果

p.adj  =  1 - (1 - p)^90  <  1 - sqrt(.95)

然后

(1 - p)^90 > sqrt(.95)

所以

(1 - p)^180 > .95

因此

1 - (1 - p)^180  < .05

也就是说,通过将 CLD table 分成两部分,每部分显示 90 次比较,我们正确地应用 Sidak 调整来校正显着性水平为 0.05 的 180 次比较总数。

增强

基于此的另一个想法导致不太保守的调整是指定 Tukey 调整:

cld(emm, adjust = 'tukey', by = 'f2', Letters = letters, 
    alpha = 1 - sqrt(0.95))

因此,每个单独的 table 都有一个精确的全族错误率 1 - sqrt(0.05);并且我们使用了Sidak adjustment(略显保守),使得全族180次测试的错误率小于0.05。