Plotting results with missing categories in interaction with emmeans

Question

I have fairly "messy data". I have a model with an interaction between two factors, and I want to plot it. So:

f1 <- structure(list(tipo = c("digitables", "digitables", "digitables", 
"digitables", "digitables", "digitables", "digitables", "digitables", 
"payments", "payments", "payments", "payments", "payments", "payments", 
"payments", "payments", "traditionals", "traditionals", "traditionals", 
"traditionals", "traditionals", "traditionals", "traditionals", 
"traditionals"), categoria = c("Advice", "Digital banks", "Exchange", 
"FinTech", "Insurance", "Investments", "Lending", "Payments and transfers", 
"Advice", "Digital banks", "Exchange", "FinTech", "Insurance", 
"Investments", "Lending", "Payments and transfers", "Advice", 
"Digital banks", "Exchange", "FinTech", "Insurance", "Investments", 
"Lending", "Payments and transfers"), Total = c(63L, 450L, 279L, 
63L, 36L, 108L, 567L, 549L, 63L, 450L, 279L, 63L, 36L, 108L, 
567L, 549L, 35L, 250L, 155L, 35L, 20L, 60L, 315L, 305L), Frequencia = c(44L, 
266L, 118L, 9L, 14L, 45L, 134L, 242L, 33L, 68L, 2L, 10L, 3L, 
8L, 11L, 78L, 27L, 226L, 142L, 10L, 20L, 45L, 300L, 245L), Perc = c(69.84, 
59.11, 42.29, 14.29, 38.89, 41.67, 23.63, 44.08, 52.38, 15.11, 
0.72, 15.87, 8.33, 7.41, 1.94, 14.21, 77.14, 90.4, 91.61, 28.57, 
100, 75, 95.24, 80.33), Failure = c(19L, 184L, 161L, 54L, 22L, 
63L, 433L, 307L, 30L, 382L, 277L, 53L, 33L, 100L, 556L, 471L, 
8L, 24L, 13L, 25L, 0L, 15L, 15L, 60L)), row.names = c(NA, -24L
), class = "data.frame")
# Packages
library(dplyr)
library(ggplot2)
library(emmeans)  # version 1.4.8 or 1.5.1

# Works as expected
m1 <- glm(cbind(Frequencia, Failure) ~ tipo * categoria,
          data = f1, family = binomial(link = "logit"))
l1 <- emmeans(m1, ~ categoria | tipo)
plot(l1, type = "response",
     comparisons = TRUE,
     by = "categoria")

Using by = "tipo" results in:

# Doesn't work:
plot(l1, type = "response",
     comparisons = TRUE,
     by = "tipo")
Error: Aborted -- Some comparison arrows have negative length!
In addition: Warning message:
Comparison discrepancy in group digitables, Advice - Insurance:
    Target overlap = -0.0241, overlap on graph = 0.0073

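If I use comparisons = FALSE, as the "Explanations supplement" vignette suggests, it works. However, it then does not show the arrows, which are very important to me.

Q1 - Is there a workaround? (Or is this simply not possible with my data?)

As the plot above shows, one combination has a probability of 1 (categoria = Insurance and tipo = traditionals). So I deleted just that row from the data frame and tried to redo the plot, which resulted in: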

f1 <- f1 %>%
  filter(Perc != 100)
m1 <- glm(cbind(Frequencia, Failure) ~ tipo * categoria,
          data = f1, family = binomial(link = "logit"))
l1 <- emmeans(m1, ~ categoria | tipo)
plot(l1, type = "response",
     comparisons = TRUE,
     by = "categoria")
Error in if (dif[i] > 0) lmat[i, id1[i]] = rmat[i, id2[i]] = wgt * v1[i] else rmat[i,  : 
  missing value where TRUE/FALSE needed

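Q2 - How can I plot my results even though one level of a variable is missing (relative to the other variable)? I would like the Insurance panel to show only the payments and digitables levels (with the other panels unchanged).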

Answer 1

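First, please never re-use the same variable name for more than one thing; it makes things irreproducible. If you modify a dataset, a model, or anything else, give it a new name so that it can be distinguished from the original.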

Q1

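As documented, comparison arrows cannot always be computed. This is one such example. I suggest displaying the results in some other way, such as with pwpp() or pwpm().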
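As a rough sketch of that suggestion, the following refits the question's model on the original, unfiltered data and calls pwpp() and pwpm() on the result. The names f0, m0, l0 and p0 are introduced here only for illustration (they are not in the original post), and the data frame is just the question's data rebuilt compactly with rep(). Because one cell has zero failures, glm() may warn about fitted probabilities of 0 or 1.

```r
library(emmeans)

# The question's original (unfiltered) data, rebuilt compactly.
# f0, m0, l0, p0 are illustrative names, not from the original post.
tipos <- c("digitables", "payments", "traditionals")
categorias <- c("Advice", "Digital banks", "Exchange", "FinTech",
                "Insurance", "Investments", "Lending",
                "Payments and transfers")
f0 <- data.frame(
  tipo = rep(tipos, each = 8),
  categoria = rep(categorias, times = 3),
  Frequencia = c(44, 266, 118, 9, 14, 45, 134, 242,
                 33, 68, 2, 10, 3, 8, 11, 78,
                 27, 226, 142, 10, 20, 45, 300, 245),
  Failure = c(19, 184, 161, 54, 22, 63, 433, 307,
              30, 382, 277, 53, 33, 100, 556, 471,
              8, 24, 13, 25, 0, 15, 15, 60)
)

# Same model and emmeans specification as in the question
m0 <- glm(cbind(Frequencia, Failure) ~ tipo * categoria,
          data = f0, family = binomial(link = "logit"))
l0 <- emmeans(m0, ~ categoria | tipo)

# Pairwise P-value plot, one panel per level of tipo
p0 <- pwpp(l0, type = "response")
print(p0)

# Pairwise P-value matrix, one matrix per level of tipo
pwpm(l0)
```

Both displays are built from the pairwise comparisons themselves rather than from arrows, so they do not run into the negative-arrow-length problem.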

Q2

There was a bug in the handling of the missing case. It has been fixed in the GitHub version:

f2 <- f1 %>%
    filter(Perc != 100)
m2 <- glm(cbind(Frequencia, Failure) ~ tipo * categoria,
          data = f2, family = binomial(link = "logit"))
l2 <- emmeans(m2, ~ categoria | tipo)

plot(l2, type = "response",
     comparisons = TRUE,
     by = "categoria")

plot(l2, type = "response",
     comparisons = TRUE,
     by = "tipo")

## Error: Aborted -- Some comparison arrows have negative length!
## (in group "payments")
