R Plotly jittered boxplot with NAs width

I am using the following function to plot a grouped boxplot with jitter:

plot_boxplot <- function(dat) {
  # take one row per joined_group so that every group can be plotted
  allx <- dat %>% 
    mutate(y = median(y, na.rm = TRUE)) %>%
    group_by(joined_group) %>% 
    sample_n(1) %>% 
    ungroup()

  p <- dat %>%
    plotly::plot_ly() %>%
    # plotting all the groups 1:20
    plotly::add_trace(data = allx, 
                      x = ~as.numeric(joined_group),
                      y = ~y,
                      type = "box",
                      hoverinfo = "none",
                      boxpoints = FALSE,
                      color = NULL,
                      opacity = 0,
                      showlegend = FALSE) %>% 
    # plotting the boxes
    plotly::add_trace(data = dat, 
                      x = ~as.numeric(joined_group),
                      y = ~y,
                      color = ~group1,
                      type = "box",
                      hoverinfo = "none",
                      boxpoints = FALSE,
                      showlegend = FALSE) %>% 
    # adding ticktext
    layout(xaxis = list(tickvals = 1:20,
                        ticktext = rep(levels(dat$group1), each = 4)))

  p <- p %>%
    # adding jittering
    add_markers(data = dat,
                x = ~jitter(as.numeric(joined_group), amount = 0.2),
                y = ~y,
                color = ~group1,
                showlegend = FALSE)
  p

}

The problem is that when some levels have NA as the y variable, the width of the jittered boxes changes. Here is an example:

library(plotly)
library(dplyr)
set.seed(123)
dat <- data.frame(group1 = factor(sample(letters[1:5], 100, replace = TRUE)),
                  group2 = factor(sample(LETTERS[21:24], 100, replace = TRUE)),
                  y = runif(100)) %>% 
  dplyr::mutate(joined_group = factor(
    paste0(group1, "-", group2)
  ))

# do the plot with all the levels
p1 <- plot_boxplot(dat)

# now set y to NA for every row where group1 is "e"
dat$y[dat$group1 == "e"] <- NA

# create the plot with missing data
p2 <- plot_boxplot(dat)

# creating the subplot to see that the width has changed:
subplot(p1, p2, nrows = 2)

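The result is that the box widths differ between the two plots:

I found that the boxes are the same size without the jitter, so I know the jitter is "messing" with the width, but I don't know how to fix it.

Does anyone know how to make the widths exactly the same in both jittered plots?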

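I see two separate shifts in the plot:

  1. one due to the jitter
  2. one due to the NAs

The first can be solved by declaring a new jitter function with a fixed seed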

fixed_jitter <- function (x, factor = 1, amount = NULL) {
  set.seed(42)
  jitter(x, factor, amount)
}

and using it instead of jitter in the add_markers call.
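One side effect of calling set.seed inside a helper is that it also resets the random number stream for the rest of the session. The following is a minimal sketch of a variant that keeps the fixed seed but restores the caller's RNG state afterwards. The seed argument and the save/restore logic are assumptions added for illustration and are not part of the function above.

```r
# Hypothetical variant of fixed_jitter: same fixed seed, but the caller's
# RNG state is put back on exit (an illustrative addition).
fixed_jitter <- function(x, factor = 1, amount = NULL, seed = 42) {
  has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has_seed) {
    # Remember the current RNG state and restore it when the function exits.
    old_seed <- get(".Random.seed", envir = globalenv())
    on.exit(assign(".Random.seed", old_seed, envir = globalenv()), add = TRUE)
  }
  set.seed(seed)
  jitter(x, factor = factor, amount = amount)
}

# Stand-in for as.numeric(joined_group): 5 groups with 4 points each.
x <- rep(1:5, each = 4)
a <- fixed_jitter(x, amount = 0.2)
b <- fixed_jitter(x, amount = 0.2)

identical(a, b)          # the same input gets the same offsets on every call
all(abs(a - x) <= 0.2)   # offsets stay within +/- amount
```

Because the offsets depend only on the seed and the length of x, two plots built from data frames with the same rows in the same order get identical marker positions.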

The second problem can be solved by assigning -1 instead of NA and setting

yaxis = list(range = c(0, ~max(1.1 * y)))

as the second argument to layout.
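As a rough sketch of that second step, the NA values can be swapped for the sentinel before plotting, with the axis range computed up front. The toy data frame and the names y_upper and y_range are made up for illustration. The upper limit here is computed as a plain number rather than with the formula used above, which is a substitution on my part. The idea, as far as I can tell, is that the rows stay in every trace but fall below the visible range.

```r
# Toy data standing in for `dat` from the question (values are made up).
dat <- data.frame(
  group1 = factor(c("a", "a", "e", "e")),
  y      = c(0.2, 0.8, NA, NA)
)

# Upper axis limit taken from the observed (non-missing) values only.
y_upper <- 1.1 * max(dat$y, na.rm = TRUE)

# Replace NA with a sentinel below the visible range, so the rows are kept
# in the traces but are drawn outside the axis.
dat$y[is.na(dat$y)] <- -1

y_range <- c(0, y_upper)

# Apply the fixed range to a plot when plotly is installed.
if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(dat, x = ~as.numeric(group1), y = ~y,
                       type = "scatter", mode = "markers")
  p <- plotly::layout(p, yaxis = list(range = y_range))
}
```

In plot_boxplot the same range would go into the existing layout call alongside xaxis. Note that -1 only works as a sentinel when the real y values are never negative.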