Extending ggplot2: How to build a geom and stat?

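I am in the early stages of learning how to extend ggplot2. I would like to create a custom geom and an associated stat. My starting point was the vignette. In addition, I have benefited from other tutorials and discussions. I am trying to put together a template to teach myself and, hopefully, others.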

Main question:

Inside my function calculate_shadows(), the needed parameter params$anchor is NULL. How can I access it?

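The goal described below is intended solely for learning how to create custom stat and geom functions; it is not a real goal. As you can see from the screenshots, I do know how to leverage the power of ggplot2 to make graphs.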

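  1. The geom will read the data and, for the supplied variables ("x", "y"), will plot (for want of a better word) shadows: a horizontal line min(x)--max(x) at the default y = 0, and a vertical line min(y)--max(y) at the default x = 0. If an option is supplied, these "anchors" can be changed. For example, if the user supplies x = 35, y = 1, the horizontal line will be drawn at the intercept y = 1 and the vertical line at the intercept x = 35. Usage: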

    library(ggplot2)
    ggplot(data = mtcars, aes(x = mpg, y = wt)) + 
        geom_point() +
        geom_shadows(x = 35, y = 1) 
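
To make the intended geometry concrete, the two segments can be worked out by hand in base R. This is only a sketch: the helper name shadow_segments() and its arguments are hypothetical and are not part of ggplot2 or of the code further down.

```r
# Hypothetical helper (illustration only): endpoints of the two "shadow"
# segments for numeric vectors x and y, anchored at y = anchor$y and x = anchor$x.
shadow_segments <- function(x, y, anchor = list(x = 0, y = 0)) {
  data.frame(
    shadow = c("x", "y"),
    x      = c(min(x), anchor$x),   # horizontal segment starts at min(x)
    xend   = c(max(x), anchor$x),   # ... and ends at max(x)
    y      = c(anchor$y, min(y)),   # vertical segment starts at min(y)
    yend   = c(anchor$y, max(y))    # ... and ends at max(y)
  )
}

segs <- shadow_segments(mtcars$mpg, mtcars$wt, anchor = list(x = 35, y = 1))
print(segs)
```

The columns x, xend, y and yend match the aesthetics that geom_segment() expects, so a data frame like this one can be drawn directly as a stand-in for the custom geom.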
    

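  2. The stat will read the data and, for the supplied variables ("x", "y"), will compute shadows according to the value of stat. For example, passing stat = "identity" computes shadows for the minimum and maximum of the data (as geom_shadows does). Passing stat = "quartile" computes shadows for the first and third quartiles. More generally, one could pass a function such as stats::quantile together with the argument args = list(probs = c(0.10, 0.90), type = 6) to compute shadows from the 10th and 90th percentiles using the type 6 quantile method. Usage: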

    ggplot(data = mtcars, aes(x = mpg, y = wt)) + 
        geom_point() +
        stat_shadows(stat = "quartile") 
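
The quantile-based variant can be sketched in base R as well. The helper shadow_range() and its args parameter are hypothetical names chosen for illustration; only stats::quantile() and its probs and type arguments are real.

```r
# Hypothetical helper (illustration only): the extent of a shadow computed
# from quantiles instead of min/max. `args` is forwarded to stats::quantile().
shadow_range <- function(v, args = list(probs = c(0.25, 0.75))) {
  unname(do.call(stats::quantile, c(list(x = v), args)))
}

# first and third quartiles of mpg (what stat = "quartile" is meant to use)
shadow_range(mtcars$mpg)

# 10th and 90th percentiles with the type 6 quantile method
shadow_range(mtcars$mpg, args = list(probs = c(0.10, 0.90), type = 6))
```

Forwarding a list of arguments with do.call() is one way to let the caller choose both the probabilities and the quantile type without the helper having to name each option.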
    

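Unfortunately, my lack of familiarity with extending ggplot2 has kept me from reaching this objective. The plots were "faked" with geom_segment. Based on the tutorials and discussions referenced above, and on inspecting existing code such as stat-qq and stat-smooth, I have put together a basic architecture for this goal. It surely contains many mistakes, and I would be grateful for guidance. Also note that either of these approaches would be fine: geom_shadows(anchor = c(35, 1)) or geom_shadows(x = 35, y = 1).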

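Now here is my effort. First, geom-shadows.r defines geom_shadows(). Second, stat-shadows.r defines stat_shadows(). The code does not work as is, but if I execute its contents, it does produce the desired statistics. For clarity, I have removed most of the calculations in stat_shadows(), such as the quartiles, to focus on the essentials. Are there any obvious mistakes in the layout?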

geom-shadows.r

#' documentation ought to be here
geom_shadows <- function(
  mapping = NULL, 
  data = NULL, 
  stat = "shadows", 
  position = "identity", 
  ...,
  anchor = list(x = 0, y = 0),
  shadows = list("x", "y"), 
  type = NULL,
  na.rm = FALSE,
  show.legend = NA, 
  inherit.aes = TRUE) {
    layer(
      data = data,
      mapping = mapping,
      stat = stat,
      geom = GeomShadows,
      position = position,
      show.legend = show.legend,
      inherit.aes = inherit.aes,
      params = list(
        anchor = anchor,
        shadows = shadows,
        type = type,  
        na.rm = na.rm,
        ...
    )
  )
}

GeomShadows <- ggproto("GeomShadows", Geom, 

  # set up the data, e.g. remove missing data
  setup_data = function(data, params) { 
    data 
  }, 

  # set up the parameters, e.g. supply warnings for incorrect input
  setup_params = function(data, params) {
    params
  },

  draw_group = function(data, panel_params, coord, anchor, shadows, type) { 
    # draw_group uses stats returned by compute_group

    # set common aesthetics
    geom_aes <- list(
      alpha = data$alpha,
      colour = data$color,
      size = data$size,
      linetype = data$linetype,
      fill = alpha(data$fill, data$alpha),
      group = data$group
    )

    # merge aesthetics with data calculated in setup_data
    geom_stats <- new_data_frame(c(list(
          x = c(data$x.xmin, data$y.xmin),
          xend = c(data$x.xmax, data$y.xmax),
          y = c(data$x.ymin, data$y.ymin),
          yend = c(data$x.ymax, data$y.ymax),
          alpha = c(data$alpha, data$alpha) 
        ), geom_aes
      ), n = 2) 

    # turn the stats data into a segment grob via GeomSegment
    geom_grob <- GeomSegment$draw_panel(unique(geom_stats), 
        panel_params, coord) 

    # wrap the segment grob in a grobTree
    ggname("geom_shadows", grobTree(geom_grob)) 
  },

  # set legend box styles
  draw_key = draw_key_path,

  # set default aesthetics 
  default_aes = aes(
    colour = "blue",
    fill = "red",
    size = 1,
    linetype = 1,
    alpha = 1
  )

)

stat-shadows.r

#' documentation ought to be here
stat_shadows <-  
  function(mapping = NULL, 
           data = NULL,
           geom = "shadows", 
           position = "identity",
           ...,
           # do I need to add the geom_shadows arguments here?
           anchor = list(x = 0, y = 0),
           shadows = list("x", "y"), 
           type = NULL,
           na.rm = FALSE,
           show.legend = NA,
           inherit.aes = TRUE) {
  layer(
    stat = StatShadows,  
    data = data,
    mapping = mapping,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      # geom_shadows argument repeated here?
      anchor = anchor,  
      shadows = shadows,
      type = type,
      na.rm = na.rm,
      ...
    )
  )
}

StatShadows <- 
  ggproto("StatShadows", Stat,

    # do I need to repeat required_aes?
    required_aes = c("x", "y"), 

    # set up the data, e.g. remove missing data
    setup_data = function(data, params) {
      data
    },

    # set up parameters, e.g. unpack from list
    setup_params = function(data, params) {
      params
    },

    # calculate shadows: returns data_frame with colnames: xmin, xmax, ymin, ymax 
    compute_group = function(data, scales, anchor = list(x = 0, y = 0), shadows = list("x", "y"), type = NULL, na.rm = TRUE) {

      .compute_shadows(data = data, anchor = anchor, shadows = shadows, type = type)

  }
)

# Calculate the shadows for each type / shadows / anchor
.compute_shadows <- function(data, anchor, shadows, type) {

  # Deleted all type-checking, etc. for MWE
  # Only 'type = c(double, double)' accepted, e.g. type = c(0, 1)

  qs <- type

  # compute shadows along the x-axis
  if (any(shadows == "x")) {
    shadows.x <- c(
      xmin = as.numeric(stats::quantile(data[, "x"], qs[[1]])),
      xmax = as.numeric(stats::quantile(data[, "x"], qs[[2]])),
      ymin = anchor[["y"]], 
      ymax = anchor[["y"]]) 
  }

  # compute shadows along the y-axis
  if (any(shadows == "y")) {
    shadows.y <- c(
      xmin = anchor[["x"]], 
      xmax = anchor[["x"]], 
      ymin = as.numeric(stats::quantile(data[, "y"], qs[[1]])),
      ymax = as.numeric(stats::quantile(data[, "y"], qs[[2]])))
  } 

  # store shadows in one data_frame
  stats <- new_data_frame(c(x = shadows.x, y = shadows.y))

  # return the statistics
  stats
}
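
The geom's draw_group() reads columns such as data$x.xmin and data$y.ymax, and those names come from the way c() combines named vectors. The sketch below shows this in base R with made-up stand-in values. as.data.frame() is used instead of new_data_frame(), which is assumed here to be an internal ggplot2 helper and therefore not available outside the package.

```r
# Stand-in values (made up for illustration) shaped like shadows.x and shadows.y
shadows.x <- c(xmin = 10.4, xmax = 33.9, ymin = 0, ymax = 0)
shadows.y <- c(xmin = 0, xmax = 0, ymin = 1.5, ymax = 5.4)

# c() prefixes the inner names with the outer ones: "x.xmin", ..., "y.ymax"
combined <- c(x = shadows.x, y = shadows.y)

# one-row data frame whose column names are the prefixed names
stats <- as.data.frame(as.list(combined))
names(stats)
```

Note that if only one of "x" or "y" is requested, the corresponding vector is never created in .compute_shadows(), so the final c() call would fail. The sketch assumes both are present.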


Until a more thorough answer comes along: you are missing

extra_params = c("na.rm", "shadows", "anchor", "type"),

inside GeomShadows <- ggproto("GeomShadows", Geom,

and possibly also inside StatShadows <- ggproto("StatShadows", Stat,.

Within geom-.r and stat-.r there are many very useful comments that clarify how geoms and stats work. In particular (hat tip to Claus Wilke over on the GitHub issues):

# Most parameters for the geom are taken automatically from draw_panel() or
# draw_groups(). However, some additional parameters may be needed
# for setup_data() or handle_na(). These can not be imputed automatically,
# so the slightly hacky "extra_params" field is used instead. By
# default it contains `na.rm`
extra_params = c("na.rm"),
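
A small sketch of what that comment describes, assuming ggplot2 is installed and that the parameters() method on Geom still behaves as in the geom-.r source quoted above. GeomDemo and its arguments are made up for illustration and are not the GeomShadows from the question.

```r
library(ggplot2)

# Made-up geom for illustration: `shadows` is an argument of draw_group(),
# while `anchor` is only listed in extra_params.
GeomDemo <- ggproto("GeomDemo", Geom,
  extra_params = c("na.rm", "anchor"),
  draw_group = function(data, panel_params, coord, shadows = list("x", "y")) {
    grid::nullGrob()
  }
)

# parameters picked up from the arguments of the draw function
GeomDemo$parameters(extra = FALSE)

# the same, plus whatever is listed in extra_params
GeomDemo$parameters(extra = TRUE)
```

Comparing the two results is a quick way to check which arguments passed through layer(params = ...) the geom will actually recognise. The same check should apply to a Stat and its compute functions, though that is not shown here.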