Adding to a list object in a mapping function in R

Question

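I am using GGally::ggpairs to create a scatterplot matrix. I use a custom function (called my_fn below) to create the bottom-left non-diagonal subplots. While that custom function runs, it computes a piece of information for each of these subplots, and I would like to store that information for later use.

In the example below, each h@cID is an int[] structure with 100 values. It is created 10 times in total inside my_fn (once for each of the 10 bottom-left non-diagonal subplots). I am trying to store all 10 of these h@cID structures in the list object listCID.

I have had no success with this approach, and I have tried a few other variants (such as passing listCID as an input parameter to my_fn, or trying to return it at the end).

Is there an efficient way to store the ten h@cID structures from my_fn for later use? I suspect there are a few syntax issues I am not fully familiar with that explain why I am stuck. Likewise, if I am not using the appropriate terminology, I would be happy to change the title of this question. Thank you!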

library(hexbin)
library(GGally)
library(ggplot2)

set.seed(1)

bindata <- data.frame(
    ID = paste0("ID", 1:100), 
    A = rnorm(100), B = rnorm(100), C = rnorm(100), 
    D = rnorm(100), E = rnorm(100))
bindata$ID <- as.character(bindata$ID)

maxVal <- max(abs(bindata[ ,2:6]))
maxRange <- c(-1 * maxVal, maxVal)

listCID <- c()

my_fn <- function(data, mapping, ...){
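  # eval_data_col() (from GGally) extracts the mapped column; with ggplot2 >= 3.0
  # mapping$x is a quosure, so as.character(mapping$x) no longer yields the column name
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)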
  h <- hexbin(x=x, y=y, xbins=5, shape=1, IDs=TRUE, 
              xbnds=maxRange, ybnds=maxRange)
  hexdf <- data.frame(hcell2xy(h),  hexID=h@cell, counts=h@count)
  listCID <- c(listCID, h@cID)
  print(listCID)
  p <- ggplot(hexdf, aes(x=x, y=y, fill=counts, hexID=hexID)) + 
            geom_hex(stat="identity")
  p
}

p <- ggpairs(bindata[ ,2:6], lower=list(continuous=my_fn))
p

Answer 1

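If I understand your question correctly, this is easy, though not elegant, to achieve with the <<- operator.

With it, you can assign to something like a global variable from inside a function's scope.

Set listCID <- NULL before running the function, and inside the function use listCID <<- c(listCID, h@cID).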

listCID <- NULL

my_fn <- function(data, mapping, ...){
  # eval_data_col() is GGally's helper for pulling the mapped column out of data
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)
  h <- hexbin(x=x, y=y, xbins=5, shape=1, IDs=TRUE, xbnds=maxRange, ybnds=maxRange)
  hexdf <- data.frame(hcell2xy(h), hexID=h@cell, counts=h@count)

  # <<- assigns to the listCID defined outside my_fn rather than to a local copy
  if (exists("listCID")) listCID <<- c(listCID, h@cID)

  print(listCID)
  p <- ggplot(hexdf, aes(x=x, y=y, fill=counts, hexID=hexID)) + geom_hex(stat="identity")
  p
}

For more on scoping, see Hadley's excellent Advanced R: http://adv-r.had.co.nz/Environments.html
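To make the scoping difference concrete, here is a minimal sketch that leaves ggpairs out entirely. The names results, append_local and append_global are made up for illustration. It also stores each value as its own list element; note that c(listCID, h@cID) instead concatenates everything into one long integer vector, which may or may not be what you want.

```r
results <- list()

# Plain `<-` creates a new local `results`; the outer one is untouched.
append_local <- function(value) {
  results <- c(results, list(value))
  invisible(NULL)
}

# `<<-` reassigns the `results` found in an enclosing environment.
# Using [[ ]] keeps every value as a separate list element.
append_global <- function(value) {
  results[[length(results) + 1]] <<- value
  invisible(NULL)
}

append_local(1:3)
length(results)   # 0: the outer list is still empty

append_global(1:3)
append_global(4:6)
length(results)   # 2: one element per call
```

One thing to watch for: as far as I can tell, ggpairs only calls the panel function when the plot is actually printed, so a list filled this way stays empty until then, and printing the plot a second time appends the same values again.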

Answer 2

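In general, it is not good practice to have one function return two different results. In your case, you want to return both the plot and the result of a computation (the hexbin cID).

A better approach is to compute the results in steps, with each step being a separate function. The result of the first function (computing the hexbins) can then be used as input to several follow-up functions (looking up the cIDs and creating the plots). What follows is one of many ways to refactor your code: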

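calc_hexbins() generates all the hexbins. This function can return a named list of hexbins (e.g. list(AB = h1, AC = h2, BC = h3)). It does so by enumerating all possible pairwise combinations of the columns (A, B, C, D and E). The downside is that you are duplicating some of the logic of ggpairs().
gen_cids() takes the hexbins as input and generates all the cIDs. This is a simple operation: you loop (or map) over all elements of the list and extract the cID.
create_plot() also takes the hexbins as input; this is the function where you actually generate the plots. Here you can add an extra parameter for the list of hexbins (the GGally package has a function wrap() for this). Instead of computing the hexbin, you look it up in the named list you generated earlier, by combining A and B into a string.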

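This avoids hacky approaches such as using attributes or global variables. Those certainly work, but they often cause headaches when maintaining code. Unfortunately, it also makes your code a bit longer, but that may be a good thing.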
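The three steps could look roughly like the sketch below. It is only one possible arrangement, not tested against every GGally version. The "A_B" naming scheme, the bounds argument, and the use of rlang::as_name() to read the column names out of the mapping are my own assumptions. The hexID aesthetic from the question is left out for brevity.

```r
library(hexbin)
library(GGally)
library(ggplot2)

set.seed(1)
bindata <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100),
                      D = rnorm(100), E = rnorm(100))
maxVal <- max(abs(bindata))
maxRange <- c(-1 * maxVal, maxVal)

# Step 1: one hexbin per column pair, named like "A_B" (assumed naming scheme).
calc_hexbins <- function(data, bounds) {
  pairs <- combn(names(data), 2, simplify = FALSE)
  hexbins <- lapply(pairs, function(pr) {
    hexbin(x = data[[pr[1]]], y = data[[pr[2]]], xbins = 5, shape = 1,
           IDs = TRUE, xbnds = bounds, ybnds = bounds)
  })
  names(hexbins) <- vapply(pairs, paste, character(1), collapse = "_")
  hexbins
}

# Step 2: extract the cID vector from every hexbin.
gen_cids <- function(hexbins) lapply(hexbins, function(h) h@cID)

# Step 3: panel function for ggpairs; it looks up the precomputed hexbin
# by name instead of recomputing it.
create_plot <- function(data, mapping, hexbins, ...) {
  key <- paste(rlang::as_name(mapping$x), rlang::as_name(mapping$y), sep = "_")
  h <- hexbins[[key]]
  hexdf <- data.frame(hcell2xy(h), hexID = h@cell, counts = h@count)
  ggplot(hexdf, aes(x = x, y = y, fill = counts)) + geom_hex(stat = "identity")
}

hexbins <- calc_hexbins(bindata, maxRange)
listCID <- gen_cids(hexbins)   # named list: 10 elements, one per column pair

# wrap() passes the extra hexbins argument through to create_plot.
p <- ggpairs(bindata, lower = list(continuous = wrap(create_plot, hexbins = hexbins)))
```

With this layout listCID is available before anything is plotted, and each element can be looked up by pair, for example listCID[["A_B"]]. The lookup relies on the lower panels having the earlier column on the x axis, which matches the order produced by combn().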
