Find coordinates for overlapping hexagonal bins between hexbin objects

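I have two spatial datasets whose coordinates are observations of a species, and I want to estimate the area of overlap between them. Since point coordinates cannot represent an area, the coordinates of both datasets have to be binned using the same x (longitude) and y (latitude) categories.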

For this task I found the hexbin package, which does hexagonal binning, useful. The package is great, but I have failed to find a function that directly outputs the coordinates / IDs of overlapping bins among hexbin objects. For example, hdiffplot returns a nice graphical overview of the overlapping bins, but how can I extract this information for further analysis?

library(hexbin)

set.seed(1); df1 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))
set.seed(2); df2 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))

# Pad the ranges by 1 on each side, just to make the plot nicer
xrange <- c(floor(min(c(df1$x, df2$x))) - 1, ceiling(max(c(df1$x, df2$x))) + 1)
yrange <- c(floor(min(c(df1$y, df2$y))) - 1, ceiling(max(c(df1$y, df2$y))) + 1)

hb1 <- hexbin(df1$x, df1$y, xbins = 10, xbnds = xrange, ybnds = yrange)
hb2 <- hexbin(df2$x, df2$y, xbins = 10, xbnds = xrange, ybnds = yrange)

hdiffplot(hb1,hb2, xbnds = xrange, ybnds = yrange)

I came up with a solution to this problem while writing the question. I am posting it here in the hope that it will help someone one day.

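You can extract the coordinates with the hcell2xy function. Here is a small function that finds the unique and overlapping coordinates of the bin centroids: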

#' @title Return overlapping and unique bin centroid coordinates for two hexbin objects
#' @param bin1,bin2 two objects of class hexbin.
#' @details The hexbin objects for comparison, bin1 and bin2, must have the same plotting limits and cell size.
#' @return Returns a list of data frames with unique coordinates for \code{bin1} and \code{bin2} as well as overlapping coordinates among bins.

hdiffcoords <- function(bin1, bin2) {

  ## Checks modified from: https://github.com/edzer/hexbin/blob/master/R/hdiffplot.R
  if(is.null(bin1) || is.null(bin2)) {
    stop("Need 2 hex bin objects")
  } else {
        if(bin1@shape != bin2@shape)
            stop("Bin objects must have same shape parameter")
        if(!all(bin1@xbnds == bin2@xbnds) || !all(bin1@ybnds == bin2@ybnds))
            stop("Bin objects need the same xbnds and ybnds")
        if(bin1@xbins != bin2@xbins)
            stop("Bin objects need the same number of bins")
  }

  ## Find overlapping and unique bins

  hd1 <- data.frame(hcell2xy(bin1), count_bin1 = bin1@count, cell_bin1 = bin1@cell)
  hd2 <- data.frame(hcell2xy(bin2), count_bin2 = bin2@count, cell_bin2 = bin2@cell)

  overlapping_hd1 <- apply(hd1, 1, function(r, A){ sum(A$x==r[1] & A$y==r[2]) }, hd2)
  overlapping_hd2 <- apply(hd2, 1, function(r, A){ sum(A$x==r[1] & A$y==r[2]) }, hd1)

  overlaps <- merge(hd1[as.logical(overlapping_hd1),], hd2[as.logical(overlapping_hd2),])

  unique_hd1 <- hd1[!as.logical(overlapping_hd1),]
  unique_hd2 <- hd2[!as.logical(overlapping_hd2),]

  ## Return list of data.frames

  list(unique_bin1 = unique_hd1, unique_bin2 = unique_hd2, overlapping = overlaps)

}

This should be the same information that hdiffplot returns in graphical form:

df <- hdiffcoords(hb1, hb2)

library(ggplot2)

ggplot() + 
  geom_point(data = df$unique_bin1, aes(x = x, y = y), color = "red", size = 10) + 
  geom_point(data = df$unique_bin2, aes(x = x, y = y), color = "cyan", size = 10) +
  geom_point(data = df$overlapping, aes(x = x, y = y), color = "green", size = 10) + theme_bw() 
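
A possibly simpler variant is to compare the cell IDs stored in the hexbin objects rather than the centroid coordinates. The sketch below assumes that two hexbin objects built with identical xbnds, ybnds, xbins and shape index the same lattice, so that equal cell IDs refer to the same hexagon. The name hdiffcells is hypothetical and not part of the hexbin package.

```r
library(hexbin)
library(methods)

# Hypothetical helper (not part of hexbin): split the bins of two hexbin
# objects into unique and overlapping sets by comparing cell IDs.
# Assumption: identical xbnds, ybnds, xbins and shape imply identical lattices.
hdiffcells <- function(bin1, bin2) {
  stopifnot(is(bin1, "hexbin"), is(bin2, "hexbin"),
            bin1@xbins == bin2@xbins, bin1@shape == bin2@shape,
            all(bin1@xbnds == bin2@xbnds), all(bin1@ybnds == bin2@ybnds))

  # Centroid coordinates, counts and cell IDs of the non-empty bins
  hd1 <- data.frame(hcell2xy(bin1), count_bin1 = bin1@count, cell = bin1@cell)
  hd2 <- data.frame(hcell2xy(bin2), count_bin2 = bin2@count, cell = bin2@cell)

  # Cells that are occupied in both objects
  shared <- intersect(hd1$cell, hd2$cell)

  list(
    unique_bin1 = hd1[!hd1$cell %in% shared, ],
    unique_bin2 = hd2[!hd2$cell %in% shared, ],
    # Join on the cell ID only; coordinates are taken from bin1
    overlapping = merge(hd1[hd1$cell %in% shared, ],
                        hd2[hd2$cell %in% shared, c("cell", "count_bin2")],
                        by = "cell")
  )
}

# Example data, generated the same way as in the question
set.seed(1); a <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))
set.seed(2); b <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))
xr <- range(c(a$x, b$x)) + c(-1, 1)
yr <- range(c(a$y, b$y)) + c(-1, 1)
h1 <- hexbin(a$x, a$y, xbins = 10, xbnds = xr, ybnds = yr)
h2 <- hexbin(b$x, b$y, xbins = 10, xbnds = xr, ybnds = yr)

res <- hdiffcells(h1, h2)
str(res)
```

Joining on the integer cell ID avoids testing floating-point coordinates for equality, although the coordinate comparison should give the same result as long as both objects share the same grid.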

Any comments/corrections are appreciated.