How do I define the behaviour of functions `match` and `%in%` for a defined S4 class in R?

Question

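I have created a package containing several R classes that map integers to various structures that commonly occur in combinatorics, and vice versa. The classes only encapsulate the mappings rather than actually storing the structures (the number of structures grows rapidly and easily reaches the trillions), but it is convenient to think of instances of the classes as vectors "containing" the structures, and it would be nice for the instances to behave like vectors.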

For example, one of the classes is PPV (for permutations pseudo vector), set up as:

setClass(
  Class = "PPV",
  representation(k = "numeric", items = "vector")
)

To make it behave a bit like a vector, I have added definitions for length and [:

setMethod(
  f = "length",
  signature = "PPV",
  definition = function(x) {
    # blah blah blah
  }
)

setMethod(
  f = "[",
  signature = "PPV",
  definition = function(x, i, j, drop) {
    # blah blah blah
  }
)

So far, so good. This allows me to use length on instances and to access the structures "contained" in the instances by index:

> # (ppv is a constructor)
> # Create a pseudo-vector of 3-permutations of the first 5 letters.
> ps <- ppv(3, letters[1:5])
> # Like vectors, we can access structures "contained" by index.
> for (i in 1:5) cat(ps[i],"\n")
a b c 
a c b 
c a b 
c b a 
b c a 
> # Like vectors, function length is meaningful.
> length(ps)
[1] 60

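I have also defined the mapping from structure to index and a test for existence, and it seems these would be most naturally exposed through the match and %in% functions respectively. This is what I have so far: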

setMethod(
  f = "%in%",
  signature = c("vector", "PPV"),
  definition = function(x, table) {
    # blah blah blah
  }
)

setMethod(
  f = "match",
  signature = c("vector", "PPV"),
  definition = function(x, table) {
    # blah blah blah
  }
)

The problem is that when I install and load the library, these do not seem to be defined:

> some.permutation <- c("a", "c", "e")
> some.permutation %in% ps
Error in match(x, table, nomatch = 0L) : 
  'match' requires vector arguments
> match(some.permutation, ps)
Error in match(some.permutation, ps) : 'match' requires vector arguments

However, when I explicitly execute the code contained in the file, it works:

> some.permutation %in% ps
[1] TRUE
> match(some.permutation, ps)
[1] 25
> ps[25]
[1] "a" "c" "e"

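Why are the definitions for length and [ picked up when the package is loaded, while the definitions for %in% and match, which are in the same file with the same set-up, are not?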

Answer 1

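match() is not a generic (check with isGeneric("match")), so you will want to make it one, perhaps sensibly restricting dispatch to a subset of the arguments rather than dispatching on all of them.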

setGeneric("match", signature=c("x", "table"))

and write methods that follow the signature

setMethod("match", c("vector", "PPV"),
    function(x, table, nomatch = NA_integer_, incomparables = NULL)
{
    "match,vector,PPV-method"
})
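
A minimal sketch of the two steps above, runnable in a plain R session rather than inside a package. The slot layout copies the question's PPV class, but the method body is a hypothetical stand-in (it simply looks x up in the items slot), because the real permutation-ranking code is not shown in the question. The sketch also records that length and [ are primitives while match is not, which is consistent with methods on the former being picked up without any setGeneric() call.

```r
library(methods)

# Same slots as the question's class; the instance below is illustrative only.
setClass("PPV", representation(k = "numeric", items = "vector"))

# length and `[` are primitives with built-in dispatch; match is a closure.
prims <- c(length  = is.primitive(length),
           bracket = is.primitive(`[`),
           match   = is.primitive(match))
print(prims)

# Turn match into an S4 generic that dispatches on x and table only.
setGeneric("match", signature = c("x", "table"))

# Hypothetical stand-in body: delegate to base::match on the items slot.
setMethod("match", c("vector", "PPV"),
  function(x, table, nomatch = NA_integer_, incomparables = NULL) {
    base::match(x, table@items, nomatch = nomatch,
                incomparables = incomparables)
  })

ps <- new("PPV", k = 3, items = letters[1:5])
idx <- match(c("c", "z"), ps)
print(idx)
```

Restricting the signature to x and table keeps nomatch and incomparables as ordinary arguments, so methods only need to be written for combinations of the two classes that matter.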

Remember to export the class and the generic in your package NAMESPACE

exportClasses("PPV")
export("match")

For %in%, the implicit generic (created by defining a method without first calling setGeneric()) is sensible, so just define the method

setMethod("%in%", c("vector", "PPV"), function(x, table) {
    message("%in%")
    match(x, table, nomatch=0L) > 0
})

Remember to export the implicit generic too (export("%in%") in the NAMESPACE).

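One might hope that, because base::%in% is defined in terms of match() and match() has been implemented for your class, it would not be necessary to implement a method for %in%. This is not the case, I think because match() is implemented in C in a way that does not first look for the generic.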
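
To see this in a plain session, the sketch below tries %in% before and after a %in% method exists. The PPV stand-in and the body of the match method are hypothetical placeholders; only setClass(), setGeneric(), setMethod() and tryCatch() are real API. The first call is expected to fail, since base's %in% calls base's own match() and not the generic defined here; the exact error text may differ between R versions.

```r
library(methods)

setClass("PPV", representation(k = "numeric", items = "vector"))
setGeneric("match", signature = c("x", "table"))
setMethod("match", c("vector", "PPV"),
  function(x, table, nomatch = NA_integer_, incomparables = NULL) {
    # Hypothetical placeholder for the real index lookup.
    base::match(x, table@items, nomatch = nomatch,
                incomparables = incomparables)
  })

ps <- new("PPV", k = 3, items = letters[1:5])

# No %in% method yet: base's %in% bypasses the generic defined above.
before <- tryCatch("c" %in% ps, error = function(e) conditionMessage(e))
print(before)

# Defining a method creates the implicit generic for %in%.
setMethod("%in%", c("vector", "PPV"), function(x, table) {
  match(x, table, nomatch = 0L) > 0
})

after <- c("c", "z") %in% ps
print(after)
```

Inside the new method, match resolves to the generic defined alongside it, so the PPV method is reached.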
