How to use R's assignment methods in rpy2?

I am using rpy2 and I need to use assignment methods on R objects. For example, starting with this object:

# Python code
from rpy2.robjects import r
myvar = r('c(a=1,b=2,c=3)')

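Suppose I want to assign to names(myvar). (Note: ignore the fact that rpy2 provides another way to access the names, via myvar.names. That only works for names, not for arbitrary assignment methods.) In R, I would do: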

# R code
names(myvar) <- c("x", "y", "z")

However, this does not work in Python:

# Python code
In [62]: names(myvar) = ['x', 'y', 'z']
  File "<ipython-input-62-aa3f7998cdcb>", line 1
    names(myvar) = ['x', 'y', 'z']
                                  ^
SyntaxError: can't assign to function call

Of course, I can run arbitrary code through rpy2's string eval:

# Python code
r('''names(myvar) <- c("x", "y", "z")''')

But interpolating values into a string to be evaluated sounds neither fun nor safe. So is there a way to safely do the equivalent of method(object) <- value through rpy2?
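To make the safety concern concrete, here is a minimal sketch in plain Python (no rpy2 needed) of what string interpolation can produce. The helper name naive_names_assignment and the hostile input are hypothetical, made up purely for illustration:

```python
def naive_names_assignment(new_names):
    """Build R source by pasting Python strings into it (the risky approach)."""
    quoted = ", ".join('"{}"'.format(name) for name in new_names)
    return "names(myvar) <- c({})".format(quoted)


# Well-behaved input yields the intended R statement.
print(naive_names_assignment(["x", "y", "z"]))

# A value containing a double quote closes the R string literal early, so R
# would parse the rest of the value as code rather than as data.
hostile = 'z"); print("injected'
print(naive_names_assignment(["x", "y", hostile]))
```

The second generated string contains two R statements instead of one, which is exactly the kind of thing that passing values as objects, rather than as source text, avoids.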

In R, "setter" functions follow a naming convention: the name of the "getter" followed by <-. For example, when you do

names(myvar) <- c("x", "y", "z")

the following happens:

myvar <- "names<-"(myvar, c("x","y","z"))

If we break it down:

> myvar = c(a=1,b=2,c=3)
> # call the assignment function "names<-"    
> "names<-"(myvar, c("x","y","z")) 
x y z 
1 2 3 
> # the "names" are stored as an attribute
> attributes(myvar)
$names
[1] "x" "y" "z"
> attributes(myvar)$names <- c("a","b","c")
> myvar
a b c 
1 2 3 
> # note that the function does have a side effect
> # (unlike what I wrote in a previous version of this answer):
> # the names are changed in place. I think that this is a C-level
> # optimization specific to "names" and this may not always be
> # the case for all "setters"
> "names<-"(myvar, c("x","y","z")) 
x y z 
1 2 3 
> myvar
x y z 
1 2 3   

Doing something like method(object) <- value from rpy2 is straightforward. The Python code would look like:

set_method = r("`method<-`")
my_object = set_method(my_object, value)
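The same pattern can be wrapped in a small helper. This is only a sketch: setter_expr and call_setter are hypothetical names, not part of rpy2, and the name check is a deliberately conservative assumption (R accepts more exotic names when backtick-quoted). Only the function name ever goes through string evaluation; the value is passed as an object.

```python
import re

# Conservative pattern for a plain R function name (an assumption of this
# sketch, chosen so that nothing but a simple identifier reaches R's parser).
_R_NAME = re.compile(r"[A-Za-z.][A-Za-z0-9._]*")


def setter_expr(getter):
    """Return the backtick-quoted R name of the setter for `getter`."""
    if not _R_NAME.fullmatch(getter):
        raise ValueError("not a plain R function name: {!r}".format(getter))
    return "`{}<-`".format(getter)


def call_setter(r, getter, obj, value):
    """Apply the R setter `getter<-` to obj and return the result.

    `r` is any callable that evaluates R source and returns a callable,
    such as rpy2.robjects.r. The caller rebinds the result, mirroring
    what R does for getter(obj) <- value.
    """
    setter = r(setter_expr(getter))
    return setter(obj, value)


# Usage with rpy2 (assumes rpy2 and R are installed):
#   from rpy2.robjects import r, StrVector
#   myvar = r('c(a=1, b=2, c=3)')
#   myvar = call_setter(r, 'names', myvar, StrVector(['x', 'y', 'z']))
```

Taking r as a parameter keeps the helper independent of how rpy2 is imported and makes it easy to exercise without an R installation.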

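Consider importing R's base package to use the c() function directly and, for assigning names, importing R's stats package to use the setNames() function directly. The following shows that creating the vector with r() and with base.c() yields equivalent values: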

from rpy2.robjects import r
from rpy2.robjects.packages import importr

base = importr('base')

myvar1 = r("c('x','y','z')")
myvar2 = base.c('x', 'y', 'z')

# SAME CLASS TYPE
print(type(myvar1))
# <class 'rpy2.robjects.vectors.StrVector'>
print(type(myvar2))
# <class 'rpy2.robjects.vectors.StrVector'>

from rpy2.robjects import pandas2ri
pandas2ri.activate()

# CONVERT TO PYTHON NUMPY ARRAY
# (ri2py is the rpy2 2.x name; rpy2 3.x renamed it to rpy2py)
py_myvar1 = pandas2ri.ri2py(myvar1)
py_myvar2 = pandas2ri.ri2py(myvar2)

print(py_myvar1==py_myvar2)
# [ True  True  True]

print(py_myvar1)
# ['x' 'y' 'z']
print(py_myvar2)
# ['x' 'y' 'z']

And using setNames() to assign names, then printing the names vector and the values vector:

stats = importr('stats')
# EQUIVALENT TO R: myvar3 <- setNames(c(1, 2, 3), c('a', 'b', 'c'))
myvar3 = stats.setNames(base.c(1,2,3), base.c('a', 'b', 'c'))

print(type(myvar3))
# <class 'rpy2.robjects.vectors.IntVector'>

# NAME VECTOR
py_myvar3 = pandas2ri.ri2py(base.names(myvar3))
print(py_myvar3)
# ['a' 'b' 'c']

# VALUES VECTOR
py_myvar3 = pandas2ri.ri2py(myvar3)
print(py_myvar3)
# [1 2 3]

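In short, Python does not allow assignment to a function call. So, to stay within Python's conventions, find the appropriate function that creates the object and takes the would-be right-hand-side value as an argument.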