Error when running R function with rpy2
I am trying to use rpy2 to run the multi.split function from the questionr package.
Here is my code:
from rpy2 import robjects
from rpy2.robjects.packages import importr
questionr = importr(str('questionr'))
data = ["red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green"]
data_vector = robjects.StrVector(data)
multi_split = questionr.multi_split
data_table = multi_split(data_vector, split_char='/')
After the last line I get the following error:
RRuntimeError: Error in `colnames<-`(`*tmp*`, value = c("c(\"red/blue\",_\"green\",_\"red/green\",_\"blue/red\",_\"red/blue\",_\"green\",_.blue", :
'names' attribute [4] must be the same length as the vector [3]
I think this is related to the size of the vector I am sending, because if I remove the last item
data = ["red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue"]
and then run
data_vector = robjects.StrVector(data)
multi_split = questionr.multi_split
data_table = multi_split(data_vector, split_char='/')
I get no error message. Also, if I change the split_char argument, for example:
data_table = multi_split(data_vector, split_char='.')
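I get no error message, regardless of the size of the array I send.
I have tried running the matching code directly in R (using RStudio), and it runs without problems.
Any ideas on how to solve this?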
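This seems to be because the function multi_split (multi.split in the R package) tries to use a string representation of the expression associated with its first argument (here "data_vector").
The signature of the R function is: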
multi.split(var, split.char = "/", mnames = NULL)
The documentation for mnames is:
names to give to the produced variabels. If NULL, the name are
computed from the original variable name and the answers.
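In the call multi_split(data_vector, split_char='/'), there is no variable name that the embedded R can see, because this is a Python call and data_vector is an anonymous variable (only content, no variable name).
Although you could specify mnames, you checked and it does not work (see the comments below). This is what the code says: the line vname <- deparse(substitute(var)) is evaluated whether or not mnames is specified: https://github.com/juba/questionr/blob/9cf09f3ffcd6c8df24182380f12d52b061c221ef/R/table.multi.R#L161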
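Why the length of the vector matters can be sketched in plain Python. This is only an illustration, under the assumption that the column names are built by pasting the deparsed expression onto each answer, which is what the name ending in ".blue" in the error message suggests. R's deparse() breaks a long expression into several strings, and paste() recycles the shorter vector, so a long enough literal can produce more names than there are columns. The helper r_paste and the chunk strings are hypothetical stand-ins, not part of rpy2 or questionr.

```python
from itertools import cycle, islice


def r_paste(left, right, sep="."):
    """Mimic how R's paste() recycles two non-empty character vectors."""
    n = max(len(left), len(right))
    recycled_left = islice(cycle(left), n)
    recycled_right = islice(cycle(right), n)
    return [f"{a}{sep}{b}" for a, b in zip(recycled_left, recycled_right)]


# The three distinct answers found after splitting on "/".
levels = ["blue", "green", "red"]

# Called from R with a named variable: deparse() yields a single string.
named = r_paste(["data_vector"], levels)

# Called from rpy2 with an anonymous vector: deparse() yields the literal
# c("red/blue", ...) broken into several strings (four hypothetical chunks).
chunks = ["chunk1", "chunk2", "chunk3", "chunk4"]
anonymous = r_paste(chunks, levels)

print(len(named))      # 3 names for 3 columns
print(len(anonymous))  # 4 names for 3 columns, which R rejects
```

With three or fewer chunks the recycling still yields three names, which would be consistent with the shorter vector not triggering the error.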
An alternative is to use an R expression. An older post should provide the necessary bits for that: What object to pass to R from rpy2?
A third possibility is to creatively mix in Python strings as R code:
data = ["red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green", "red/green", "blue/red", "red/blue", "green"]
data_vector = robjects.StrVector(data)
# binding the R vector to a symbol in R's "GlobalEnv"
robjects.globalenv['mydata'] = data_vector
# the call is now in a Python string that is evaluated as R code
data_table = robjects.r("multi.split(mydata, split.char='/')")