rpy2: prevent StrVector from factorisation when put into a DataFrame

Question

In rpy2, I noticed that as soon as I put a StrVector into a DataFrame, it gets factorised. An example is below.

import rpy2.robjects as ro

series_1 = ("0", "0", "0", "0")
series_1_robject = ro.StrVector(series_1)  # => ['0', '0', '0', '0']
df = ro.DataFrame({"series_1": series_1_robject})    # => FactorVector [1, 1, 1, 1]

Also...

>>> df[0][1]
1

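It looks like my nice StrVector gets factorised when I build a DataFrame, so "0" corresponds to factor value 1 (R is 1-indexed), and so on. But how can I stop this from happening? It is very important to me that when the input vector (series_1) is 0,0,0...,0, its representation in the resulting DataFrame is 0 and not 1. So far I haven't really been able to find anything about this in the documentation....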

Answer 1

According to the comment here, you can prevent this conversion to FactorVector by wrapping the StrVector in a call to ro.r.I() (the "as-is" function in R):

In [1]: import rpy2.robjects as ro

In [2]: series_1 = ("0", "0", "0", "0")

In [3]: series_1_robject = ro.StrVector(series_1)

In [4]: df = ro.DataFrame({"series_1": series_1_robject})

In [5]: df.rx2("series_1")
Out[5]:
R object with classes: ('factor',) mapped to:
<FactorVector - Python:0x113a39368 / R:0x7f8d15882e40>
[       1,        1,        1,        1]

In [6]: df = ro.DataFrame({"series_1": ro.r.I(series_1_robject)})

In [7]: df.rx2("series_1")
Out[7]:
R object with classes: ('AsIs',) mapped to:
<StrVector - Python:0x113a398c0 / R:0x7f8d13a8aec8>
[str, str, str, str]
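
The same idea can be wrapped in a small helper so that every string column is protected with I() before the DataFrame is built. This is only a sketch: the function name keep_strings_frame is made up for illustration and is not part of rpy2, and it assumes rpy2 and a working R installation are available (the import is done inside the function so the definition itself loads without them).

```python
def keep_strings_frame(columns):
    """Build an R data.frame whose string columns stay character vectors.

    `columns` maps column names to sequences of Python strings.
    Hypothetical helper for illustration; not an rpy2 function.
    """
    # Imported lazily: requires rpy2 and an R installation at call time.
    import rpy2.robjects as ro

    as_is = ro.r["I"]  # R's "as-is" function, same as ro.r.I
    wrapped = {
        name: as_is(ro.StrVector(values))  # I() stops data.frame() from making a factor
        for name, values in columns.items()
    }
    return ro.DataFrame(wrapped)
```

Calling keep_strings_frame({"series_1": ("0", "0", "0", "0")}) and then df.rx2("series_1") should give back the column as a StrVector with class 'AsIs', as in Out[7] above, rather than a FactorVector.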
