rpy2: prevent a StrVector from being factorised when put into a DataFrame

In rpy2, I've noticed that as soon as I put a StrVector into a DataFrame, it gets factorised. An example follows.

import rpy2.robjects as ro

series_1 = ("0", "0", "0", "0")
series_1_robject = ro.StrVector(series_1)  # => ['0', '0', '0', '0']
df = ro.DataFrame({"series_1": series_1_robject})  # => column becomes a FactorVector [1, 1, 1, 1]

And also...

>>> df[0][1]
1

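It looks like my nice StrVector gets factorised when I build a DataFrame, so 0 corresponds to factor value 1 (R is 1-indexed), and so on. But how do I stop this from happening? It is very important to me that when the input vector (series_1) is 0,0,0...,0, its representation in the resulting DataFrame is 0, not 1. So far I haven't been able to find anything about this in the documentation.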

According to the comment here, you can prevent this conversion to FactorVector by wrapping the StrVector in a call to ro.r.I() (the "as-is" function in R):

In [1]: import rpy2.robjects as ro

In [2]: series_1 = ("0", "0", "0", "0")

In [3]: series_1_robject = ro.StrVector(series_1)

In [4]: df = ro.DataFrame({"series_1": series_1_robject})

In [5]: df.rx2("series_1")
Out[5]:
R object with classes: ('factor',) mapped to:
<FactorVector - Python:0x113a39368 / R:0x7f8d15882e40>
[       1,        1,        1,        1]

In [6]: df = ro.DataFrame({"series_1": ro.r.I(series_1_robject)})

In [7]: df.rx2("series_1")
Out[7]:
R object with classes: ('AsIs',) mapped to:
<StrVector - Python:0x113a398c0 / R:0x7f8d13a8aec8>
[str, str, str, str]
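
To see why the unwrapped column prints as 1, 1, 1, 1, here is a rough pure-Python model of what factorisation does. It is only a sketch: the helpers factorize and decode are hypothetical names made up for illustration, not part of rpy2 or R, and plain sorted() stands in for R's locale-aware ordering of levels.

```python
def factorize(values):
    """Rough model of an R factor: return (levels, codes) with 1-based codes."""
    # Levels are the distinct values in sorted order.
    # Assumption: sorted() approximates R's locale-aware sort.
    levels = sorted(set(values))
    # Each value is stored as the 1-based position of its level.
    position = {level: i + 1 for i, level in enumerate(levels)}
    codes = [position[v] for v in values]
    return levels, codes


def decode(levels, codes):
    """Map 1-based codes back to the original strings."""
    return [levels[code - 1] for code in codes]


series_1 = ("0", "0", "0", "0")
levels, codes = factorize(series_1)
print(levels)                 # ['0']
print(codes)                  # [1, 1, 1, 1]
print(decode(levels, codes))  # ['0', '0', '0', '0']
```

So the 1 seen in df[0][1] is the position of the string "0" among the levels, not a changed value. Wrapping the vector in ro.r.I() skips this encoding step entirely, which is why the column stays a StrVector.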