How to set a custom R installation for using rpy2 in Jupyter?

I have a conda environment that I expose as a kernel to my Jupyter instance by running:

python -m ipykernel install --user --name my-env-name --display-name "Python (my-env-name)"

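In this environment I want to use R from Jupyter through rpy2, using the %load_ext rpy2.ipython command to enable the %%R magic. However, rpy2 uses my global R rather than the one installed in my conda environment. I checked my R home with: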

%%R
R.home()

(I can also check the situation from within a Jupyter notebook with %run -m rpy2.situation, but whether this works seems to depend on the rpy2 version: at least for me, it threw UnboundLocalError: local variable 'rpy2' referenced before assignment in 3.1.0, and it worked in 3.2.1.)

How can I make my Jupyter notebook load the R installation from my conda environment?

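There are two ways to solve this: a local one (for a single Jupyter notebook) and a global one (for the kernel itself). Both come down to setting the R_HOME environment variable.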

Local: in your Jupyter notebook, before calling %load_ext rpy2.ipython, run:

import os
os.environ['R_HOME'] = '/home/your/anaconda3/envs/myenv/lib/R' #path to your R installation
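
If you would rather not hard-code the path, a minimal sketch is to derive it from the running interpreter. This assumes the kernel's Python lives inside the conda environment and that R sits under lib/R there (the layout seen in the path above, on Linux); the helper name conda_r_home is hypothetical:

```python
import os
import sys


def conda_r_home(prefix=sys.prefix):
    """Return <prefix>/lib/R, the R home assumed for a conda env on Linux."""
    return os.path.join(prefix, "lib", "R")


# Set R_HOME before running %load_ext rpy2.ipython
os.environ["R_HOME"] = conda_r_home()
```

If R is installed somewhere else in your setup, the explicit path from the snippet above is the safer choice.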

Global: find your kernel directory with jupyter kernelspec list and edit the kernel.json file in it. Update the JSON by adding "env": {"R_HOME":"/home/your/anaconda3/envs/my-env-name/lib/R"}, then restart the kernel (you may also need to restart Jupyter).
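
The same kernel.json edit can be scripted. This is only a sketch: the function names are hypothetical, and you have to pass in the kernel.json path that jupyter kernelspec list shows for your kernel. Existing "env" entries are kept and only R_HOME is added or overwritten:

```python
import json


def add_kernel_env(spec, r_home):
    """Return a copy of a kernelspec dict with R_HOME set in its "env" block."""
    updated = dict(spec)
    env = dict(updated.get("env", {}))  # keep any variables already defined
    env["R_HOME"] = r_home
    updated["env"] = env
    return updated


def patch_kernel_json(path, r_home):
    """Rewrite the kernel.json at `path` so the kernel starts with R_HOME set."""
    with open(path) as fh:
        spec = json.load(fh)
    with open(path, "w") as fh:
        json.dump(add_kernel_env(spec, r_home), fh, indent=1)
```

You would call patch_kernel_json with your own kernel.json path and R home, then restart the kernel as described above.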

Update (messed-up LD_LIBRARY_PATH)

Recently, I tried running rpy2 in Jupyter again after setting up a new environment with conda:

conda config --add channels conda-forge
conda config --set channel_priority strict
conda create -n myenv python=3.7
conda activate myenv
conda install r-essentials pandas rpy2

This time I ran into the following problem when trying %load_ext rpy2.ipython (Jupyter) or import rpy2.robjects (any script):

>>> import rpy2.robjects                                            
Warning message:                                                    
package ‘methods’ was built under R version 3.6.3     
Error: package or namespace load failed for ‘stats’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/stats/libs/stats.so':                  
  /home/your/anaconda3/envs/myenv/lib/R/library/stats/libs/stats.so: undefined symbol: MARK_NOT_MUTABLE
During startup - Warning messages:                                                                                          
1: package ‘datasets’ was built under R version 3.6.3      
2: package ‘utils’ was built under R version 3.6.3                                                                     
3: package ‘grDevices’ was built under R version 3.6.3  
4: package ‘graphics’ was built under R version 3.6.3                                                                       
5: package ‘stats’ was built under R version 3.6.3          
6: package ‘stats’ in options("defaultPackages") was not found                                                       
R[write to console]: Error: package or namespace load failed for ‘tools’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so':
  /home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so: undefined symbol: R_NewPreciousMSet

R[write to console]: Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so':
  /home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so: undefined symbol: R_NewPreciousMSet

R[write to console]: In addition:                                      
R[write to console]: Warning message:                        

R[write to console]: package ‘tools’ was built under R version 3.6.3

Traceback (most recent call last):                          
  File "<stdin>", line 1, in <module>                    
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/robjects/__init__.py", line 20, in <module>
    import rpy2.robjects.functions                                           
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/robjects/functions.py", line 12, in <module>
    from rpy2.robjects import help                                   
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/robjects/help.py", line 43, in <module>
    tools_ns = _get_namespace(StrSexpVector(('tools',)))          
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/rinterface_lib/conversion.py", line 44, in _
    cdata = function(*args, **kwargs)                                     
  File "/home/your/anaconda3/envs/myenv/lib/python3.7/site-packages/rpy2/rinterface.py", line 621, in __call__
    raise embedded.RRuntimeError(_rinterface._geterrmessage())                            
rpy2.rinterface_lib.embedded.RRuntimeError: Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object '/home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so':
  /home/your/anaconda3/envs/myenv/lib/R/library/tools/libs/tools.so: undefined symbol: R_NewPreciousMSet

The problem seems to be a messed-up R "situation" (check it with %run -m rpy2.situation in Jupyter or python -m rpy2.situation on the command line): its "R's additions to LD_LIBRARY_PATH:" entry pointed to an old, globally installed R version.

I had to unset LD_LIBRARY_PATH manually to fix this. This variable can be set or unset in the same way as R_HOME.
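
To check the effect without touching .bashrc or kernel.json first, here is a rough sketch that reruns rpy2.situation in a fresh process with LD_LIBRARY_PATH removed and R_HOME set. The helper names are hypothetical. As far as I know, on Linux the dynamic linker reads LD_LIBRARY_PATH when a process starts, which is why this passes a cleaned environment to a new process instead of editing os.environ in the running one:

```python
import os
import subprocess
import sys


def clean_r_env(base_env, r_home):
    """Copy an environment mapping, dropping LD_LIBRARY_PATH and setting R_HOME."""
    env = dict(base_env)
    env.pop("LD_LIBRARY_PATH", None)  # no error if it was never set
    env["R_HOME"] = r_home
    return env


def run_situation(r_home):
    """Run `python -m rpy2.situation` with the cleaned environment."""
    return subprocess.run(
        [sys.executable, "-m", "rpy2.situation"],
        env=clean_r_env(os.environ, r_home),
        check=False,
    )
```

If the report looks right in the cleaned process, the stale variable is the likely culprit, and the permanent fix is to remove it where it gets set.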

PS: I found that R_HOME and LD_LIBRARY_PATH were set in my .bashrc to point at a custom (built from source) R installation. That obviously confused the Jupyter kernel. Not smart ;)

PPS: rpy2.situation still gives me a Warning: The environment variable R_HOME differs from the default R in the PATH.:

Looking for R's HOME:
    Environment variable R_HOME: /home/your/anaconda3/envs/myenv/lib/R
    Calling `R RHOME`: /home/your/anaconda3/envs/jupyter-env/lib/R
    Environment variable R_LIBS_USER: None
    Warning: The environment variable R_HOME differs from the default R in the PATH.

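What worries me is that R apparently still defaults to the R from the Jupyter installation. So if anyone has any input on this, I would appreciate it.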
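
For anyone who wants to reproduce the comparison from inside the kernel, here is a small diagnostic sketch. It assumes R is on the PATH and uses the same R RHOME call that the report above shows; the helper names are hypothetical. It only detects the mismatch, it does not explain or fix it:

```python
import os
import shutil
import subprocess


def same_r_home(env_r_home, path_r_home):
    """True if both values are set and resolve to the same directory."""
    if not env_r_home or not path_r_home:
        return False
    return os.path.realpath(env_r_home) == os.path.realpath(path_r_home)


def r_home_from_path():
    """Ask the first R found on PATH for its home, or return None if there is no R."""
    r_binary = shutil.which("R")
    if r_binary is None:
        return None
    result = subprocess.run(
        [r_binary, "RHOME"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None
```

Calling same_r_home(os.environ.get("R_HOME"), r_home_from_path()) in a notebook cell tells you whether the kernel sees the same discrepancy that rpy2.situation warns about.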