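Capturing traceback from R when an RRuntimeError is raised

I have a Python class that executes R functions via rpy2, and I would like to capture the traceback from R when an R function raises an error.

The R code is legacy and risky to modify, so I would prefer to handle this on the Python side.

Here is what the Python code currently looks like: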

from rpy2.rinterface import RRuntimeError
from rpy2.robjects import DataFrame
from rpy2.robjects.packages import InstalledPackage

class RAdapter(BaseRAdapter):
    _module = None # type: InstalledPackage

    def call_raw(self, function_name, *args, **kwargs):
        # type: (str, tuple, dict) -> DataFrame
        """
        Invokes an R function and returns the result as a DataFrame.
        """
        try:
            return getattr(self._module, function_name)(*args, **kwargs)
        except RRuntimeError as e:
            # :todo: Capture traceback from R and attach to `e`.
            e.context = {'r_traceback': '???'}
            raise

    ...

How should I modify call_raw so that it captures the traceback from R when an R function raises an error?

traceback() is the go-to function in R for producing the traceback of an error. Using rpy2.robjects.r, you can evaluate traceback() and store the result directly in a Python variable.

Note for rpy2 v2.8.x: the result of traceback() is a pairlist, which rpy2 can work with just fine, but there's an issue that prevents repr from working correctly. To make the code easier to debug, the code below uses unlist to flatten the pairlist into a plain vector.

Note that traceback() also sends the traceback to stdout, and (as far as I know) there is no way to avoid this other than [temporarily] overriding sys.stdout.
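A minimal sketch of such a temporary override, assuming R's console output reaches Python through sys.stdout (this depends on how rpy2's console callbacks are set up, so treat it as an assumption rather than a guarantee). `call_quietly` is a hypothetical helper, not part of rpy2, and `noisy` is a stand-in for a call such as r('unlist(traceback())'):

```python
import contextlib
import io


def call_quietly(func, *args, **kwargs):
    """Call func while collecting anything written to sys.stdout.

    Returns a (result, captured_text) tuple. sys.stdout is restored when
    the with-block exits, even if func raises.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def noisy():
    # Stand-in for an R call that prints as a side effect.
    print("1: stop('boom')")
    return ["stop('boom')"]


result, printed = call_quietly(noisy)
```

redirect_stdout only intercepts writes made through Python's sys.stdout; anything written directly to the process's file descriptor from C code would not be captured this way.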

Here is how RAdapter.call_raw() can capture the R traceback:

# If you are using rpy2 < 3.4.5 change the next line to:
# from rpy2.rinterface import RRuntimeError
from rpy2.rinterface_lib.embedded import RRuntimeError

from rpy2.robjects import DataFrame, r
from rpy2.robjects.packages import InstalledPackage

class RAdapter(BaseRAdapter):
    _module = None # type: InstalledPackage

    def call_raw(self, function_name, *args, **kwargs):
        # type: (str, *typing.Any, **typing.Any) -> DataFrame
        """
        Invokes an R function and returns the result as a DataFrame.
        """
        try:
            return getattr(self._module, function_name)(*args, **kwargs)
        except RRuntimeError as e:
            # Attempt to capture the traceback from R.
            try:
                e.context = {
                    # :kludge: Have to use `unlist` because `traceback`
                    #   returns a pairlist, which rpy2 doesn't know how
                    #   to repr.
                    'r_traceback': '\n'.join(r('unlist(traceback())'))
                }
            except Exception as traceback_exc:
                e.context = {
                    'r_traceback':
                        '(an error occurred while getting traceback from R)',

                    'r_traceback_err':  traceback_exc,
                }

            raise

    ...

Tested with rpy2==2.8.3.
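For completeness, here is a rough sketch of how calling code might read the attached context. It runs without R: `FakeRError` and `failing_r_function` are hypothetical stand-ins for RRuntimeError and a failing R function, and the traceback text is made up for illustration.

```python
class FakeRError(Exception):
    """Stand-in for RRuntimeError so the sketch runs without R."""


def failing_r_function():
    raise FakeRError("Error in f(): boom")


def call_raw(func):
    try:
        return func()
    except FakeRError as e:
        # Attach extra context, then re-raise the same exception object.
        e.context = {'r_traceback': "2: stop('boom')\n1: f()"}
        raise


try:
    call_raw(failing_r_function)
except FakeRError as e:
    # The caller sees the original error plus the attached context.
    message = str(e)
    r_traceback = e.context['r_traceback']
```

Because the bare raise re-raises the same exception object, the `context` attribute set inside call_raw is still there when the caller catches it.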

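rpy2 can (mostly) handle R pairlists just fine. However, their representation (the __repr__ method) appears to have a bug: the generic __repr__ for R vectors uses slicing, and slicing does not work on pairlist objects.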

>>> from rpy2.robjects import baseenv
>>> opts = baseenv['.Options']
>>> opts.typeof # this is a pairlist
2
>>> print(opts) # working
...
>>> str(opts) # working
>>> opts.items() # working
>>> repr(opts) # ValueError: Cannot handle R type 2