Capturing traceback from R when an RRuntimeError is raised

Question

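I have a Python class that executes R functions via rpy2, and I would like to be able to capture the traceback from R when an R function raises an error.

The R code is legacy, so modifying it would be very risky; I would prefer to do something on the Python side.

Here is what the Python code looks like at the moment: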

from rpy2.rinterface import RRuntimeError
from rpy2.robjects import DataFrame
from rpy2.robjects.packages import InstalledPackage

class RAdapter(BaseRAdapter):
    _module = None # type: InstalledPackage

    def call_raw(self, function_name, *args, **kwargs):
        # type: (str, tuple, dict) -> DataFrame
        """
        Invokes an R function and returns the result as a DataFrame.
        """
        try:
            return getattr(self._module, function_name)(*args, **kwargs)
        except RRuntimeError as e:
            # :todo: Capture traceback from R and attach to `e`.
            e.context = {'r_traceback': '???'}
            raise

    ...

How should I modify call_raw so that it captures the traceback from R when the R function causes an error?

Answer 1

traceback() is the go-to function for producing the traceback of an error in R. Using rpy2.robjects.r, you can evaluate traceback() and store the result directly in a Python variable.

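Caveat for rpy2 v2.8.x: the result of traceback() is a pairlist, which rpy2 can work with just fine, but there is an issue that prevents repr from working correctly on it. To make the code easier to debug, the code below uses unlist to flatten the pairlist into a plain character vector.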

Note that traceback() also sends the traceback to stdout, and (as far as I know) there is no way to avoid this other than [temporarily] overriding sys.stdout.
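If the extra output is a nuisance, the temporary sys.stdout override mentioned above could look roughly like this. This is a minimal sketch that assumes rpy2's console output really does pass through Python's sys.stdout, as described above, and that Python 3.4+ is available for contextlib.redirect_stdout; call_quietly is a hypothetical helper name, not part of rpy2.

```python
import contextlib
import io


def call_quietly(func, *args, **kwargs):
    """Call func, returning its result plus whatever it wrote to sys.stdout."""
    buffer = io.StringIO()
    # sys.stdout is swapped for the buffer only inside this block and is
    # restored afterwards, even if func raises.
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()
```

With rpy2 this would be used as something like call_quietly(r, 'unlist(traceback())'), keeping the printed copy of the traceback out of the application's own output.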

Here is how RAdapter.call_raw() can capture the R traceback:

# If you are using rpy2 < 3.4.5 change the next line to:
# from rpy2.rinterface import RRuntimeError
from rpy2.rinterface_lib.embedded import RRuntimeError

from rpy2.robjects import DataFrame, r
from rpy2.robjects.packages import InstalledPackage

class RAdapter(BaseRAdapter):
    _module = None # type: InstalledPackage

    def call_raw(self, function_name, *args, **kwargs):
        # type: (str, *typing.Any, **typing.Any) -> DataFrame
        """
        Invokes an R function and returns the result as a DataFrame.
        """
        try:
            return getattr(self._module, function_name)(*args, **kwargs)
        except RRuntimeError as e:
            # Attempt to capture the traceback from R.
            try:
                e.context = {
                    # :kludge: Have to use `unlist` because `traceback`
                    #   returns a pairlist, which rpy2 doesn't know how
                    #   to repr.
                    'r_traceback': '\n'.join(r('unlist(traceback())'))
                }
            except Exception as traceback_exc:
                e.context = {
                    'r_traceback':
                        '(an error occurred while getting traceback from R)',

                    'r_traceback_err': traceback_exc,
                }

            raise

    ...

Tested with rpy2==2.8.3.
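On the calling side, the attached context can then be turned into a readable message. A rough sketch, assuming the exception carries the context dict set by call_raw above; describe_r_error is a made-up helper name used only for illustration.

```python
def describe_r_error(exc):
    """Return the error message followed by the R traceback, if one was attached."""
    # `context` is the attribute set in call_raw; it may be absent if the
    # exception was raised somewhere else, so fall back to an empty dict.
    context = getattr(exc, 'context', None) or {}
    r_traceback = context.get('r_traceback', '(no R traceback available)')
    return '{0}\n--- R traceback ---\n{1}'.format(exc, r_traceback)
```

A caller could wrap adapter.call_raw(...) in try/except RRuntimeError and log describe_r_error(e) before re-raising or handling the error.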

Answer 2

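rpy2 can (mostly) handle R pairlists just fine. However, their representation (the __repr__ method) appears to have a bug: the generic __repr__ for R vectors uses slicing, and slicing does not work with pairlist objects.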

>>> from rpy2.robjects import baseenv
>>> opts = baseenv['.Options']
>>> opts.typeof # this is a pairlist
2
>>> print(opts) # working
...
>>> str(opts) # working
>>> opts.items() # working
>>> repr(opts) # ValueError: Cannot handle R type 2
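Until that is fixed, debugging code can fall back to str(), which the session above shows still works for pairlists. A minimal sketch; safe_repr is a hypothetical helper, not something rpy2 provides.

```python
def safe_repr(obj):
    """Return repr(obj), falling back to str(obj) if repr raises ValueError."""
    try:
        return repr(obj)
    except ValueError:
        # The pairlist repr failure shown above surfaces as a ValueError;
        # str() uses a different code path and still works.
        return str(obj)
```

Only ValueError is caught here, so unrelated failures still propagate as usual.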

