RPython segfaults on os.write

I am trying to embed some RPython code into a Python script via ctypes. The RPython program is fairly simple:

# check.py
import os  # needed for os.write below

from rpython.rlib.entrypoint import entrypoint_highlevel
from rpython.rtyper.lltypesystem import rffi

@entrypoint_highlevel(key='main', c_name='hello', argtypes=[rffi.LONGLONG])
def hello(value):
    os.write(1, "hello world")
    return 0


def main(args):
    return 0


def target(*args):
    return main, None

I compile it in the straightforward way:

python /home/magniff/workspace/pypy3-v5.5.0-src/rpython/bin/rpython --shared check.py

This produces a shared object:

(venv) magniff@magniffy700:~/workspace/rfplib $ ls -la libcheck-c.so 
-rwxrwxr-x 1 magniff magniff 320112 июн 20 12:33 libcheck-c.so

So far so good, but when I try to run it via ctypes:

# script.py
import ctypes
l = ctypes.cdll.LoadLibrary("./libcheck-c.so")
l.hello(20)

it fails with a hard segfault:

(venv) magniff@magniffy700:~/workspace/rfplib $ gdb --args python script.py 
(gdb) r
Starting program: /home/magniff/Downloads/pypy3-v5.5.0-linux64/venv/bin/python script.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff646da6e in pypy_g_write () from ./libcheck-c.so
(gdb) bt
#0  0x00007ffff646da6e in pypy_g_write () from ./libcheck-c.so
#1  0x00007ffff64560ee in hello () from ./libcheck-c.so
#2  0x00007ffff6698e40 in ffi_call_unix64 () from /usr/lib/x86_64-linux-gnu/libffi.so.6
#3  0x00007ffff66988ab in ffi_call () from /usr/lib/x86_64-linux-gnu/libffi.so.6
#4  0x00007ffff68a83df in _ctypes_callproc () from /home/magniff/Downloads/pypy3-v5.5.0-linux64/venv/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so
#5  0x00007ffff68acd82 in ?? () from /home/magniff/Downloads/pypy3-v5.5.0-linux64/venv/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so
#6  0x00000000004b0cb3 in PyObject_Call ()
#7  0x00000000004c9faf in PyEval_EvalFrameEx ()
#8  0x00000000004c2765 in PyEval_EvalCodeEx ()
#9  0x00000000004c2509 in PyEval_EvalCode ()
#10 0x00000000004f1def in ?? ()
#11 0x00000000004ec652 in PyRun_FileExFlags ()
#12 0x00000000004eae31 in PyRun_SimpleFileExFlags ()
#13 0x000000000049e14a in Py_Main ()
#14 0x00007ffff7810830 in __libc_start_main (main=0x49dab0 <main>, argc=2, argv=0x7fffffffdda8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdd98) at ../csu/libc-start.c:291
#15 0x000000000049d9d9 in _start ()

I find this strange, because os.write works perfectly well when the program is compiled in standalone mode.

GDB details on the segfault:

(gdb) p $_siginfo._sifields._sigfault.si_addr = (void *) 0x0

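Hmm, okeeeeey, I guess that is a null pointer dereference. Could it be that the garbage collector kills the string before os.write has a chance to actually print it?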

And the disassembly of pypy_g_write:

   0x00007ffff6695a20 <+0>: push   %r14
   0x00007ffff6695a22 <+2>: push   %r13
   0x00007ffff6695a24 <+4>: mov    %rdi,%r14
   0x00007ffff6695a27 <+7>: push   %r12
   0x00007ffff6695a29 <+9>: push   %rbp
   0x00007ffff6695a2a <+10>:    lea    0x216acf(%rip),%rdi        # 0x7ffff68ac500 <pypy_g_rpython_memory_gc_incminimark_IncrementalMiniMar>
   0x00007ffff6695a31 <+17>:    push   %rbx
   0x00007ffff6695a32 <+18>:    mov    %rsi,%rbx
   0x00007ffff6695a35 <+21>:    mov    $0x4,%ebp
   0x00007ffff6695a3a <+26>:    sub    $0x10,%rsp
   0x00007ffff6695a3e <+30>:    mov    0x10(%rsi),%r13
   0x00007ffff6695a42 <+34>:    callq  0x7ffff6683550 <pypy_g_IncrementalMiniMarkGC_can_move>
   0x00007ffff6695a47 <+39>:    test   %al,%al
   0x00007ffff6695a49 <+41>:    jne    0x7ffff6695ae8 <pypy_g_write+200>
   0x00007ffff6695a4f <+47>:    lea    0x18(%rbx),%r12
   0x00007ffff6695a53 <+51>:    mov    0x216dae(%rip),%rax        # 0x7ffff68ac808 <pypy_g_rpython_memory_gctypelayout_GCData+40>
   0x00007ffff6695a5a <+58>:    mov    %r12,%rsi
   0x00007ffff6695a5d <+61>:    mov    %r14,%rdi
   0x00007ffff6695a60 <+64>:    lea    0x8(%rax),%rdx
   0x00007ffff6695a64 <+68>:    mov    %rdx,0x216d9d(%rip)        # 0x7ffff68ac808 <pypy_g_rpython_memory_gctypelayout_GCData+40>
   0x00007ffff6695a6b <+75>:    mov    %r13,%rdx
=> 0x00007ffff6695a6e <+78>:    mov    %rbx,(%rax)  # segfault happens there

Update

It looks like the problem lies with the default GC (minimark), which for some reason frees the memory too early. With --gc=ref the code runs correctly.

Thanks to Armin Rigo, we have a solution: to run this code correctly you must first initialize the RPython internals. Calling void rpython_startup_code(void) before calling the actual entry point is enough.
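
For reference, here is a minimal sketch of how script.py might look with that fix applied. The helper names init_rpython and call_hello are hypothetical, made up for this illustration; only rpython_startup_code and hello come from the library itself. The sketch assumes the startup function needs to run just once per process, before the first entry point call.

```python
# script.py (sketch)
import ctypes
import os


def init_rpython(lib):
    """Initialize the RPython runtime in an already loaded library.

    Hypothetical helper; assumed to be called once, before any entry point.
    """
    # Declared in the post as: void rpython_startup_code(void)
    lib.rpython_startup_code.argtypes = []
    lib.rpython_startup_code.restype = None
    lib.rpython_startup_code()


def call_hello(lib, value):
    """Call the `hello` entry point exported by check.py."""
    # check.py declares argtypes=[rffi.LONGLONG], so pass a C long long.
    lib.hello.argtypes = [ctypes.c_longlong]
    return lib.hello(value)


if __name__ == "__main__":
    path = "./libcheck-c.so"
    # Only touch the library if it has actually been built.
    if os.path.exists(path):
        l = ctypes.cdll.LoadLibrary(path)
        init_rpython(l)
        call_hello(l, 20)
```

Keeping the initialization in its own function makes it harder to call an entry point before the runtime has been set up, which is what caused the segfault above.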