_cgo_topofstack@@Base in a stripped binary

What does _cgo_topofstack@@Base mean in the context of a stripped binary built from Go?

$ cat simple.go
package main

import (
    "net"
    "strconv"
    "time"
)

func main() {
    tcpAddr, _ := net.ResolveTCPAddr("tcp4", ":7777")
    listener, _ := net.ListenTCP("tcp", tcpAddr)
    conn, _ := listener.Accept()
    daytime := time.Now().String()+strconv.Itoa(0xdeadface)
    conn.Write([]byte(daytime))
}

The binary should be stripped, so what does _cgo_topofstack@@Base mean?

$ go build -gcflags=-l -ldflags "-s -w" -o simple_wo_symbols simple.go
$ objdump -D -S simple_wo_symbols > simple_wo_symbols.human
$ sed -n "198899,198904p" simple_wo_symbols.human
  4b9860:   e8 db c1 fb ff          callq  475a40 <_cgo_topofstack@@Base+0xe4c0>
  4b9865:   48 8b 44 24 18          mov    0x18(%rsp),%rax
  4b986a:   48 89 44 24 70          mov    %rax,0x70(%rsp)
  4b986f:   48 8b 4c 24 20          mov    0x20(%rsp),%rcx
  4b9874:   48 89 4c 24 40          mov    %rcx,0x40(%rsp)
  4b9879:   ba ce fa ad de          mov    $0xdeadface,%edx

Edit (to better illustrate the problem):

Regarding _cgo_topofstack, you can see that it was introduced in its current form in Go 1.4, under the original name cgo_topofstack.

(However, as Peter Cordes noted, this does not explain why that symbol would still be present in a stripped binary.)

// Called from cgo wrappers, this function returns g->m->curg.stack.hi.
// Must obey the gcc calling convention.
TEXT cgo_topofstack(SB),NOSPLIT,$0
    get_tls(CX)
    MOVL    g(CX), AX
    MOVL    g_m(AX), AX
    MOVL    m_curg(AX), AX
    MOVL    (g_stack+stack_hi)(AX), AX
    RET

This fixed golang/go issue 8771:

cmd/cgo: C functions that return values fail if they call a Go callback that copies the stack

Cgo uses a wrapper function that calls C code, passing the address of the stack frame.
This wrapper function is compiled by GCC, and it calls the real function written by the user.

The user's function is permitted to call Go callbacks.
Those Go callbacks will run on the stack of the original caller.
They may cause a stack copy.

If the stack gets copied during a Go callback, then the caller of the GCC-compiled wrapper is running in a different location.
The stack frame pointer used by the GCC-compiled wrapper is not updated, since of course the stack copier knows nothing about GCC-compiled code.
I don't think this is a problem for the arguments to the function; they have already been copied out of the stack frame when the wrapper calls the real function.

However, it is a problem for C functions that return a value.
The wrapper will take the value returned by the C function, and store it using its pointer to the stack frame. That pointer will not have been updated if a stack copy occurs.
In other words, the wrapper may store the return value on the old stack, not the new one.

CL 144130043 added:

cgo: adjust return value location to account for stack copies.

During a cgo call, the stack can be copied.
This copy invalidates the pointer that cgo has into the return value area.

To fix this problem, pass the address of the location containing the stack top value (which is in the G struct).
For cgo functions which return values, read the stktop before and after the cgo call to compute the adjustment necessary to write the return value.

This was later modified in commit e1364a6.


The "@@" part should be the result of the --symbols option of objdump:

Displays the entries in symbol table section of the file, if it has one.
If a symbol has version information associated with it then this is displayed as well.

The version string is displayed as a suffix to the symbol name, preceded by an @ character. For example foo@VER_1.

If the version is the default version to be used when resolving unversioned references to the symbol then it is displayed as a suffix preceded by two @ characters. For example foo@@VER_2.
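
To make the naming rule above concrete, here is a small Go sketch that splits a displayed name into its base name and version. splitVersioned is a hypothetical helper written for this answer, not something from binutils or the Go toolchain, and it only looks at the text of the name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitVersioned splits a displayed symbol name into base name and version,
// and reports whether the version is the default one (the "@@" form).
func splitVersioned(s string) (name, version string, isDefault bool) {
	// Check "@@" first, otherwise "foo@@V" would be split at the first "@".
	if i := strings.Index(s, "@@"); i >= 0 {
		return s[:i], s[i+2:], true
	}
	if i := strings.Index(s, "@"); i >= 0 {
		return s[:i], s[i+1:], false
	}
	return s, "", false // unversioned symbol
}

func main() {
	for _, s := range []string{"foo@VER_1", "foo@@VER_2", "_cgo_topofstack@@Base", "main"} {
		name, ver, def := splitVersioned(s)
		fmt.Printf("%-24s name=%s version=%q default=%v\n", s, name, ver, def)
	}
}
```

Read this way, _cgo_topofstack@@Base is the symbol _cgo_topofstack with a default version named Base.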

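_cgo_topofstack@@Base is a symbol that, for some reason, is still present in your stripped binary. Your call targets an address 0xe4c0 bytes past it; whatever function lives there is completely unrelated to the actual _cgo_topofstack code.

It's normal for disassemblers to describe addresses as symbol+offset.

This style makes sense for data arrays (e.g. x = global_array[10] compiles to a load from global_array+40, if the symbol for global_array is still present) and for jumps within a function. It's usually not helpful in a case like this one, other than letting you see what's nearby and giving you smaller numbers to look at.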
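
The global_array+40 figure is just index times element size. A quick Go check, assuming 4-byte elements (int32 here, since 10 * 4 = 40); globalArray and elemOffset are names made up for this sketch:

```go
package main

import (
	"fmt"
	"unsafe"
)

// globalArray stands in for global_array; 4-byte elements are an assumption.
var globalArray [16]int32

// elemOffset returns the byte offset of element i from the start of the array.
func elemOffset(i int) uintptr {
	return uintptr(i) * unsafe.Sizeof(globalArray[0])
}

func main() {
	// Cross-check the arithmetic against the real addresses of the elements.
	byAddr := uintptr(unsafe.Pointer(&globalArray[10])) -
		uintptr(unsafe.Pointer(&globalArray[0]))
	fmt.Printf("global_array+%d (address difference: %d)\n", elemOffset(10), byAddr)
}
```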

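Rather than implementing fancy logic to decide whether to print a symbol+offset version of an address instead of just the numeric absolute address, it's much easier (and carries no risk of getting it wrong) for the disassembler to always do it: search backward from the address and take the first symbol found. Or, for addresses before the first symbol in a section, print foo - 0x.... It's up to humans to use judgment and experience to make sense of the output, especially when looking at the disassembly of a stripped binary.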
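
A minimal Go sketch of that "closest symbol at or below the address" lookup follows. It is not objdump's implementation: the symbol type and symbolize function are invented here, sections are ignored, and the symbol address 0x467580 is simply the call target 0x475a40 from the question minus the printed offset 0xe4c0. The second table entry is entirely made up.

```go
package main

import (
	"fmt"
	"sort"
)

// symbol is a simplified symbol-table entry (hypothetical type).
type symbol struct {
	name string
	addr uint64
}

// symbolize describes addr relative to the closest symbol at or below it.
// syms must be sorted by ascending address.
func symbolize(syms []symbol, addr uint64) string {
	// i is the index of the first symbol located strictly above addr.
	i := sort.Search(len(syms), func(j int) bool { return syms[j].addr > addr })
	switch {
	case len(syms) == 0:
		return fmt.Sprintf("%#x", addr) // nothing to anchor to
	case i == 0:
		// addr lies before the first symbol: print a negative offset.
		return fmt.Sprintf("%s-%#x", syms[0].name, syms[0].addr-addr)
	case syms[i-1].addr == addr:
		return syms[i-1].name
	default:
		return fmt.Sprintf("%s+%#x", syms[i-1].name, addr-syms[i-1].addr)
	}
}

func main() {
	syms := []symbol{
		{"_cgo_topofstack@@Base", 0x467580}, // derived: 0x475a40 - 0xe4c0
		{"some_later_symbol", 0x500000},     // made up for illustration
	}
	fmt.Println(symbolize(syms, 0x475a40))
	fmt.Println(symbolize(syms, 0x467500))
}
```

With only a handful of symbols left after stripping, almost every code address ends up expressed relative to whichever symbol happens to precede it, which is why the name in the output says nothing about the function actually being called.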

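(There's no flag a disassembler can check to detect that a binary has been stripped; detecting it would be a matter of heuristics, such as noticing that most direct call targets are addresses without a symbol of their own.)

AFAIK, GNU Binutils objdump has no option to not print the symbolic version of addresses. --no-addresses does something different.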


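I'm not sure what @@Base means. It doesn't seem to be unique to Go, though. On my x86-64 Arch GNU/Linux system, objdump -d /bin/ls (a stripped PIE executable) shows lots of addresses like 22d60 <_obstack_memory_used@@Base+0xc2a0>. So that just happens to be the last symbol before most of that program's code.

Other cases of @@ include glibc symbol ABI versioning in the same binary, e.g. 23298 <optarg@@GLIBC_2.2.5>. This Arch Linux binary was compiled on an up-to-date Arch Linux system and isn't actually linked against an ancient glibc 2.2.5, but I think it means that the type of optarg (or something like that) hasn't changed since glibc 2.2.5. Possibly not since even earlier, but 2.2.5 may be when glibc started naming symbols this way. Take this paragraph with a grain of salt, because I don't really know how libc.so arranges for ld to replace symbol names like stderr with these @@ version names, or the history behind it.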