How to provide an implementation of memcpy

I am trying to write some bare-metal code with a memset-style loop:

for (int i = 0; i < N; ++i) {
  arr[i] = 0;
}

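It is compiled with GCC, and GCC is smart enough to turn that loop into a call to memset(). Unfortunately, because this is bare metal, I have no memset() (which normally lives in libc), so I get a link error: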

 undefined reference to `memset'

It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns:

Perform loop distribution of patterns that can be code generated with calls to a library. This flag is enabled by default at -O2 and higher, and by -fprofile-use and -fauto-profile.

So one person's solution was simply to lower the optimisation level. Not very satisfying.

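I also found this really helpful page, which explains that -ffreestanding is not enough to stop GCC from doing this, and that there is basically no option but to provide your own implementations of memcpy, memmove, memset and memcmp. I'm happy to do that, but how?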

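If I just write memset, the compiler will detect the loop inside it and transform it into a call to memset! In fact, in the code provided by the vendor of the CPU I'm using, I found this comment: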

/*
// This is commented out because the assembly code that the compiler generates appears to be
// wrong.  The code would recursively call the memset function and eventually overruns the
// stack space.
void * memset(void *dest, int ch, size_t count)
...

So I assume this is the issue they ran into.

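How do I provide a C implementation of memset without the compiler optimising it into a call to itself, and without disabling that optimisation everywhere?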

Aha, I checked the glibc code and there's an inhibit_loop_to_libcall modifier which sounds like it should do this. It is defined like this:

/* Add the compiler optimization to inhibit loop transformation to library
   calls.  This is used to avoid recursive calls in memset and memmove
   default implementations.  */
#ifdef HAVE_CC_INHIBIT_LOOP_TO_LIBCALL
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif
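As a rough sketch of how that modifier might be applied, assuming GCC: the macro name INHIBIT_LOOP_TO_LIBCALL and the function name bare_memset below are made up for illustration (in a real freestanding build the function would have to be called memset so the linker can resolve the compiler-generated calls). Other compilers will most likely ignore the attribute, and GCC's documentation describes the optimize attribute as intended mainly for debugging, so treat this as something to verify in the generated assembly rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical macro modelled on glibc's inhibit_loop_to_libcall.
   Only real GCC gets the attribute; elsewhere it expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__)
# define INHIBIT_LOOP_TO_LIBCALL \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define INHIBIT_LOOP_TO_LIBCALL
#endif

/* Byte-by-byte fill. The name bare_memset is an assumption made so the
   sketch can be built next to a hosted libc; rename it to memset for a
   bare-metal target. The attribute asks GCC not to turn this loop back
   into a memset call, which would make the function call itself. */
INHIBIT_LOOP_TO_LIBCALL
void *bare_memset(void *dest, int ch, size_t count)
{
    unsigned char *p = dest;

    while (count--) {
        *p++ = (unsigned char)ch;
    }
    return dest;
}
```

The optimisation stays enabled for the rest of the program; only this one function is compiled without it.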

You mention in your question:

It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns

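All you need to do to turn this optimisation off is pass -fno-tree-loop-distribute-patterns to the compiler. Note that this disables the optimisation globally, for every function in every translation unit compiled with that flag.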
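For illustration, here is the loop from the question wrapped in a small function, with one possible compiler invocation in the leading comment. The file name zero.c, the function name zero_array and the exact command line are assumptions; only the -fno-tree-loop-distribute-patterns flag itself comes from the GCC documentation quoted above.

```c
/* zero.c (hypothetical file name)
 *
 * One possible way to build it, assuming a plain GCC command line:
 *
 *   gcc -O2 -ffreestanding -fno-tree-loop-distribute-patterns -c zero.c
 *
 * With the flag present, GCC should no longer replace the loop below
 * with a call to memset; inspecting the output of -S is the way to
 * confirm this for a given compiler version.
 */
#include <stddef.h>

/* Hypothetical helper: sets n ints starting at arr to zero. */
void zero_array(int *arr, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        arr[i] = 0;
    }
}
```

Because the flag applies to the whole translation unit, a common compromise is to pass it only when compiling the file that contains your memset, memcpy, memmove and memcmp implementations.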