How to provide an implementation of memcpy

Question

I'm trying to write some bare-metal code with a memset-style loop in it:

for (int i = 0; i < N; ++i) {
  arr[i] = 0;
}

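It is compiled with GCC, and GCC is smart enough to turn the loop into a call to memset(). Unfortunately, because this is bare metal, I don't have memset() (it normally lives in libc), so I get a link error: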

 undefined reference to `memset'

It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns:

Perform loop distribution of patterns that can be code generated with calls to a library. This flag is enabled by default at -O2 and higher, and by -fprofile-use and -fauto-profile.

So one person's solution was simply to lower the optimisation level. Not very satisfying.

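I also found this really helpful page, which explains that -ffreestanding is not enough to stop GCC from doing this, and that there is basically no choice but to provide your own implementations of memcpy, memmove, memset and memcmp. I'm happy to do that, but how?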

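If I just write memset, the compiler will detect the loop inside it and transform it into a call to memset! In fact, in the code provided by the vendor of the CPU I'm using, I found this comment: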

/*
// This is commented out because the assembly code that the compiler generates appears to be
// wrong.  The code would recursively call the memset function and eventually overruns the
// stack space.
void * memset(void *dest, int ch, size_t count)
...

So I assume this is the same problem they ran into.

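How do I provide a C implementation of memset without the compiler optimising it into a call to itself, and without disabling that optimisation?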

Answer 1

Aha, I checked the glibc code and there's an inhibit_loop_to_libcall modifier which sounds like it should do this. It is defined like this:

/* Add the compiler optimization to inhibit loop transformation to library
   calls.  This is used to avoid recursive calls in memset and memmove
   default implementations.  */
#ifdef HAVE_CC_INHIBIT_LOOP_TO_LIBCALL
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif
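
For illustration, here is a minimal sketch of how such a modifier might be applied to a byte-wise memset, assuming GCC. The name bare_memset is hypothetical, chosen so that the snippet also builds on a hosted system without clashing with libc; in a bare-metal build the function would be named memset. The macro is enabled by a plain compiler check here, whereas glibc gates it on HAVE_CC_INHIBIT_LOOP_TO_LIBCALL.

```c
#include <stddef.h>

/* Assumption: only GCC is known to honour this optimize attribute, so other
   compilers get an empty macro (and no protection against the transformation). */
#if defined(__GNUC__) && !defined(__clang__)
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif

/* Hypothetical name; rename to memset in a freestanding build. */
inhibit_loop_to_libcall
void *bare_memset(void *dest, int ch, size_t count)
{
    unsigned char *p = dest;

    /* Plain byte loop: with the attribute in effect, GCC should not turn
       this loop back into a call to memset. */
    while (count--)
        *p++ = (unsigned char)ch;

    return dest;
}
```

The attribute affects only the function it is attached to, so the rest of the program can still benefit from the loop-to-libcall optimisation.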

Answer 2

You mentioned in your question:

It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns

All you need to do to turn this optimisation off is pass -fno-tree-loop-distribute-patterns to the compiler. This turns the optimisation off globally.
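
If switching it off for the whole program is broader than you want, one possible middle ground is to confine the flag to the translation unit that holds the memory routines, either by passing it only when compiling that file or, as sketched here, with GCC's `#pragma GCC optimize`. This is a rough sketch, not a tested bare-metal recipe: bare_memcpy is a hypothetical name (it would be memcpy in a freestanding build), and the option string follows the documented convention that the pragma takes the option name without its -f prefix.

```c
#include <stddef.h>

/* Assumption: GCC only. Applies to every function defined after this point
   in the file, as if each carried the corresponding optimize attribute. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize ("no-tree-loop-distribute-patterns")
#endif

/* Hypothetical name; rename to memcpy in a freestanding build. */
void *bare_memcpy(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Byte-by-byte copy; the regions must not overlap. */
    while (count--)
        *d++ = *s++;

    return dest;
}
```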
