How to provide an implementation of memcpy
I'm trying to write some bare-metal code with a memset-style loop in it:
for (int i = 0; i < N; ++i) {
arr[i] = 0;
}
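It is compiled with GCC, and GCC is smart enough to turn that into a call to memset(). Unfortunately, because this is bare metal, I have no memset() (it normally lives in libc), so I get a link error.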
undefined reference to `memset'
It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns:
Perform loop distribution of patterns that can be code generated with calls to a library. This flag is enabled by default at -O2 and higher, and by -fprofile-use and -fauto-profile.
So one person's solution was simply to lower the optimisation level. Not very satisfying.
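I also found this really helpful page, which explains that -ffreestanding is not enough to stop GCC from doing this, and that there is basically no option but to provide your own implementations of memcpy, memmove, memset and memcmp. I'm happy to do that, but how?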
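If I just write memset, the compiler will detect the loop inside it and transform it into a call to memset! In fact, in the code provided by the vendor of the CPU I'm using, I found this comment: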
/*
// This is commented out because the assembly code that the compiler generates appears to be
// wrong. The code would recursively call the memset function and eventually overruns the
// stack space.
void * memset(void *dest, int ch, size_t count)
...
So I assume this is the problem they ran into.
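How do I provide a C implementation of memset without the compiler optimising it into a call to itself, and without disabling that optimisation?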
Aha, I checked the glibc code and there's an inhibit_loop_to_libcall modifier which sounds like it should do this. It is defined like this:
/* Add the compiler optimization to inhibit loop transformation to library
calls. This is used to avoid recursive calls in memset and memmove
default implementations. */
#ifdef HAVE_CC_INHIBIT_LOOP_TO_LIBCALL
# define inhibit_loop_to_libcall \
__attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif
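To make the idea concrete, here is a minimal sketch of how that attribute might be applied to a hand-written memset. It assumes GCC; Clang does not support the optimize attribute, so the macro expands to nothing there. The function is given the hypothetical name bare_memset so the sketch can be compiled next to a hosted libc; in a freestanding build it would be named memset. The HAVE_CC_INHIBIT_LOOP_TO_LIBCALL check from glibc's configure is replaced here by a plain compiler test, which is my assumption rather than anything glibc does.

```c
#include <stddef.h>

/* GCC only: switch off the loop-to-libcall transformation per function.
   On other compilers the macro expands to nothing (assumption: they either
   do not perform the transformation or need a different mechanism). */
#if defined(__GNUC__) && !defined(__clang__)
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif

/* Hypothetical name; a freestanding build would call this memset. */
inhibit_loop_to_libcall
void *bare_memset(void *dest, int ch, size_t count)
{
    unsigned char *p = dest;

    /* Plain byte loop: with the attribute in place, GCC should leave it
       as a loop instead of replacing it with a call to memset. */
    while (count--) {
        *p++ = (unsigned char)ch;
    }
    return dest;
}
```

It is worth inspecting the generated assembly for the function to confirm that no call to memset remains, since the attribute only affects the one function it is attached to.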
You mention in the question:
It seems like the optimisation that does this transformation is -ftree-loop-distribute-patterns
All you need to do to turn this optimisation off is pass -fno-tree-loop-distribute-patterns to the compiler. This turns the optimisation off globally.
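If turning it off for the whole build is too coarse, one possible middle ground (my suggestion, not something taken from the question or the GCC page quoted above) is to keep the hand-written routines in their own source file and restrict the option to that file with GCC's optimize pragma. The sketch below assumes GCC and uses the hypothetical names bare_memcpy and bare_memmove; in a freestanding build they would be memcpy and memmove.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC only: everything between push_options and pop_options is compiled
   with the loop-to-libcall transformation disabled. */
#if defined(__GNUC__) && !defined(__clang__)
# pragma GCC push_options
# pragma GCC optimize ("no-tree-loop-distribute-patterns")
#endif

/* Hypothetical name; a freestanding build would call this memcpy. */
void *bare_memcpy(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (count--) {
        *d++ = *s++;
    }
    return dest;
}

/* Hypothetical name; a freestanding build would call this memmove. */
void *bare_memmove(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination is below source: copy forwards. */
        while (count--) {
            *d++ = *s++;
        }
    } else {
        /* Destination is at or above source: copy backwards so
           overlapping bytes are read before they are overwritten. */
        d += count;
        s += count;
        while (count--) {
            *--d = *--s;
        }
    }
    return dest;
}

#if defined(__GNUC__) && !defined(__clang__)
# pragma GCC pop_options
#endif
```

The rest of the program is still compiled with the optimisation enabled, so ordinary zeroing and copying loops elsewhere can continue to be turned into library calls that these routines then satisfy.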