Copy bytes without memcpy()

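This is a homework assignment. I want to implement memcpy(). I was told that the memory areas must not overlap. Actually I don't understand what that means, because this code works fine, yet there is a possibility of memory overlap. How can I prevent it?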

void *mem_copy(void *dest, const void *src, unsigned int n) {
    assert((src != NULL) && (n > 0));  
    int i = 0;
    char *newsrc = (char*)src;
    char *newdest = (char*)dest;
    while (i < n) {
        newdest[i] = newsrc[i];
        i++;
    }
    newdest[i]='\0';
    return newdest;
}

Actually I don't understand what that means [for memory to overlap]

Consider this example:

char data[100];
memcpy(&data[5], &data[0], 95);

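From the program's point of view, the address range from src to src+n must not overlap the range from dest to dest+n. In the call above, the source covers data[0] through data[94] and the destination covers data[5] through data[99], so the two ranges share 90 bytes.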
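To make the overlap condition concrete, here is a small sketch that checks whether two n-byte ranges inside the same array overlap. The names ranges_overlap and overlap_demo, and the choice of offsets instead of pointers, are assumptions made for illustration; working with offsets avoids any question about comparing unrelated pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reports whether [src_off, src_off + n) and
 * [dest_off, dest_off + n) share at least one byte of the same array. */
bool ranges_overlap(size_t src_off, size_t dest_off, size_t n) {
    if (n == 0) {
        return false;                   /* empty ranges never overlap */
    }
    if (src_off <= dest_off) {
        return dest_off - src_off < n;  /* dest starts inside src range */
    }
    return src_off - dest_off < n;      /* src starts inside dest range */
}

/* Usage sketch for the data[100] call shown above. */
void overlap_demo(void) {
    assert(ranges_overlap(0, 5, 95));   /* bytes 5..94 are shared */
    assert(!ranges_overlap(0, 5, 5));   /* [0,5) and [5,10) only touch */
}
```

The check reduces to "the distance between the two start offsets is smaller than n", which is the same condition regardless of which range comes first.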

if there is possibility of memory overlap, how to prevent it?

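You can make your algorithm work with or without overlap by copying from the back (last byte first) whenever the address of src is numerically lower than the address of dest.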

Note: since you are implementing memcpy, not strcpy, forcing null termination with newdest[i]='\0' is incorrect and needs to be removed.

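When the source and destination memory blocks overlap and your loop copies elements one after the other starting from index 0, it works for dest < source but not for dest > source (because you overwrite elements before you have copied them). Copying from the last index down has the opposite property: it works for dest > source but not for dest < source.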

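Your code starts copying from index 0, so you can simply test which situations work and which do not. See the following test code: it shows that moving a test string forward fails, whereas moving the string backward works fine. It also shows that moving the test string forward works when copying from the back: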

#include <stdio.h>
#include <string.h>

void *mem_copy(void *dest, const void *src, size_t n) {
    size_t i = 0;
    char* newsrc = (char*)src;
    char* newdest = (char*)dest;
    while(i < n) {
        newdest[i] = newsrc[i];
        i++;
    }
    return newdest;
}

void *mem_copy_from_backward(void *dest, const void *src, size_t n) {
    size_t i;
    char* newsrc = (char*)src;
    char* newdest = (char*)dest;
    for (i = n; i-- > 0;) {
        newdest[i] = newsrc[i];
    }
    return newdest;
}

int main() {

    const char* testcontent = "Hello world!";
    char teststr[100] = "";

    printf("move teststring two places forward:\n");
    strcpy(teststr, testcontent);
    size_t length = strlen(teststr);
    printf("teststr before mem_copy: %s\n", teststr);
    mem_copy(teststr+2, teststr, length+1);
    printf("teststr after mem_copy: %s\n", teststr);

    printf("\nmove teststring two places backward:\n");
    strcpy(teststr, testcontent);
    length = strlen(teststr);
    printf("teststr before mem_copy: %s\n", teststr);
    mem_copy(teststr, teststr+2, length+1);
    printf("teststr after mem_copy: %s\n", teststr);

    printf("\nmove teststring two places forward using copy_from_backward:\n");
    strcpy(teststr, testcontent);
    length = strlen(teststr);
    printf("teststr before mem_copy: %s\n", teststr);
    mem_copy_from_backward(teststr+2, teststr, length+1);
    printf("teststr after mem_copy: %s\n", teststr);
}

Output:

move teststring two places forward:
teststr before mem_copy: Hello world!
teststr after mem_copy: HeHeHeHeHeHeHeH

move teststring two places backward:
teststr before mem_copy: Hello world!
teststr after mem_copy: llo world!

move teststring two places forward using copy_from_backward:
teststr before mem_copy: Hello world!
teststr after mem_copy: HeHello world!

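So one can write a function that decides whether to start copying from index 0 or from index n, depending on whether the caller wants to copy forward or backward. The tricky part is finding out which of the two the caller is doing, because a relational comparison of src and dest such as if (src < dest) copy_from_backward(...) is actually not permitted in every case (cf. the standard, e.g. this draft):

6.5.8 Relational operators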

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.

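Although I have never been in a situation where src < dest did not give me the desired result, comparing two pointers this way is actually undefined behavior if they do not point into the same array.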

Hence, if you ask "how to prevent it?", I think the only correct answer must be: "It's subject to the caller, because function mem_copy cannot decide whether it may compare src and dest correctly."
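If a best-effort check inside the function is still wanted, one commonly used workaround is to compare the integer values of the pointers instead of the pointers themselves. The sketch below is not a portable guarantee: uintptr_t is an optional type, the pointer-to-integer conversion is implementation-defined, and the code assumes a flat address space in which integer order matches address order. The names dest_is_above_src and direction_demo are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: picks the copy direction by comparing the integer
 * representations of the pointers. Assumes uintptr_t exists and that its
 * ordering matches address ordering, which is typical for flat-memory
 * platforms but is not promised by the C standard. */
bool dest_is_above_src(const void *dest, const void *src) {
    return (uintptr_t)dest > (uintptr_t)src;
}

/* Usage sketch: both pointers refer to the same array here, so the
 * result also agrees with an ordinary pointer comparison. */
void direction_demo(void) {
    char buf[8] = {0};
    assert(dest_is_above_src(buf + 2, buf));
    assert(!dest_is_above_src(buf, buf + 2));
}
```

A copy routine built on this helper would copy backward when dest_is_above_src() returns true and forward otherwise. Copying backward is also harmless when the two areas do not overlap at all.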

There are some problems with your re-implementation of memcpy():

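  • The type of the size argument n should be size_t. The index variable i should have the same type as the size argument.

  • It is OK to pass a count of 0. Indeed, your code would behave correctly in this case, so remove the n > 0 test from the assert().

  • Avoid casting away the const qualifier unless absolutely necessary.

  • Do not append a '\0' at the end of the destination: it is incorrect and will cause buffer overflows.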

Here is a corrected version:

#include <assert.h>
#include <stddef.h>

void *mem_copy(void *dest, const void *src, size_t n) {
    assert(n == 0 || (src != NULL && dest != NULL));
    size_t i = 0;
    const char *newsrc = (const char *)src;
    char *newdest = (char *)dest;
    while (i < n) {
        newdest[i] = newsrc[i];
        i++;
    }
    return dest;
}

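Regarding the potential overlap between the source and destination areas, your code's behavior will be surprising if the destination pointer is greater than the source pointer but still inside the source area: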

char buffer[10] = "12345";
printf("before: %s\n", buffer);
mem_copy(buffer + 1, buffer, 5);
printf("after: %s\n", buffer);

This will output:

before: 12345
after: 111111

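There is no completely portable way to test for such overlap, but it is easy on non-exotic architectures, at a small cost in execution time and code size. The semantics of memcpy() assume that the library performs no such test, so the programmer should call this function only when the source and destination areas cannot possibly overlap. When in doubt, use memmove(), which handles overlapping areas correctly.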
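As an illustration of the memmove() advice, here is the same one-byte shift performed with the standard function. The wrapper names shift_right_by_one and memmove_demo are made up for this sketch; only memmove() itself comes from the C library.

```c
#include <assert.h>
#include <string.h>

/* Shifts the first five bytes of buf one position to the right.
 * memmove() behaves as if the source bytes were first copied to a
 * temporary buffer, so the overlapping ranges are handled correctly. */
void shift_right_by_one(char *buf) {
    memmove(buf + 1, buf, 5);
}

/* Usage sketch with the same buffer as the mem_copy example. */
void memmove_demo(void) {
    char buffer[10] = "12345";
    shift_right_by_one(buffer);
    assert(strcmp(buffer, "112345") == 0);
}
```

Unlike the forward-copying mem_copy, the result keeps every original character: the buffer holds 112345 rather than 111111.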

If you want mem_copy to assert that the two areas do not overlap, here is a mostly portable check:

assert(n == 0 || newdest + n <= newsrc || newdest >= newsrc + n);

Here is a simple rewrite of memmove(), although it is not completely portable:

#include <assert.h>
#include <stddef.h>

void *mem_move(void *dest, const void *src, size_t n) {
    assert(n == 0 || (src != NULL && dest != NULL));
    const char *newsrc = (const char *)src;
    char *newdest = (char *)dest;
    if (newdest <= newsrc || newdest >= newsrc + n) {
        /* Copying forward */
        for (size_t i = 0; i < n; i++) {
            newdest[i] = newsrc[i];
        }
    } else {
        /* Copying backwards */
        for (size_t i = n; i-- > 0;) {
            newdest[i] = newsrc[i];
        }
    }
    return dest;
}
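To check a function like this, a small test helper can exercise both overlap directions. The sketch below is an assumption-laden example, not part of the answer: the names move_fn, handles_overlap and handles_overlap_demo are hypothetical, and the demo passes the standard memmove() so that the block compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Type of a memmove-like function, such as mem_move above or memmove(). */
typedef void *(*move_fn)(void *dest, const void *src, size_t n);

/* Hypothetical test helper: returns true if `move` copies correctly with
 * overlap both when dest lies above src and when dest lies below src. */
bool handles_overlap(move_fn move) {
    char up[16] = "Hello world!";
    char down[16] = "Hello world!";
    size_t len = strlen("Hello world!");

    /* dest > src: shift the string and its terminator right by two. */
    move(up + 2, up, len + 1);
    /* dest < src: shift the tail and its terminator left by two. */
    move(down, down + 2, len - 2 + 1);

    return strcmp(up, "HeHello world!") == 0
        && strcmp(down, "llo world!") == 0;
}

/* Usage sketch with the standard library function. */
void handles_overlap_demo(void) {
    assert(handles_overlap(memmove));
}
```

Passing mem_move instead of memmove should also return true, while a forward-only copy such as mem_copy is expected to fail the first comparison.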