What happens when strnlen() is used with a larger maximum length than the buffer size actually is?

I wrote the following code to better understand how strnlen behaves:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char bufferOnStack[10]={'a','b','c','d','e','f','g','h','i','j'};
    char *bufferOnHeap = (char *) malloc(10);

    bufferOnHeap[ 0]='a';
    bufferOnHeap[ 1]='b';
    bufferOnHeap[ 2]='c';
    bufferOnHeap[ 3]='d';
    bufferOnHeap[ 4]='e';
    bufferOnHeap[ 5]='f';
    bufferOnHeap[ 6]='g';
    bufferOnHeap[ 7]='h';
    bufferOnHeap[ 8]='i';
    bufferOnHeap[ 9]='j';

    int lengthOnStack = strnlen(bufferOnStack,39);
    int lengthOnHeap  = strnlen(bufferOnHeap, 39);

    printf("lengthOnStack = %d\n",lengthOnStack);
    printf("lengthOnHeap  = %d\n",lengthOnHeap);

    return 0;
}

Note the deliberate lack of null termination in both buffers. According to the documentation, it seems the lengths should both be 39:

RETURN VALUE The strnlen() function returns strlen(s), if that is less than maxlen, or maxlen if there is no null terminating ('\0') among the first maxlen characters pointed to by s.

Here is my compile line:

$ gcc ./main_08.c -o main

And the output:

$ ./main
lengthOnStack = 10
lengthOnHeap  = 10

What is going on here? Thanks!

First, don't cast malloc.

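Second, you are reading past the end of the array. The contents of memory beyond the array's bounds are undefined, so there is no guarantee that a byte there is non-zero; in this case, one is!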

In general, this kind of behaviour is sloppy. See this answer for a good summary of the potential consequences.

First of all, strnlen() is not defined by the C standard; it is a POSIX standard function.

That said, read the documentation carefully:

The strnlen() function returns the number of bytes in the string pointed to by s, excluding the terminating null byte ('\0'), but at most maxlen. In doing this, strnlen() looks only at the first maxlen bytes at s and never beyond s+maxlen.

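So this means that, when you call the function, you need to make sure that, for the value you supply as maxlen, the array indices up to [maxlen - 1] are valid for the supplied string, i.e., the string has at least maxlen elements (or contains a null byte before that point).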

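Otherwise, while accessing the string, you will venture into memory that is not allocated to you (an out-of-bounds array access), thereby invoking undefined behaviour.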

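Remember, this function computes the length of an array, capped at a value (maxlen). That implies the supplied array is at least as large as the bound, not the other way round.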
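As a minimal sketch of that rule, assuming a POSIX system where strnlen is available: pass the buffer's real capacity as maxlen, so the scan can never leave the buffer. The names length_in_buffer and stack_example are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* ask the headers to declare POSIX strnlen() */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the text in buf, never reading past
   buf[capacity - 1]. capacity must be the real size of the buffer. */
size_t length_in_buffer(const char *buf, size_t capacity)
{
    return strnlen(buf, capacity);
}

/* The stack buffer from the question: 10 chars and no terminator. */
size_t stack_example(void)
{
    char bufferOnStack[10] = {'a','b','c','d','e','f','g','h','i','j'};

    /* sizeof gives the true capacity (10), so the bound is safe. */
    return length_in_buffer(bufferOnStack, sizeof bufferOnStack);
}
```

With the bound equal to the capacity, stack_example() returns 10 because strnlen hit its limit, not because a stray zero byte happened to follow the array. For the malloc'd buffer, sizeof would only give the size of the pointer, so the allocation size (10) has to be passed explicitly.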


[Footnote]:

By definition, a string is null-terminated.

Quoting C11, chapter §7.1.1, Definitions of terms:

A string is a contiguous sequence of characters terminated by and including the first null character. [...]

What happens when there is "Undefined behaviour (UB)"?

“When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose”

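Your title is not actually UB, because calling strnlen("hi", 5) is perfectly legal, but the details of your question show that your case is indeed UB.

strlen and strnlen both expect a string, i.e., a nul-terminated sequence of char. Supplying a char array that is not nul-terminated is UB (for strnlen, whenever maxlen exceeds the size of the array).

In your case the function read the first 10 chars, found no '\0', and, since it had not yet reached maxlen, kept reading out of bounds, thereby invoking UB (reading unallocated memory). Maybe your compiler took the liberty of ending your array with a '\0', or maybe a '\0' was already there... the possibilities are limited only by the compiler designers.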
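To make the distinction concrete, here is a small sketch of calls that are well defined even though maxlen differs from the string's length, because the terminator (or the limit) is reached inside the array. The function names are invented for this example, and POSIX strnlen is assumed to be available.

```c
#define _POSIX_C_SOURCE 200809L  /* ask the headers to declare POSIX strnlen() */
#include <stddef.h>
#include <string.h>

/* "hi" occupies 3 bytes ('h', 'i', '\0'). strnlen stops at the '\0'
   on the third byte, so the larger maxlen of 5 is never reached. */
size_t short_string_large_limit(void)
{
    return strnlen("hi", 5);
}

/* Here maxlen is smaller than the string, so the limit wins and the
   scan stops after 3 bytes. */
size_t long_string_small_limit(void)
{
    return strnlen("hello", 3);
}
```

The first function returns 2 and the second returns 3; in neither case does the scan have to leave the array to produce its answer.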
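What probably happened can be imitated without undefined behaviour by placing the "neighbouring" bytes inside the same array. This is only a sketch: the 16-byte array memory and the function simulate_neighbour are made up for illustration, and real out-of-bounds memory is under no obligation to look like either case.

```c
#define _POSIX_C_SOURCE 200809L  /* ask the headers to declare POSIX strnlen() */
#include <stddef.h>
#include <string.h>

/* Bytes 0..9 play the role of the 10-char buffer; bytes 10..15 play
   the role of whatever happens to sit right after it in memory. */
size_t simulate_neighbour(int stray_zero)
{
    char memory[16];

    memset(memory, 'x', sizeof memory);  /* pretend non-zero garbage */
    memcpy(memory, "abcdefghij", 10);    /* the "buffer" contents */
    if (stray_zero)
        memory[10] = '\0';               /* a zero byte just past the "buffer" */

    /* maxlen is 16, which stays inside memory[], so this is well defined. */
    return strnlen(memory, sizeof memory);
}
```

simulate_neighbour(1) returns 10, matching the output in the question, while simulate_neighbour(0) returns 16 because the limit is reached first. In the real program the bytes after the buffer are not yours to read, so neither result can be relied on.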

Your question is roughly equivalent to:

I know that a burglar alarm is supposed to prevent your house from getting robbed. This morning when I left the house, I turned off the burglar alarm. Sometime during the day when I was away, a burglar broke in and stole my stuff. How did this happen?

Or to this:

I know you can use the cruise control on your car to help you avoid getting speeding tickets. Yesterday I was driving on a road where the speed limit was 65. I set the cruise control to 95. A cop pulled me over and I got a speeding ticket. How did this happen?

Actually, those aren't quite right. Here is a more contrived analogy:

I live in a house with a 10 yard long driveway to the street. I have trained my dog to fetch my newspaper. One day I made sure there were no newspapers on the driveway. I put my dog on a 39 yard leash, and I told him to fetch the newspaper. I expected him to go to the end of the leash, 39 yards away. But instead, he only went 10 yards, then stopped. How did this happen?

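Of course there are many possible answers. Perhaps, when your dog reached the end of the driveway where there were no newspapers, he immediately found someone else's newspaper in the gutter. Or perhaps, when the leash failed to stop him at the end of the driveway and he continued into the street, he got run over by a car.

The point of putting your dog on a leash is to restrict him to a safe area: in this case, your property, which you control. If you put him on a leash so long that he can run into the street or the woods, you are rather defeating the purpose of using a leash to control him.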


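Similarly, the whole point of strnlen is to behave gracefully if, within the buffer you have defined, there is no null character for strnlen to find.

The problem with non-null-terminated strings is that functions like strlen (which blindly search for the null terminator) sail off the end and rummage blindly around in undefined memory, desperately trying to find the terminator. For example, if you say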

char non_null_terminated_string[3] = "abc";
int len = strlen(non_null_terminated_string);

the behaviour is undefined, because strlen sails off the end. One way to fix this is to use strnlen:

char non_null_terminated_string[3] = "abc";
int len = strnlen(non_null_terminated_string, 3);

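But if you hand a larger number to strnlen, you defeat the whole purpose. You want to know what happens when strnlen sails off the end, but there is no way to answer that question.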