Deep understanding of strcat and strlen functions

Question

We know that strcat() receives a pointer to a destination array and a source string as parameters, and appends the source string to the destination. The destination array should be large enough to store the concatenated result. Recently I found out that, for small programs, strcat() can still appear to work as expected even when the destination array is not large enough to hold the second string. I searched Stack Overflow and found answers to this question. I would like to understand in more depth what actually happens at the hardware level when I run the code below.

#include<iostream>
#include<iomanip>
#include<cmath>
#include<cstring>

using namespace std;

int main(){
    char p[6] = "Hello";
    cout << "Length of p before = " << strlen(p) << endl;
    cout << "Size of p before = " << sizeof(p) << endl;
    char as[8] = "_World!";
    cout << "Length of as before = " << strlen(as) << endl;
    cout << "Size of as before = " << sizeof(as) << endl;
    cout << strcat(p,as) << endl;
    cout << "After concatenation:" << endl;
    cout << "Length of p after = " << strlen(p) << endl;
    cout << "Size of p after = " << sizeof(p) << endl; 
    cout << "Length of as after = " << strlen(as) << endl;
    cout << "Size of as after = " << sizeof(as) << endl;

    return 0;
}

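After running this code, the length of p[] is 12 and the size of p[] is 6. How can such a length physically be stored in an array of that size? I mean, the array has a limited number of bytes. Does this mean that strlen(p) only looks for the null terminator and keeps counting until it finds one, ignoring the array's actual allocated size? And that sizeof() does not really care whether the last element of the array (the one reserved for the null character) actually stores a null character?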

Answer 1

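The array p is allocated in the function's stack frame, so strcat "overflows" the buffer p and keeps writing into other areas of the stack. Typically it overwrites other local variables and parameters, the function return address, and so on (remember that on x86 platforms the stack usually grows "downwards", i.e. toward lower addresses). This is the well-known "buffer overflow" vulnerability.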
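To make the overwrite visible without actually invoking undefined behavior, here is a minimal sketch that models a stretch of stack memory as a single 16-byte array. The names FakeStack, make_fake_stack, overflow_into_neighbor and neighbor_of are made up for illustration, and the layout (a 6-byte "p" followed directly by a "neighbor" value) is an assumption; a real compiler is free to arrange locals differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical model: one 16-byte block stands in for a piece of the stack.
// Bytes 0..5 play the role of p[6]; the bytes after that play the role of
// whatever happens to live next to p (here a made-up "NEIGHBOR" string).
struct FakeStack {
    char bytes[16];
};

constexpr std::size_t kPSize = 6;  // size of the pretend p[6]

FakeStack make_fake_stack() {
    FakeStack s{};                                  // zero-fill all 16 bytes
    std::memcpy(s.bytes, "Hello", 6);               // pretend p = "Hello\0"
    std::memcpy(s.bytes + kPSize, "NEIGHBOR", 9);   // adjacent data + '\0'
    return s;
}

// Appends src starting at the pretend p. Every write stays inside bytes[16],
// so this is well-defined, unlike the real overflow in the question.
void overflow_into_neighbor(FakeStack& s, const char* src) {
    assert(std::strlen(s.bytes) + std::strlen(src) + 1 <= sizeof(s.bytes));
    std::strcat(s.bytes, src);
}

// Reads the adjacent region as a C string, the way unrelated code would.
std::string neighbor_of(const FakeStack& s) {
    return std::string(s.bytes + kPSize);
}
```

Calling overflow_into_neighbor(s, "_World!") on a fresh FakeStack leaves strlen(s.bytes) at 12 even though the pretend p is only 6 bytes wide, and the neighbor no longer reads "NEIGHBOR" because strcat wrote straight through it. In the real program the same thing happens to whatever the compiler placed next to p.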

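strlen has no way of knowing the actual size of the buffer; it simply looks for the 0 terminator. sizeof, on the other hand, is a compile-time operator (not a function) that yields the size of the array in bytes, regardless of what the array currently contains.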
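The difference can be sketched roughly as follows. my_strlen is an illustrative re-implementation, not the library's actual code (real implementations are heavily optimized), and array_bytes is a hypothetical helper that shows the size comes from the array's type rather than from its contents.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative only: walks memory one byte at a time until it sees '\0'.
// It receives just a pointer, so it cannot know where the array ends.
std::size_t my_strlen(const char* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// Hypothetical helper: N is deduced from the array type at compile time,
// which is the same information sizeof uses for a char array.
template <std::size_t N>
constexpr std::size_t array_bytes(const char (&)[N]) {
    return N;
}
```

For char buf[32] = "Hello";, my_strlen(buf) is 5 while sizeof(buf) and array_bytes(buf) are both 32. Changing the characters stored in buf can change the first number but never the second.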

Answer 2

You are writing outside the bounds of p, so the behavior of your program is undefined.

Although the behavior is completely undefined, a few outcomes are common:

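You overwrite some unrelated data. This could be other local variables, the function return address, and so on. Without examining the assembly the compiler generated for this particular program, it is impossible to say exactly what will be overwritten. This can lead to a serious security vulnerability, because it may allow an attacker to inject their own code into your program's memory space and overwrite a function's return address, causing the program to execute the injected code.
The program crashes. This happens if you write far enough past the end of the array to cross a memory page boundary. The program may then try to write to a virtual memory address that the OS has not mapped to physical memory for the application, which causes the OS to terminate your application (for example with SIGSEGV on Linux). This generally happens more often with dynamically allocated arrays than with function-local ones.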
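Since neither outcome is something you want, the usual approach is to check the available space before appending. The following is a minimal sketch, assuming the caller knows the destination's capacity; checked_append is a hypothetical helper, not a standard library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends src to dest only if the result, including
// the terminating '\0', fits in dest_size bytes. Assumes dest already holds
// a terminated string somewhere within its first dest_size bytes.
bool checked_append(char* dest, std::size_t dest_size, const char* src) {
    assert(dest != nullptr && src != nullptr);
    const std::size_t used = std::strlen(dest);
    const std::size_t extra = std::strlen(src);
    // Need used + extra + 1 <= dest_size, written so it cannot wrap around.
    if (used >= dest_size || extra >= dest_size - used) {
        return false;  // would overflow: leave dest untouched
    }
    std::memcpy(dest + used, src, extra + 1);  // copies the '\0' as well
    return true;
}
```

With char p[6] = "Hello";, checked_append(p, sizeof(p), "_World!") returns false and leaves p alone, whereas a 13-byte destination accepts the same call. In C++ code, using std::string instead of raw char arrays sidesteps the problem, because the string grows its own storage as needed.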
