Is repetitively creating buffers on the stack in a loop in C bad practice?

Question

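The title of this post is very similar to what I searched for on this topic. Every result I came across was about buffer overflows, and that is not what I'm after.

My function iterates over every filename in a dirent structure that I populated earlier. The filenames vary in length, from very short to very long.

Previously, my function created a single 2048-byte buffer and then entered the loop. On each iteration, the buffer was filled with the path of the target directory with the current filename appended to it. Using the new path in the buffer, I performed a few fairly small file operations. This repeated until the last filename in the structure was reached.

The issue, however, is that not every full path is 2048 bytes long. Some are not even a third of that size.

Revisiting this function, I moved the buffer creation inside the loop, so that each iteration creates a buffer of size n, where n is the length of the target directory plus the length of the current filename within the directory.

I am wondering whether or not this may be considered bad practice or anything otherwise. Am I better off creating the buffer beforehand & always having a set size for it, even if 2/3 of the buffer is unused sometimes? Or is it better to create the buffer only at the size I actually need?

I hope I have provided enough information. Thanks in advance!

Here is the function in question.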

int verifyFiles(DIR *dp, const char *pathroot){
    struct dirent *dir;
    struct stat pathstat;
    //char path[2048];
    int status = 0;

    while((dir = readdir(dp)) != NULL){
        if(!strncmp(dir->d_name, ".", 1))
            continue;

        size_t len = strlen(pathroot) + strlen(dir->d_name) + 2;
        char path[len];
        snprintf(path, sizeof(path), "%s/%s", pathroot, dir->d_name);
        
        // verify shebang is present on the first line of path's contents.
        if(!shebangPresent(path)){
            status = -1;
            break;
        }

        // verify path belongs to the user (a failed stat() also counts as a failure).
        if(stat(path, &pathstat) != 0 || pathstat.st_uid != getuid()){
            status = -1;
            break;
        }
    }

    return status;
}

Answer 1

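There is absolutely nothing wrong with having a fixed buffer like this. Don't worry about such small details. The function will allocate 2 kB of memory, do its work, and then release it. If that is a problem, then you have bigger problems than this piece of code.

I would only worry about this kind of thing in a recursive function. For example, if you have something like this: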

int foo(int n) 
{
    char buf[2048];
    int r = foo(n-1);
    // Do something with buf and return
}

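The code above will quickly eat up the stack for large n. But in your case I really wouldn't worry about it until you have some evidence, or at least a reasonable suspicion, that it is actually causing a problem.

If it were a much larger buffer, say on the order of 100 kB, then I would definitely use dynamic allocation. The default stack size is typically 1 MB on Windows and 8 MB on Linux. So it is not a matter of "not wasting memory" but of not blowing the stack.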
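
For the large-buffer case, a minimal sketch of what dynamic allocation could look like is shown here. The function name count_paths_that_fit and the size BIG_BUF_SIZE are made up for illustration and are not from the question. The point is only that the buffer is allocated once before the loop and freed once after it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size, chosen only to illustrate a buffer too large for the stack. */
#define BIG_BUF_SIZE (256u * 1024u)

/* Formats "root/name" for each name into one heap buffer that is reused on
 * every iteration. Returns the number of paths that fit, or -1 if the
 * allocation fails. */
int count_paths_that_fit(const char *root, const char *const *names, size_t count)
{
    assert(root != NULL && names != NULL);

    char *buf = malloc(BIG_BUF_SIZE);   /* one allocation, outside the loop */
    if (buf == NULL)
        return -1;

    int fitted = 0;
    for (size_t i = 0; i < count; i++) {
        int n = snprintf(buf, BIG_BUF_SIZE, "%s/%s", root, names[i]);
        if (n >= 0 && (size_t)n < BIG_BUF_SIZE)
            fitted++;                   /* the real work on buf would go here */
    }

    free(buf);                          /* one free, after the loop */
    return fitted;
}
```

Calling malloc and free inside the loop would also work, but it pays the allocator cost on every iteration for no benefit.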

Answer 2

Is repetitively creating buffers on the stack in a loop in C bad practice?
I am wondering whether or not this may be considered bad practice or anything otherwise.

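No, char path[len]; is not a problem.

However, the approach used here to determine the buffer size is weak.

The original code repeatedly calculates the string length of pathroot. A good compiler might analyze this and see that the repeated calls are not needed, but it is simple enough to make sure the calculation is done only once.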

size_t pathlen = strlen(pathroot); // add

while((dir = readdir(dp)) != NULL){
    if(!strncmp(dir->d_name, ".", 1))
        continue;

    // size_t len = strlen(pathroot) + strlen(dir->d_name) + 2;
    size_t len = pathlen + strlen(dir->d_name) + 2;

    char path[len];
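
To make the "+ 2" and the hoisted length explicit, here is a small sketch. The helper names joined_size and join_path are hypothetical and not part of the question's code. The caller computes strlen(root) once and passes it in on every iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed for "root/name": both lengths, plus 1 for '/' and 1 for '\0'. */
size_t joined_size(size_t rootlen, const char *name)
{
    return rootlen + strlen(name) + 2;
}

/* Writes "root/name" into out, which must hold joined_size(rootlen, name)
 * bytes. rootlen is a parameter so that the caller can compute strlen(root)
 * once, outside its loop. */
void join_path(char *out, const char *root, size_t rootlen, const char *name)
{
    assert(out != NULL && root != NULL && name != NULL);
    memcpy(out, root, rootlen);        /* copy the root without rescanning it */
    out[rootlen] = '/';
    strcpy(out + rootlen + 1, name);   /* copies the name and its terminator */
}
```

Inside the loop this would be used as char path[joined_size(pathlen, dir->d_name)]; followed by join_path(path, pathroot, pathlen, dir->d_name);.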

Am I better off creating the buffer beforehand & always having a set size for it, even if 2/3 of the buffer is unused sometimes?

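In this case, there is probably an environmental upper bound on path size, perhaps available as MAXPATH or MAXPATHLEN.

I would consider the following to avoid repeatedly copying the path, which could be long.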

char path[MAXPATH + 1];  // or malloc()
int len = snprintf(path, sizeof path, "%s/", pathroot);
if (len < 0 || len >= sizeof path) Handle_OutOfRoom();

while((dir = readdir(dp)) != NULL){
  int len2 = snprintf(path + len, sizeof path - len, "%s", dir->d_name);
  if (len2 < 0 || len2 >= sizeof path - len) Handle_OutOfRoom();
  ...
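
The fragment above can be fleshed out into something that compiles. This is only a sketch: PATH_CAP is an assumed stand-in for whatever limit your platform provides, and write_prefix and set_name are hypothetical helper names. The directory prefix is written once, and each iteration overwrites only the part after it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: a stand-in for the platform limit (MAXPATH, MAXPATHLEN, ...). */
#define PATH_CAP 4096

/* Writes "root/" into path once. Returns the prefix length, or -1 if it
 * does not fit in cap bytes. */
int write_prefix(char *path, size_t cap, const char *root)
{
    int len = snprintf(path, cap, "%s/", root);
    return (len < 0 || (size_t)len >= cap) ? -1 : len;
}

/* Overwrites only the part after the prefix. Returns 0 on success, or -1
 * if name does not fit in the remaining room. */
int set_name(char *path, size_t cap, int prefix_len, const char *name)
{
    assert(prefix_len >= 0 && (size_t)prefix_len < cap);
    size_t room = cap - (size_t)prefix_len;
    int n = snprintf(path + prefix_len, room, "%s", name);
    return (n < 0 || (size_t)n >= room) ? -1 : 0;
}
```

In the question's loop, write_prefix would be called once before while((dir = readdir(dp)) != NULL) and set_name once per entry, with a -1 result treated the same way as Handle_OutOfRoom() above.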

Tags: c, performance, stack, buffer, while-loop