在 C 中的堆栈 and/or 堆上实现用户定义的数组

Question

我在学习 C 语言编程时试图了解堆栈和堆之间的区别。
为此，我尝试实现二分查找。我想从用户那里获取二分查找的输入数据集。为此，我希望用户能够定义数据集的大小（在本例中为数组）。获得数组的大小后，我会初始化数组，然后要求用户用值填充它。

我对堆栈的（可能是错误的）理解是，如果在编译时知道数组的大小，则可以在堆栈上初始化数组。所以，我尝试了以下方法来实现堆栈上的用户输入（它 'works'）：

void getUserInput(int sizeArray, int inputArray[], int userIn[]){
    /* get input data set and the element to be searched for from user */

    printf("Enter data elements seperated by a comma, e.g. 1,2,3. Press Enter at the end. Do not enter more than %d numbers: \n", sizeArray);
    for (int i = 0; i < sizeArray; i++){
        scanf("%d, ", &inputArray[i]);
    }
    printf("\nEnter the number to be searched for: \n");
    scanf("%d", &userIn[1]);

    printf("\nFor iterative implementation, enter 0. For recursive implementation, enter 1 :\n");
    scanf("%d", &userIn[0]);
}

int main(int arg, char **argv){
   int sizeArray;
   int userIn[2];                                                  // userIn[1]: element to be searched for; userIn[0]: iterative or recursive implementations
   printf("Enter size of input array: \n");
   scanf("%d", &sizeArray);
        
   int inputArray[sizeArray];
        
   getUserInput(sizeArray, inputArray, userIn);

// more code ...

}

对于堆上的实现，我尝试使用动态内存分配（它也 'works'）：

int main(int arg, char **argv) {
    int sizeArray;
    int userIn[2];                                                  // userIn[1]: element to be searched for; userIn[0]: iterative or recursive implementations
    printf("Enter size of input array: \n");
    scanf("%d", &sizeArray);

    int *inputArray;

    inputArray = (int*) malloc(sizeArray * sizeof(int));
    if(inputArray == NULL) {
        printf("\n\nError! Memory not allocated, enter size of array again:\n");
        scanf("%d", &sizeArray);
        inputArray = (int*) malloc(sizeArray * sizeof(int));
    }
    getUserInput(sizeArray, inputArray, userIn);

// more code...
    free(inputArray)                                               // free memory allocated by malloc on the heap
}

现在，我想将这两种方法合并到一个文件中，所以我创建了一个开关来在堆栈和堆实现之间切换，如下所示：

int main(int arg, char **argv) {
    /* Obtain input */
    int stackHeap;                                                      // 0 = implementation on stack; 1 = implementation on heap
    printf("Implementation on stack or heap? Enter 0 for stack, 1 for heap: \n");
    scanf("%d", &stackHeap);

    int sizeArray;
    int userIn[2];                                                  // userIn[1]: element to be searched for; userIn[0]: iterative or recursive implementations
    printf("Enter size of input array: \n");
    scanf("%d", &sizeArray);
    int *inputArray;

    if (stackHeap == 0) {
        inputArray =  &inputArray[sizeArray];    
        printf("input array = %p\n", inputArray);
    } else {
        inputArray = (int*) malloc(sizeArray * sizeof(int));
        printf("input array = %p\n", inputArray);
        if(inputArray == NULL) {
            printf("\n\nError! Memory not allocated, enter size of array again:\n");
            scanf("%d", &sizeArray);
            inputArray = (int*) malloc(sizeArray * sizeof(int));
        }
    }

    getUserInput(sizeArray, inputArray, userIn);
// more code...
}

目前堆栈方法不起作用。我尝试在 if 语句中初始化 inputArray，而不是 inputArray = &inputArray[sizeArray]。然而，这是不允许的，因为它只在 if 语句的范围内有效。我想我对如何使用指针 *inputArray 来初始化堆栈上的数组感到困惑。

我一直在阅读 C 中的指针和数组，这就是为什么我认为实现它会很有趣。我非常感谢您的任何反馈（很高兴也很高兴我犯了任何基本错误 - 我对这个话题很陌生）。

非常感谢！

Answer 1

您的程序仅使用堆栈分配或仅使用堆分配都很好。堆栈分配的唯一潜在问题是，如果数组太大（即几 MB），它可能会溢出堆栈，而堆有一个 much 更高的限制。

组合程序的问题在于这没有意义：

inputArray =  &inputArray[sizeArray];

使用数组索引运算符实际上涉及取消引用指针，此时 inputArray 没有指向任何地方。

您需要选择其中之一，数组或指向动态分配数组开头的指针。

Answer 2

typedef enum
{
    STACK,
    HEAP,
}type;

size_t getSize(void)  // implement user input
{
    return 100;
}

type getType(void)   // implement user input
{
    return HEAP;
}

int main(void)
{
    size_t stackSize = 0;
    size_t size;
    type tp;
    int *inputArray;

    size = getSize();
    tp = getType();

    if(tp == STACK) stackSize = size;

    int array[stackSize];

    if(tp == STACK)
        inputArray = array;
    else
        inputArray = malloc(size * sizeof(*inputArray));

    /* ... */
}

Answer 3

TL;DR：除非您需要它们的特殊语义，否则不要使用 VLA。它们很棘手并且容易出错；使用堆通常更容易和更安全。

您运行遇到的基本问题是定义 VLA 是一个声明，因此无法有条件地创建一个在条件块。这几乎是设计使然 - 该块用于限制生命周期，因此可以自动释放它。

因此您的选择是无条件地创建一个（如果 stackHeap 为真则不使用它）或将您所有的 ...更多代码... 移动到条件块中。后者最容易通过将其全部放在一个函数中来实现（例如 processArray(int sizeArray, int inputArray[],... 很像您的 getUserInput 函数）。然后你只需要在 if 和 else 情况下调用函数两次。

Answer 4

I am trying to understand the difference between the stack and the heap and whilst learning programming in C.

那是你的第一个错误。堆栈和堆不是 C 语言的概念，C 语言不依赖于内存区域之间的任何此类区别。话虽如此，许多 C 实现实际上确实依赖于它们，因此了解它们可能是值得的。然而，就学习 C 语言而言，最好专注于相关的原生 C 概念。

在这个特定领域，这些将是对象的 生命周期 和 存储持续时间 的相关概念。对象的生命周期是程序执行的一部分，在此期间该对象存在并保留其最后存储的值，该值是其存储持续时间的函数，对于某些对象而言，还取决于它们的创建时间。存储持续时间由对象定义的位置（如果有）和应用于它的类型限定符定义。

在代码块内没有 static 限定符声明的对象具有“自动”存储持续时间，并且从进入块到块执行终止都有效。在基于堆栈的机器上，此类对象的存储通常分配在堆栈上。其他存储持续时间是“静态”、“已分配”和“线程”，并且在基于堆栈的机器上，具有这些持续时间的对象的存储通常在堆栈外部实现，有些人将其描述为在堆上。

My (potentially wrong) understanding of the stack is, that the array can be initialised on the stack if its size is known at compile time.

这有点不准确。

一方面，如果在编译时已知所需的大小，那么您当然可以为具有该大小的自动持续时间数组编写有效的声明。但是，这并不能保证程序真的可以在运行的时候得到想要的space。当声明具有非常大的存储大小的对象时，这是一个特殊的问题：当程序执行到必须分配此类对象的点时，面向堆栈的 C 实现可能没有足够的堆栈 space 可用。这就是该站点命名的“堆栈溢出”。

另一方面，所有符合标准的 C99 实现和大多数 C11 和 C17 实现都支持可变长度数组，其大小在运行时确定，并且可以是自动对象的类型。事实上，这就是您在第一个示例中为变量 inputArray 使用的内容。该代码有效。一些实现也可能提供其他机制来分配可变长度数据作为自动存储。例如，参见 alloca()。

For an implementation on the heap, I attempted to use dynamic memory allocation

这会为您提供一个具有“已分配”存储持续时间的对象。在您可以描述为维护堆栈/堆区别的实现中，此类对象的存储确实通常在堆上。

Now, I wanted to combine both approaches into one file, so I created a switch to switch between the stack and heap implementations

这是可行的，但不是您尝试的方式。宣布...

    int *inputArray;

...您需要先将该指针指向某物，然后才能访问其元素。在一种情况下，您通过使用 malloc 分配内存来做到这一点，但在另一种情况下，这 ...

        inputArray =  &inputArray[sizeArray];

... 根本没有按照您的意愿行事。与在表达式而不是声明中一样，inputArray[sizeArray] 不会导致分配任何内存。相反，它会尝试访问 inputArray 指向的对象中索引 sizeArray 处的 int。由于 inputArray 在这一点上还没有被分配一个值，这会产生未定义的行为。因此，确实，

Currently the stack approach doesn't work.

你也说得对

initialising the inputArray within the if statement [is] not allowed, since it is then only valid within the scope of the if statement.

所以当你说，

I think I am getting confused as how to use the pointer *inputArray to initialise the array on the stack.

，我倾向于同意。事实上，我认为你错过了一个关键点，那就是在任何一种情况下你都不能使用指针来初始化数组，也不能使用类似的东西。相反，您必须分配数组，并或多或少地分别分配指向它的指针。

您清楚地知道如何用 malloc 做到这一点，但我想您并没有从这些方面考虑。你这样做的方式是有效的，因为通过 malloc 分配的对象具有“分配的”存储持续时间，这意味着它的生命周期持续到它被释放为止。要对自动对象执行类似操作，您需要一个具有足够生命周期的对象来满足您的目的，在这种情况下，这意味着它必须直接在 main() 的主体内声明，而不是在任何嵌套块中声明。您可以这样实现：

    int sizeArray;
    int sizeAutomatic = 1;

    printf("Enter size of input array: \n");
    scanf("%d", &sizeArray);

    if (stackHeap == 0) {
        sizeAutomatic = sizeArray;
    }

    int automaticArray[sizeAutomatic];
    int *inputArray;

    if (stackHeap == 0) {
        inputArray = automaticArray;
        // or, equivalently: inputArray = &automaticArray[0];
    } else {
        inputArray = malloc(sizeArray * sizeof(int));
        if(inputArray == NULL) {
            /// handle allocation error ...
        }
    }

    printf("input array = %p\n", (void *)inputArray);

如果用户选择自动分配选项，inputArray 将指向 automaticArray 的第一个元素，有效地使两个别名成为彼此。在另一种情况下，inputArray 将指向一个分配的对象。 automaticArray 对象在后一种情况下仍然存在，但它的长度为 1 并且未被使用。这一 int 开销是您为允许用户决定使用哪种分配方法而付出的代价的一部分。

在 C 中的堆栈 and/or 堆上实现用户定义的数组

Implementation of user defined array on stack and/or heap in C

c

arrays

stack

pointers

heap-memory