尽管用于内置类型，为什么 memset 会导致问题？

Question

我是 C++ 的新手，正在 Codeforces 提交 this 问题，突然发现使用 memset() 导致 Wrong answer到其中一个测试用例。

这里是测试用例：

Input:
4 4
3 3 3 5
Participant's output
NO
Jury's answer
YES
1 2 3 4 
Checker comment
wrong answer Jury has the answer but the participant hasn't

代码如下：

#include<iostream>
using namespace std;


int check_if_painted[5010][5010];
int input_array[5010];
int main(){

    int n,k;
    cin>>n>>k;

    int occurence_count[n];//Keeps track of the total no. of occurences of an element in the input_array.
    memset(occurence_count,0,sizeof(occurence_count));
    /*
    The following loop checks if the occurrence of a particular 
    element is not more than k. If the occurence>k the "NO" is printed and program ends.
    */
    for (int i = 0; i < n; ++i)
    {
        cin>>input_array[i];
        ++occurence_count[input_array[i]];
        if(occurence_count[input_array[i]]>k){
            cout<<"NO";
            return 0;
        }
    }
    cout<<"YES\n";


    /*
    The following loop uses the array check_if_painted as a counter to check if the particular 
    occurrence of an element has been painted with a colour from 1 to k or not. 
    If some previous occurrence of this particular element has been painted with f%k+1, 
    then f is incremented until we encounter any value(of `f%k+1`) from 1 to k that hasn't been 
    used yet to colour and then we colour this element with that value by printing it.
    */
    int f=0;//
    /*
    f is a global value which increments to a very large value but f%k+1 is used 
    to restrict it within the 1 to k limit(both inclusive). So, we are able to check 
    if any previous occurrence of the current element has already been coloured with the value f%k+1 or not.  
    which essentially is 1 to k.
    */ 
    for(int i=0;i<n;++i){
        while(check_if_painted[input_array[i]][f%k+1]>0){
            ++f;
        }
        cout<<f%k+1<<" ";
        ++check_if_painted[input_array[i]][f%k+1];
        ++f;
    }
    return 0;
}

但是，当我尝试下面的代码时，成功接受了。

#include<iostream>
using namespace std;


int check_if_painted[5010][5010];
int input_array[5010];
int occurence_count[5010];
int main(){

    int n,k;
    cin>>n>>k;




    for (int i = 0; i < n; ++i)
    {
        cin>>input_array[i];
        ++occurence_count[input_array[i]];
        if(occurence_count[input_array[i]]>k){
            cout<<"NO";
            return 0;
        }
    }
    cout<<"YES\n";



    int f=0;

    for(int i=0;i<n;++i){
        while(check_if_painted[input_array[i]][f%k+1]>0){
            ++f;
        }
        cout<<f%k+1<<" ";
        ++check_if_painted[input_array[i]][f%k+1];
        ++f;
    }
    return 0;
}

从 this post 上，我发现 memset 在内置类型上运行良好。那么，当它用于默认类型的 int 数组时，为什么会导致我的问题。

此外，我还读到 std::fill() 是更好的选择，并在 this post 中读到 memset 是一个危险的函数，那为什么它仍然存在在使用中？

Answer 1

这与memset没有任何关系。您的代码超出了数组的边界，简单明了。

在您的输入案例中，您有 n = 4 和 k = 4，因此 occurrence_count 的长度为 4 个元素（其有效索引从 0 到 3（含））。然后，你

    cin>>input_array[i];
    ++occurence_count[input_array[i]];

鉴于最后一个值为 4，您最终将执行 ++occurence_count[4]，这超出了数组的边界。这是未定义的行为，在您的情况下表现为递增不属于该数组的内存，该数组很可能不会从 0 开始并弄乱以后的检查。

在您的第二个代码片段中没有发现问题，因为您使 occurence_count 5010 个元素变大并且默认零初始化，因为它是一个全局变量。

现在，如果您要计算数组值的出现次数，当然将出现次数数组的大小设置为与元素数量一样大是错误的 - 那就是您要读取的数字计数（并且它会尺寸 input_array) 没问题， 而不是 您可以读取的最大值。假设数组元素 values 是从 1 到 5000，则 occurrences 数组的大小必须为 5001（保持值不变）或 5000（将您读取的值减 1 到索引这样的数组）。

（一般来说，要小心，因为问题文本中的所有索引都是从 1 开始的，而 C++ 中的索引是从 0 开始的；如果您对问题索引进行推理，然后将它们用作C 索引，除非您将数组的大小增加 1 并忽略第 0 个元素）。

最后，补充几点：

如果您在启用足够多的警告或使用足够新的编译器的情况下进行编译，它会正确地抱怨 memset 未定义或它被隐式定义（顺便说一句，原型不正确）；你应该 #include <string.h> 使用 memset;
as @Nicol Bolas 在他的回答中用了很长的篇幅来解释，当用大小仅在运行时已知 (int occurence_count[n])。

VLA 不是标准的 C++，所以它们没有很好地指定，一些编译器不支持它们，并且一般来说，在很多方面都有问题（大多数情况下，你不应该真正分配一个未知的栈上的数据量，一般较小);

您可能应该避免使用它们而选择 std::vector，或者考虑到问题为您提供了颜色和元素 (5000) 的上限，只是静态数组。

尽管用于内置类型，为什么 memset 会导致问题？

Why is memset causing problem despite being used on built-in types?

c++

arrays

memset

built-in-types