在合并排序中写入堆外

Writing outside heap in mergesort

我一直在用 C 实现合并排序的普通版本。在排序部分本身,这是一个单独的函数,我通过 malloc 创建了一个临时数组,如下所示:

int *tmp_arr = malloc(sizeof *tmp_arr * (end - strt));

出于某种原因,我在比较后尝试用元素填充临时数组时,在整个临时数组中进行了无效的读取和写入。 int tmp_arr[end - strt]; 工作正常。

为什么我在堆外写入(如果我当前的理解不正确会发生什么)?

下面是使用 malloc 的函数,从 valgrind 输出中提取以及我在执行时遇到的确切错误。我没有忘记 free 东西。

我对初始数组进行了硬编码,因为我想编写这些东西并在之后打扰输入。

这个特定的数组 { 8, 1, 10, 54, 2, 0, -3, 5, 70, 60, 11, 4 } 具有我的程序可以容纳的最大长度 运行,如果更长 - 程序崩溃并出现 post 末尾描述的错误。 Valgrind 的输出原则上总是一样的,数组越长,我得到的读写错误就越多。

节目:

#include <stdio.h>
#include <stdlib.h>

void merge_sort(int *arr, int strt, int end);
void merge(int *arr, int strt, int mid, int end);

void print_arr(int *arr, int arr_len);

int main(void) {
  int arr[] = { 8, 1, 10, 54, 2, 0, -3, 5, 70, 60, 11, 4 };
  int arr_len = sizeof(arr) / sizeof(arr[0]); 
  print_arr(arr, arr_len);
  merge_sort(arr, 0, arr_len - 1);
  print_arr(arr, arr_len);
}

void print_arr(int *arr, int arr_len) {
  for (int i = 0; i < arr_len; i++) {
    printf("%i ", arr[i]);
  }
  printf("\n");
}

void merge_sort(int *arr, int strt, int end) {
  if (strt < end) {
    int mid = (end + strt) / 2;
    merge_sort(arr, strt, mid);
    merge_sort(arr, mid + 1, end);
    merge(arr, strt, mid, end); 
  }
}

void merge(int *arr, int strt, int mid, int end) {
  int *tmp_arr = malloc(sizeof *tmp_arr * (end - strt));
  //int tmp_arr[end - strt];
  int i = strt;
  int j = 0;
  int k = mid + 1;
  while ((i <= mid) && (k <= end)) {
    if (arr[i] < arr[k]) {
      tmp_arr[j] = arr[i];
      i++;
      j++;   
    }
    else {
      tmp_arr[j] = arr[k];
      k++;
      j++; 
    }
  }
  while(i <= mid) {
      tmp_arr[j] = arr[i];
      j++;
      i++;
  }
  while (k <= end) {
      tmp_arr[j] = arr[k];
      j++;
      k++;
  }
  for (int m = 0; m < j; m++) {
    arr[strt + m] = tmp_arr[m];
  }
  free(tmp_arr);
}

使用上面硬编码的数组从 valgrind 输出中提取:

 valgrind ./main==355== Memcheck, a memory error detector
==355== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.==355== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright inf
o
==355== Command: ./main==355== 
8 1 10 54 2 0 -3 5 70 60 11 4 ==355== Invalid write of size 4
==355==    at 0x108A40: merge (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x108917: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x108823: main (in /home/runner/MiniatureEarnestSecurity/ma
in)
==355==  Address 0x522d484 is 0 bytes after a block of size 4 alloc'd==355==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-
amd64-linux.so)==355==    by 0x108943: merge (in /home/runner/MiniatureEarnestSecurity/m
ain)
==355==    by 0x108917: merge_sort (in /home/runner/MiniatureEarnestSecur
ity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecur
ity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecur
ity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecur
ity/main)
==355==    by 0x108823: main (in /home/runner/MiniatureEarnestSecurity/ma
in)
==355== 
==355== Invalid read of size 4
==355==    at 0x108AC8: merge (in /home/runner/MiniatureEarnestSecurity/m
ain)
==355==    by 0x108917: merge_sort (in /home/runner/MiniatureEarnestSecur
ity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x108823: main (in /home/runner/MiniatureEarnestSecurity/main)
==355==  Address 0x522d484 is 0 bytes after a block of size 4 alloc'd
==355==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==355==    by 0x108943: merge (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x108917: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x1088EB: merge_sort (in /home/runner/MiniatureEarnestSecurity/main)
==355==    by 0x108823: main (in /home/runner/MiniatureEarnestSecurity/main)
==355== 
            
            //repeating till the end
            
        -3 0 1 2 4 5 8 10 11 54 60 70 
        ==355== 
        ==355== HEAP SUMMARY:
        ==355==     in use at exit: 0 bytes in 0 blocks
        ==355==   total heap usage: 12 allocs, 12 frees, 1,156 bytes allocated
        ==355== 
        ==355== All heap blocks were freed -- no leaks are possible
        ==355== 
        ==355== For counts of detected and suppressed errors, rerun with: -v
        ==355== ERROR SUMMARY: 22 errors from 16 contexts (suppressed: 0 from 0)

我在执行时遇到的错误:

main: malloc.c:2401: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inus e (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed. exited, aborted

问题是要排序的切片的长度不是 end - strt,而是 end - strt + 1,因为您正在使用容易出错的约定来包含结束元素。

分配应该是int *tmp_arr = malloc(sizeof *tmp_arr * (end - strt + 1));

代码似乎可以使用本地声明的自动存储数组的原因纯属偶然。将数据存储到 tmp_arr[end - strt] 时您仍然有未定义的行为,但碰巧它似乎没有任何不利的副作用,而当您对分配的对象这样做时,您可能会破坏 malloc() 跟踪分配的块,分配例程或 valgrind 执行的一致性检查会检测到。

包含切片中最后一个元素的索引非常容易出错。我希望在编程 类 中不再为任何具有基于 0 的数组索引的语言教授这种废话。它需要在代码中进行许多令人困惑的 +1/-1 调整,而在最后一个元素之后使用元素索引可以简化代码。

这是修改后的版本:

#include <stdio.h>
#include <stdlib.h>

void merge_sort(int *arr, int strt, int end);
void merge(int *arr, int strt, int mid, int end);
void print_arr(int *arr, int arr_len);

int main(void) {
  int arr[] = { 8, 1, 10, 54, 2, 0, -3, 5, 70, 60, 11, 4 };
  int arr_len = sizeof(arr) / sizeof(arr[0]); 
  print_arr(arr, arr_len);
  merge_sort(arr, 0, arr_len);
  print_arr(arr, arr_len);
}

void print_arr(int *arr, int arr_len) {
  for (int i = 0; i < arr_len; i++) {
    printf("%i ", arr[i]);
  }
  printf("\n");
}

void merge_sort(int *arr, int strt, int end) {
  if (end - strt > 1) {
    // avoid potential arithmetic overflow on start+end
    int mid = strt + (end - strt) / 2;
    merge_sort(arr, strt, mid);
    merge_sort(arr, mid, end);
    merge(arr, strt, mid, end); 
  }
}

void merge(int *arr, int strt, int mid, int end) {
  int *tmp_arr = malloc(sizeof(*tmp_arr) * (end - strt));
  //int tmp_arr[end - strt];
  int i = strt;
  int j = 0;
  int k = mid;
  while (i < mid && k < end) {
    if (arr[i] <= arr[k]) {
      tmp_arr[j++] = arr[i++];
    } else {
      tmp_arr[j++] = arr[k++];
    }
  }
  while (i < mid) {
      tmp_arr[j++] = arr[i++];
  }
  // copying the remaining elements from the right half is not needed as they
  // are already in the proper place
  //while (k < end) {
  //    tmp_arr[j++] = arr[k++];
  //}
  for (int m = 0; m < j; m++) {
    arr[strt + m] = tmp_arr[m];
  }
  free(tmp_arr);
}