malloc vs mmap performance

Question

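I ran a performance test that writes 128 million ints to memory allocated with malloc and to a memory-mapped file (backed by a file on disk) created with mmap. I had expected the results to be somewhat similar, since my understanding is that when writing to a mapped memory file, the data is initially written to memory and pdflush writes it to disk in the background (at a frequency that can be configured). With malloc, writing the 128M ints took 0.55 secs; with mmap it took 1.9 secs.

So my question is: why the difference? My initial thought was that pdflush is crowding the bus, or that pdflush blocks the writes while it is accessing the memory. However, running the mmap version a second time produced a result of .52 secs (due to caching), which leads me to believe that each page behind mmap is not allocated until it is written to (despite being reserved by the call to mmap). It is also my understanding that the memory returned by malloc is not actually allocated until the first write. Could the initial difference be because, with malloc, the entire chunk is allocated after the initial write to memory, while with mmap the OS must allocate each new page when it is first written to?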

UPDATE:

OS: CentOS Linux release 7.0.1406 (Core); kernel: 3.10.0-123.el7.x86_64; gcc: 4.8.2

CODE:

int* pint = malloc(128000000 * sizeof(int));
int* pint_copy = pint;

clock_t start = clock();

int i;
for(i = 0; i < 128000000; ++i)
{
    *pint++ = i;
}   

clock_t end = clock();

double cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
printf("%f\n", cpu_time_used);

free(pint_copy);

vs

int fd = open("db", O_RDWR | O_CREAT, 0666);
const size_t region_size = ((512000000 / sysconf(_SC_PAGE_SIZE)) + 1) * sysconf(_SC_PAGE_SIZE); 

int return_code = ftruncate(fd, region_size);

if (return_code < 0)
    printf("mapped memory file could not be truncated: %d\n", return_code); /* return_code is a signed int, so print it with %d */

int* pint = mmap(NULL, region_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
int* pint_copy = pint;
close(fd);  

clock_t start = clock();

int i;
for(i = 0; i < 128000000; ++i)
{
    *pint++ = i;
}   

clock_t end = clock();

double cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
printf("%f\n", cpu_time_used);

fgetc(stdin);

munmap(pint_copy, region_size);

Adding:

int z = 512;
while(z < 128000000)
{
    pint[z] = 0;

    z += 1024;
}

before:

  clock_t start = clock();

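produced .37 secs for both trials, which leads me to believe that "touching" each page causes the OS to allocate the physical memory (for both mmap and malloc). It could also be partly because "touching" the pages moved some of the memory into cache. Does anyone know whether, during heavy writes to memory (over an extended period of time), pdflush would block or slow down the memory writes?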

Answer 1

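Yes, you are right. Pages obtained with mmap are not populated until you try to access them. This behavior is not guaranteed, but operating systems usually use write-back (there is no penalty for this, only gain) and demand paging (you pay the cost on the first access).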

Answer 2

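I don't know the answer, but it seems to me like you are comparing apples and oranges.

That is, in one case you are writing to (malloc'd) memory, while in the other you are writing to memory and to (mmap'd) disk. I would expect the second to cause device I/O activity, which is an order of magnitude slower than the first, which causes no I/O.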

