Reading a PGM file using structures

Question

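I'm working on a project that involves writing functions to read/write and encode/decode PGM files. I'm using a structure together with a function that reads in the PGM file. I'm quite new to structures and their syntax, so I'm just wondering whether this code would correctly read the scanned data into my structure.

Here is my code (C):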

#include <stdio.h>
#include "image.h"

int **allocatePGM(int numCols, int numRows){
        int ** = malloc(sizeof(int *) * numRows);
        for (int i=0; i<numRows; i++)
            pixels[i] = malloc(sizeof(int) * numCols);
        return pixels;

}

ImagePGM *readPGM(char *filename, ImagePGM *pImagePGM){
    FILE *inFile = NULL
    char PGMcheck[5];
    int max_value = 0;
    unsigned int width = 0, height = 0;
    unsigned int i = 0;
    int pixeldata = 0;




    inFile = fopen(filename, "r");
    if (inFile == NULL)
    printf("File could not be opened\n");
    exit(1);

fgets(PGMcheck, sizeof(PGMcheck), inFile);
if (strcmp(version, "P5")) {
    fprintf(stderr, "Wrong file type!\n");
    exit(1);
}
    printf("This file does not contain the PGM indicator \"P2\"");
    exit(1);
    }




    fscanf(inFile, "%d", &width);
    fscanf(inFile, "%d", &height);
    fscanf(inFile, "%d", max_value);

    struct ImagePGM.pImagePGM
    pImagePGM.magic = PGMcheck;
    pImagePGM.width = width;
    pImagePGM.height = height;
    pImagePGM.max_value = max_value;

    pImagePGM->pixels = allocatePGM(pImagePGM->width, pImagePGM->height);
    if (pImagePGM->max_value > 255) {
        for (i = 0; i < height; ++i) {
            for (j = 0; j < width; ++j) {
                pImagePGM->pixels[i][j];
            }
        }
    }
    return pImagePGM;

}

My header file contains the following structure...

typedef struct _imagePGM {
 char magic[3]; // magic identifier, "P2" for PGM
 int width; // number of columns
 int height; // number of rows
 int max_value; // maximum grayscale intensity
 int **pixels; // the actual grayscale pixel data, a 2D array
} ImagePGM;

Does this look okay to you?

Answer 1

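I don't know the PGM specification, but you have three common mistakes that may cause your code to misbehave when it is compiled on a platform different from yours:

Byte order. You must define it precisely for your data format. In your case, int is probably little-endian, which has to be taken into account when porting the code to a big-endian platform. See also https://en.wikipedia.org/wiki/Endianness
Structure packing. Depending on the platform, the compiler may pad the fields of a structure to speed up access. You may want to use a construct such as pragma pack for your structure; otherwise your code may again run into problems with other compilers (even on the same platform). See also http://www.catb.org/esr/structure-packing/#_structure_alignment_and_padding
Use fixed-width types. For example, use int64_t instead of long, and so on. See also https://en.wikipedia.org/wiki/C_data_types#Fixed-width_integer_types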
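
To illustrate the byte-order and fixed-width points together: one way to stay independent of the host's endianness is to read multi-byte values one byte at a time and combine them with shifts. The sketch below assumes a binary stream that stores 16-bit samples most significant byte first (which is how the raw "P5" PGM variant stores samples when the maximum gray value exceeds 255). The helper name read_be16 is hypothetical and not part of the question's code.

```c
#include <stdint.h>
#include <stdio.h>

/* Read one 16-bit sample stored most-significant byte first.
 * Returns 1 on success, 0 on EOF or read error. */
int read_be16 (FILE *fp, uint16_t *out)
{
    int hi = fgetc (fp);                /* high byte comes first */
    if (hi == EOF)
        return 0;

    int lo = fgetc (fp);                /* then the low byte */
    if (lo == EOF)
        return 0;

    /* combine with shifts, so the host byte order never matters */
    *out = (uint16_t)(((unsigned)hi << 8) | (unsigned)lo);
    return 1;
}
```

Because the value is assembled arithmetically rather than by copying bytes into an int, the same code should give the same result on little-endian and big-endian hosts.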

Answer 2

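Continuing from my earlier comment, you have a couple of problems related to handling the Plain PGM File Format that will prevent you from successfully reading the file.

First, fgets(PGMcheck, sizeof(PGMcheck), inFile); is not guaranteed to read PGMcheck correctly. The magic number may be followed by "(blanks, TABs, CRs, LFs)", so fgets will read more than just the magic number unless it is followed by a single '\n', which the format does not guarantee. While fgets() is normally the right way to do line-oriented input, the PGM format is not guaranteed to be formatted in lines, so you are left with using a formatted-input function or a character-by-character approach.

(You can use fgets(), but it would require parsing the resulting buffer and saving any part of the buffer beyond the magic number to use as the start of your next read.)

You have already corrected your attempt to use != instead of strcmp for the string comparison, but you must still compare the magic number against "P2" to read a Plain-PGM format file (as your question originally did). Continue to read the magic number into a string, but use a formatted-input function (fscanf) so that you read only up to the first whitespace encountered, regardless of what that whitespace is.

Lastly, there is no need to store the magic number as part of the plain_pgm struct. It is something you validate before attempting to fill the struct. It is either "P2" or it isn't; there is no need to store it.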
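
As a minimal sketch of that validation step: read one whitespace-delimited token with a width-limited %s and compare it with strcmp. The function name has_plain_pgm_magic and the 8-byte buffer size are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Read the first whitespace-delimited token and compare it to "P2".
 * Returns 1 if the stream starts with the plain-PGM magic number, else 0. */
int has_plain_pgm_magic (FILE *fp)
{
    char magic[8];                      /* room for 7 chars + '\0' */

    /* %7s stops at the first whitespace of any kind and cannot overflow */
    if (fscanf (fp, "%7s", magic) != 1)
        return 0;

    return strcmp (magic, "P2") == 0;
}
```

The token is only checked, never stored, and the stream is left positioned just after the magic number, ready for the width to be read.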

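For portability, it is a good idea to use exact-width types for storage when reading image files. There are a number of benefits, but the primary one is that your program will behave the same whether it runs on your x86_64 or on a TI-MSP432 chip. The exact-width types are defined in stdint.h, and the macros for printing and reading them are provided in inttypes.h. Instead of char you have int8_t, instead of unsigned char you have uint8_t, and so on, where the numeric value specifies the exact number of bits in the type.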
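
To show how the inttypes.h macros pair up with the exact-width types, here is a small sketch that parses a "width height maxval" string and prints it back. The helper names parse_header and print_header are hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Parse "width height maxval" from a string into exact-width variables.
 * Returns 1 if all three values were converted, 0 otherwise. */
int parse_header (const char *s, uint32_t *w, uint32_t *h, uint16_t *max)
{
    /* SCNu32 / SCNu16 expand to the correct conversion for each type */
    return sscanf (s, "%" SCNu32 " %" SCNu32 " %" SCNu16, w, h, max) == 3;
}

/* Print the parsed values using the matching PRI macros. */
void print_header (uint32_t w, uint32_t h, uint16_t max)
{
    printf ("%" PRIu32 "x%" PRIu32 ", max %" PRIu16 "\n", w, h, max);
}
```

Using the SCN macros for input and the PRI macros for output avoids guessing whether uint32_t happens to be unsigned int or unsigned long on a given platform.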

Your pgm struct would look like:

typedef struct {            /* struct for plain pgm image */
    uint32_t w, h;          /* use exact width types for portable code */
    uint16_t max;           /* 16-bit max */
    uint16_t **pixels;      /* pointer-to-pointer for pixel values */
} plain_pgm;

Your allocation is largely correct, but rearranging it to return a pointer-to-pointer to uint16_t (adequate for the maximum gray value of the pixel values), you could do:

uint16_t **alloc_pgm_pixels (uint32_t w, uint32_t h)
{
    uint16_t **pixels = NULL;

    /* allocate/validate height number of pointers */
    if (!(pixels = malloc (h * sizeof *pixels))) {
        perror ("malloc-pixels");
        return NULL;
    }
    /* allocate/validate width number of values per-pointer */
    for (uint32_t i = 0; i < h; i++)
        if (!(pixels[i] = malloc (w * sizeof *pixels[i]))) {
            perror ("malloc-pixels[i]");
            return NULL;
        }

    return pixels;  /* return allocated pointers & storage */
}
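
One detail worth noting: if a row allocation fails partway through, the function above returns NULL without releasing the rows already allocated. A possible variant, sketched here under the assumption that you also want a matching free routine, cleans up before returning. The names alloc_pgm_pixels_safe and free_pgm_pixels are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Free h row buffers and then the array of row pointers. */
void free_pgm_pixels (uint16_t **pixels, uint32_t h)
{
    if (!pixels)
        return;
    for (uint32_t i = 0; i < h; i++)
        free (pixels[i]);
    free (pixels);
}

/* Allocate h rows of w values each; on failure, release what was
 * already allocated so nothing leaks, and return NULL. */
uint16_t **alloc_pgm_pixels_safe (uint32_t w, uint32_t h)
{
    uint16_t **pixels = calloc (h, sizeof *pixels);  /* row pointers */

    if (!pixels)
        return NULL;

    for (uint32_t i = 0; i < h; i++) {
        pixels[i] = calloc (w, sizeof *pixels[i]);   /* zero-filled row */
        if (!pixels[i]) {
            free_pgm_pixels (pixels, i);             /* free rows 0..i-1 */
            return NULL;
        }
    }
    return pixels;
}
```

The caller keeps the returned pointer and later passes it, together with the same height, to free_pgm_pixels. Note that calloc may return NULL for a zero-sized request, so a width or height of 0 should be rejected before calling this.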

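Your read function needs a lot of help. To begin with, you generally want to open the file and validate that it is open for reading in the calling function, and pass the open FILE * pointer to the read function as a parameter rather than the filename. (If the file cannot be opened in the caller, there is no need to make the function call in the first place.) With that change, and passing a pointer to your struct, your read function could look like: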

int read_pgm (FILE *fp, plain_pgm *pgm)
{
    char buf[RDBUF];            /* buffer for magic number */
    uint32_t h = 0, w = 0;      /* height/width counters */

    if (fscanf (fp, "%31s", buf) != 1) {  /* read magic (max RDBUF-1 chars) */
        fputs ("error: invalid format - magic\n", stderr);
        return 0;
    }

    if (strcmp (buf, MAGIC_PLN) != 0) { /* validate magic number */
        fprintf (stderr, "error: invalid magic number '%s'.\n", buf);
        return 0;
    }

    /* read pgm width, height, max gray value */
    if (fscanf (fp, "%" SCNu32 " %" SCNu32 " %" SCNu16, 
                &pgm->w, &pgm->h, &pgm->max) != 3) {
        fputs ("error: invalid format, h, w, max or included comments.\n",
                stderr);
        return 0;
    }

    /* validate allocation of pointers and storage for pixel values */
    if (!(pgm->pixels = alloc_pgm_pixels (pgm->w, pgm->h)))
        return 0;

    for (;;) {  /* loop continually until image read */
        if (fscanf (fp, "%" SCNu16, &pgm->pixels[h][w]) != 1) {
            fputs ("error: stream error or short-read.\n", stderr);
            return 0;
        }
        if (++w == pgm->w)
            w = 0, h++;
        if (h == pgm->h)
            break;
    }

    return 1;
}

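(Note: this read function does not account for comment lines; implementing the skipping of comment lines is left to you. You can make additional calls to fscanf before and between reading each part of the magic number, width, height and maximum gray value with something like " #%*[^\n]" to skip any amount of whitespace and then discard from the next '#' character through the end of the line, or simply use fgetc in a loop to search for the next non-whitespace character and check whether it is '#'; if it is not, put it back with ungetc, and if it is, discard to the end of the line.)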
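
A minimal sketch of the fgetc/ungetc approach described in the note might look like the following. The function name skip_ws_and_comments is hypothetical, and the sketch assumes that a comment runs from '#' to the end of the line.

```c
#include <ctype.h>
#include <stdio.h>

/* Skip whitespace and any '#' comment lines; leave the stream
 * positioned at the first character of the next token. */
void skip_ws_and_comments (FILE *fp)
{
    int c;

    while ((c = fgetc (fp)) != EOF) {
        if (isspace (c))
            continue;                   /* ignore blanks, TABs, CRs, LFs */
        if (c == '#') {                 /* comment: discard to end of line */
            while ((c = fgetc (fp)) != EOF && c != '\n')
                ;
            continue;
        }
        ungetc (c, fp);                 /* not a comment: put it back */
        break;
    }
}
```

Calling it before each of the header reads (magic number, width, height, maximum gray value) should let the fscanf calls in read_pgm see only the tokens themselves.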

Putting it together in a short example, you could do:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>

#define RDBUF       32      /* if you need a constant, #define one (or more) */
#define MAGIC_PLN  "P2"

typedef struct {            /* struct for plain pgm image */
    uint32_t w, h;          /* use exact width types for portable code */
    uint16_t max;           /* 16-bit max */
    uint16_t **pixels;      /* pointer-to-pointer for pixel values */
} plain_pgm;

uint16_t **alloc_pgm_pixels (uint32_t w, uint32_t h)
{
    uint16_t **pixels = NULL;

    /* allocate/validate height number of pointers */
    if (!(pixels = malloc (h * sizeof *pixels))) {
        perror ("malloc-pixels");
        return NULL;
    }
    /* allocate/validate width number of values per-pointer */
    for (uint32_t i = 0; i < h; i++)
        if (!(pixels[i] = malloc (w * sizeof *pixels[i]))) {
            perror ("malloc-pixels[i]");
            return NULL;
        }

    return pixels;  /* return allocated pointers & storage */
}

int read_pgm (FILE *fp, plain_pgm *pgm)
{
    char buf[RDBUF];            /* buffer for magic number */
    uint32_t h = 0, w = 0;      /* height/width counters */

    if (fscanf (fp, "%31s", buf) != 1) {  /* read magic (max RDBUF-1 chars) */
        fputs ("error: invalid format - magic\n", stderr);
        return 0;
    }

    if (strcmp (buf, MAGIC_PLN) != 0) { /* validate magic number */
        fprintf (stderr, "error: invalid magic number '%s'.\n", buf);
        return 0;
    }

    /* read pgm width, height, max gray value */
    if (fscanf (fp, "%" SCNu32 " %" SCNu32 " %" SCNu16, 
                &pgm->w, &pgm->h, &pgm->max) != 3) {
        fputs ("error: invalid format, h, w, max or included comments.\n",
                stderr);
        return 0;
    }

    /* validate allocation of pointers and storage for pixel values */
    if (!(pgm->pixels = alloc_pgm_pixels (pgm->w, pgm->h)))
        return 0;

    for (;;) {  /* loop continually until image read */
        if (fscanf (fp, "%" SCNu16, &pgm->pixels[h][w]) != 1) {
            fputs ("error: stream error or short-read.\n", stderr);
            return 0;
        }
        if (++w == pgm->w)
            w = 0, h++;
        if (h == pgm->h)
            break;
    }

    return 1;
}

int main (int argc, char **argv) {

    plain_pgm pgm = { .w = 0 }; /* plain_pgm struct instance */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }

    if (!read_pgm (fp, &pgm)) { /* validate/allocate/read pgm file */
        fputs ("error: read_pgm failed.\n", stderr);
        return 1;
    }
    if (fp != stdin)            /* close file if not stdin */
        fclose (fp);

    /* output success */
    printf ("successful read of '%s'\n%" PRIu32 "x%" PRIu32 " pixel values.\n",
            argc > 1 ? argv[1] : "stdin", pgm.w, pgm.h);

    for (uint32_t i = 0; i < pgm.h; i++)    /* free pixel storage */
        free (pgm.pixels[i]);
    free (pgm.pixels);                      /* free pointers */
}

Example Use/Output

Using the sample file apollonian_gasket.ascii.pgm, a 600 wide by 600 high image of an Apollonian gasket, as a test file, you would get:

$ ./bin/read_pgm_plain dat/apollonian_gasket.ascii.pgm
successful read of 'dat/apollonian_gasket.ascii.pgm'
600x600 pixel values.

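Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address of the block of memory so that (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure that you do not attempt to access memory or write beyond/outside the bounds of your allocated block, that you do not read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through it.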

$ valgrind ./bin/read_pgm_plain dat/apollonian_gasket.ascii.pgm
==8086== Memcheck, a memory error detector
==8086== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8086== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==8086== Command: ./bin/read_pgm_plain dat/apollonian_gasket.ascii.pgm
==8086==
successful read of 'dat/apollonian_gasket.ascii.pgm'
600x600 pixel values.
==8086==
==8086== HEAP SUMMARY:
==8086==     in use at exit: 0 bytes in 0 blocks
==8086==   total heap usage: 604 allocs, 604 frees, 730,472 bytes allocated
==8086==
==8086== All heap blocks were freed -- no leaks are possible
==8086==
==8086== For counts of detected and suppressed errors, rerun with: -v
==8086== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

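Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Look over the changes made, and if you don't understand why something was done, leave a comment asking and I'm happy to help further.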
