C: strtok does not stop at delimiter and reads in excess items?

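I am trying to write a routine that converts a user input buffer read from stdin into an array of int. I plan to let the user use either a space or a comma as the delimiter. Here is my attempt: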

#include <stdio.h>
#include <stdlib.h> /* for malloc and realloc */
#include <string.h> /* for strtok */

#define BUFF 1024
#define INIT_CAP 3

int *getarr (char *buff, char *delim, int *len)
{
    char *token;
    size_t capacity, i;
    int elem, *arr;
    capacity = INIT_CAP;
    arr = (int *) malloc (sizeof (int) * capacity);
    if (arr == NULL) {
        perror ("malloc");
        *len = -1;
        return NULL;
    }
    printf ("array capacity has been set to %zu\n", capacity);
    i = 0;
    token = strtok (buff, delim); 
    while (token != NULL) {
        if (sscanf (token, "%d", &elem) == 0) {
            fprintf (stderr, "invalid input\n"); 
            *len = -1;
            return NULL;
        }
        printf ("read in element %d\n", elem);
        if (i == capacity) {
            capacity *= 2; 
            arr = (int *) realloc (arr, sizeof (int) * capacity);
            if (arr == NULL) { 
                perror ("realloc"); 
                break; /* return whatever has been converted */
            }
            printf ("array capacity has been doubled to %zu\n", capacity);
        }
        arr[i++] = elem;
        token = strtok (NULL, delim);
    }
    *len = i;
    return arr;
}

int main (void)
{ 
    char input[BUFF];
    int *arr, len;
    printf ("Array (space/comma delimited)? = " );
    if (fgets (input, BUFF, stdin) == NULL) {
        fprintf (stderr, "IO Error\n");
        exit (1);
    }
    arr = getarr (input, " ,", &len);
    if (arr == NULL) {
        fprintf (stderr, "failed to generate array from input buffer.\n"
                "Exiting process");
        exit (1);
    }
    for (int i = 0; i < len; i++) {
        printf ("%d ", arr[i]);
    }
    putchar ('\n');
    free (arr);
    return 0;
}

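It works fine when the user input has no trailing spaces, but the loop runs one extra time when trailing spaces are present. Here is some output to demonstrate what I mean:

Without trailing spaces

With trailing spaces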

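I can guess why this happens: the loop probably runs one extra time and the last value of elem is inserted into the array again. But I don't understand why the loop should run at all when there is no token after the last space. Here is an excerpt from the strtok man page: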

Each call to strtok() returns a pointer to a null-terminated string containing the next token. This string does not include the delimiting byte. If no more tokens are found, strtok() returns NULL.

Any ideas on how to fix this?

From the fgets manual:

If a newline is read, it is stored into the buffer.

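That is, the buffer ends with a trailing newline, so as far as strtok is concerned there is one more non-delimiter token after the trailing space. To fix this, add the newline to the delimiter list: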

arr = getarr (input, " ,\n", &len);
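To see the effect of the delimiter list in isolation, here is a minimal sketch. count_tokens is a hypothetical helper (it is not part of the code in the question); it simply counts how many tokens strtok produces for a given string and delimiter set, working on a copy because strtok modifies its input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: counts the tokens that
   strtok finds in text for the given delimiter set. strtok overwrites
   delimiters with '\0', so the input is copied into a local buffer. */
size_t count_tokens (const char *text, const char *delim)
{
    char copy[256];
    char *token;
    size_t n = 0;

    assert (text != NULL && delim != NULL);
    assert (strlen (text) < sizeof copy); /* sketch: short inputs only */
    strcpy (copy, text);

    for (token = strtok (copy, delim); token != NULL;
         token = strtok (NULL, delim)) {
        n++;
    }
    return n;
}
```

With the input "1 2 3 \n" (what fgets would store for a line with a trailing space), count_tokens gives 4 for the delimiters " ," because the lone "\n" counts as a token, and 3 for " ,\n". Without the trailing space, "1 2 3\n" gives 3 either way, since the newline is just part of the last token "3\n", which sscanf parses without complaint.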

So why doesn't sscanf flag an error when it tries to parse the newline as an int? It does, but your error check is wrong. From the sscanf manual:

These functions return the number of input items successfully matched and assigned, which can be fewer than provided for, or even zero in the event of an early matching failure.

The value EOF is returned if the end of input is reached before either the first successful conversion or a matching failure occurs.

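So sscanf can return EOF instead of 0. That is what happens here: the token is just "\n", so %d skips the whitespace and reaches the end of the string before any conversion takes place. Your error check should therefore be: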

if (sscanf (token, "%d", &elem) != 1)
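The difference between the two checks can be illustrated with a small sketch. scan_result and is_int_token are hypothetical helpers introduced only for this example: the first exposes the raw return value of sscanf, the second applies the != 1 style of check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns the raw sscanf result for one token.
   1   -> an int was converted
   0   -> matching failure (e.g. the token starts with a letter)
   EOF -> input ended before any conversion (e.g. "" or "\n") */
int scan_result (const char *token)
{
    int value;

    assert (token != NULL);
    return sscanf (token, "%d", &value);
}

/* Hypothetical helper: nonzero only when exactly one int was converted.
   Comparing against 1 rejects both the 0 and the EOF case. */
int is_int_token (const char *token)
{
    return scan_result (token) == 1;
}
```

For "42" scan_result gives 1, for "abc" it gives 0, and for "\n" it gives EOF. A check written as == 0 only catches the "abc" case, which is why the stray newline token slipped through and the stale value of elem was stored a second time.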