Does isspace() accept getchar() values?

isspace() works if the input is representable as unsigned char or is equal to EOF.

getchar() reads the next character from standard input.

When getchar() != EOF, are all getchar() return values representable as unsigned char?

uintmax_t count_space = 0;
for (int c; (c = getchar()) != EOF; )
  if (isspace(c))
    ++count_space;

Does this code invoke undefined behavior?

According to the C11 WG14 draft version N1570:

§7.21.7.6/2 The getchar function is equivalent to getc with the argument stdin.

§7.21.7.5/2 The getc function is equivalent to fgetc...

§7.21.7.1/2 [!=EOF case] ...the fgetc function obtains that character as an unsigned char converted to an int...

(The text in [...] is mine.)

  • isspace() accepts getchar() values
  • all getchar() != EOF values are representable as unsigned char
  • there is no undefined behavior here.
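To make the loop from the question easier to experiment with, here is a minimal sketch that wraps it in a function. The name count_space and the FILE * parameter are illustrative and not part of the original code; getc(in) stands in for getchar() so that the input can come from any stream. It assumes the usual case where UCHAR_MAX <= INT_MAX.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: counts whitespace characters read from `in`
 * until getc() returns EOF (end of file or read error). */
uintmax_t count_space(FILE *in)
{
    uintmax_t count = 0;
    int c; /* int, not char, so that EOF stays distinguishable */

    while ((c = getc(in)) != EOF) {
        /* c holds an unsigned char value converted to int,
         * which is exactly what isspace() requires. */
        if (isspace(c))
            ++count;
    }
    return count;
}
```

Calling count_space(stdin) should behave like the original loop.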

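If you think this is too obvious ("what else could it be"), think again. For example, in the related case, isspace(CHAR_MIN) may be undefined; that is, passing a char to a character classification function may be undefined behavior!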

If UCHAR_MAX > INT_MAX, the result may be implementation-defined:

§6.3.1.3/3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

The return value of getchar() has the same form as that of fgetc(). C11 7.21.7.1p2-3 defines the return value of fgetc():

  2. If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).

Returns

  3. If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF. [289]

Because this is an unsigned char converted to an int, the int will almost always have the same value as the unsigned char.

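On some platforms where sizeof(int) == 1, this might not hold for high values; however, these are mostly DSP platforms, so it is almost certain that character classification is not needed on them.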
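If code should refuse to build on such a platform, one option is a compile-time guard. The following is only a sketch under the assumption of a C11 compiler; the helper name survives_int_round_trip is made up for illustration.

```c
#include <limits.h>

/* Compile-time guard: the build fails on a platform where some
 * unsigned char value is not representable as int. */
_Static_assert(UCHAR_MAX <= INT_MAX,
               "unsigned char values must be representable as int");

/* Hypothetical helper: returns nonzero if `uc` converts to a
 * non-negative int and back to the same unsigned char value. */
int survives_int_round_trip(unsigned char uc)
{
    int c = uc; /* value-preserving when UCHAR_MAX <= INT_MAX */
    return c >= 0 && (unsigned char)c == uc;
}
```

On mainstream platforms with 8-bit char and a wider int, the assertion holds and the helper returns nonzero for every input.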


The is* functions are carefully defined so that they can be used directly with the return values of the *getc* functions. C11 7.4p1:

1 The header <ctype.h> declares several functions useful for classifying and mapping characters. [198] In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.

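That is, it is even legal to pass EOF to the is* functions. Of course, isanything(EOF) will always return 0, so to count consecutive whitespace characters, one could simply use something like: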

while (isspace(getchar())) space_count ++;

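However, signed char values are not OK: for example, if a negative value other than EOF is passed to any of the character classification functions, the behavior is undefined.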
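When the characters come from a char array rather than from getchar(), the usual remedy is to convert each one to unsigned char before classifying it. A minimal sketch, with count_space_in_string as a made-up name for illustration:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts whitespace characters in a
 * NUL-terminated string. */
size_t count_space_in_string(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; ++s) {
        /* Plain char may be signed, so *s can be negative.
         * The cast maps it into the range isspace() accepts. */
        if (isspace((unsigned char)*s))
            ++count;
    }
    return count;
}
```

Without the cast, a string containing bytes above 0x7F could pass negative values to isspace() on platforms where char is signed.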