Does isspace() accept getchar() values?
isspace() works if the input is representable as unsigned char or is equal to EOF.
getchar() reads the next character from standard input.
When getchar() != EOF, are all getchar() return values representable as unsigned char?
uintmax_t count_space = 0;
for (int c; (c = getchar()) != EOF; )
    if (isspace(c))
        ++count_space;
Can this code lead to undefined behavior?
According to the C11 WG14 draft version N1570:
§7.21.7.6/2 The getchar function is equivalent to getc with the argument stdin.
§7.21.7.5/2 The getc function is equivalent to fgetc ...
§7.21.7.1/2 [!=EOF case] ...the fgetc function obtains that character as an unsigned char converted to an int ...
(The text in [...] is mine.)
That is:
- isspace() accepts getchar() values
- all getchar() != EOF values are representable as unsigned char
- there is no undefined behavior here.
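If you think this is too obvious ("what else could it be"), think again. For example, in the related case, isspace(CHAR_MIN) may be undefined; that is, passing a char to a character classification function can be undefined behavior!
If UCHAR_MAX > INT_MAX, the result may be implementation-defined:
§6.3.1.3/3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.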
The return value of getchar() has the same form as that of fgetc(). C11 defines the return value of fgetc() in 7.21.7.1p2-3:
- If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
Returns
- If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF. [289]
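Because this is an unsigned char converted to an int, the int will almost always have the same value as the unsigned char.
On some platforms where sizeof(int) == 1, this might not hold for high values (those above INT_MAX); however, these are mostly DSP platforms, so it is almost certain that character classification is not needed on them.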
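As a rough sketch of that edge case, the snippet below guards against the exotic platforms at compile time and then checks the round trip at run time. It assumes a C11 compiler (for static_assert); the function name uchar_round_trips_through_int is made up for illustration and is not part of any library.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdbool.h>

/* Refuse to compile on implementations where some unsigned char value
   does not fit in an int (for example, sizeof(int) == 1). */
static_assert(UCHAR_MAX <= INT_MAX,
              "every unsigned char value must be representable as int");

/* Hypothetical helper: returns true if converting every unsigned char
   value to int and back yields the original value. */
static bool uchar_round_trips_through_int(void)
{
    unsigned char uc = 0;
    do {
        int as_int = uc;  /* the same conversion fgetc() performs */
        if (as_int < 0 || (unsigned char)as_int != uc)
            return false;
    } while (uc++ != UCHAR_MAX);  /* stops after testing UCHAR_MAX */
    return true;
}
```

On an implementation that passes the static_assert, the function is expected to return true, which is the "almost always" case described above.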
The is* functions are carefully defined so that they can be used directly with the return value of *getc*; C11 7.4p1:
1 The header <ctype.h> declares several functions useful for classifying and mapping characters. [198] In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
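That is, it is even legal to pass EOF to the is* functions. Of course, isanything(EOF) will always return 0, so to count consecutive whitespace characters one can simply use something like: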
while (isspace(getchar())) space_count ++;
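Note that the one-liner also consumes the first non-whitespace character. A slightly fuller sketch, assuming the caller wants that character left in the stream, could look like this; count_leading_space is a hypothetical name, and it takes a FILE * rather than reading stdin only so that it is easier to exercise.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: counts the run of whitespace at the current
   position of `in` and leaves the first non-whitespace character unread. */
static uintmax_t count_leading_space(FILE *in)
{
    assert(in != NULL);
    uintmax_t count = 0;
    int c;
    /* fgetc() returns an unsigned char value converted to int, or EOF;
       both are valid arguments for isspace(), and isspace(EOF) is 0. */
    while (isspace(c = fgetc(in)))
        ++count;
    if (c != EOF)
        ungetc(c, in);  /* put back the character that ended the run */
    return count;
}
```

Calling count_leading_space(stdin) would then behave like the loop above, except that the terminating character remains available to the next read.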
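However, signed char values are not OK: the behavior is undefined if, for example, a negative value other than EOF is passed to any of the character classification functions.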
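For characters that come from a char array rather than from getchar(), the usual remedy is to convert to unsigned char before the call. The following is a minimal sketch of that idiom; count_space_in_string is an illustrative name, not a standard function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the whitespace characters in a string. */
static size_t count_space_in_string(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        /* Plain char may be signed; converting to unsigned char first
           ensures a negative value is never passed to isspace(). */
        if (isspace((unsigned char)*s))
            ++count;
    }
    return count;
}
```

Without the cast, a byte such as '\xFF' could be passed as a negative int on implementations where plain char is signed, which is exactly the undefined case described above.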