What exactly does this format string do when paired with fscanf?

I was looking through some code and came across this line:

fscanf(file, "%*[: ]%16s", dest);

What does the %*[: ]%16s format string specifier do?

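This format string

"%*[: ]%16s"

means that the input must begin with one or more of the characters ':' and ' ' (the characters placed between the square brackets), which are read and discarded, and that at most 16 non-white-space characters are then read into the character array.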

The * in the format string is the assignment-suppression character.

Here is a demonstration program. For visibility, I use sscanf instead of fscanf.

#include <stdio.h>

int main( void ) 
{
    const char *stream = "::: : : : :::Hello";
    char s[17];
    
    sscanf( stream, "%*[: ]%16s", s );
    
    printf( "\"%s\"\n", s );

    return 0;
}

The program output is

"Hello"

It reads any space or : (colon) characters and discards them, then reads up to 16 non-whitespace characters into dest (17 including the null terminator '\0').

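The * after the % is the "assignment-suppression character". The number between % and s is the "maximum field width". Square brackets match the characters listed inside them or, with a leading caret, every character except those. The hyphen and the caret are treated specially.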

From the Linux scanf man page:

Each conversion specification in format begins with either the character '%' or the character sequence "%n$" (see below for the distinction) followed by:

· An optional '*' assignment-suppression character: scanf() reads input as directed by the conversion specification, but discards the input. No corresponding pointer argument is required, and this specification is not included in the count of successful assignments returned by scanf(). [snip]

· An optional decimal integer which specifies the maximum field width. Reading of characters stops either when this maximum is reached or when a nonmatching character is found, whichever happens first. Most conversions discard initial white space characters (the exceptions are noted below), and these discarded characters don't count toward the maximum field width. String input conversions store a terminating null byte ('\0') to mark the end of the input; the maximum field width does not include this terminator.

The following conversion specifiers are available:

[snip]

s Matches a sequence of non-white-space characters; the next pointer must be a pointer to the initial element of a character array that is long enough to hold the input sequence and the terminating null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.

[snip]

[ Matches a nonempty sequence of characters from the specified set of accepted characters; the next pointer must be a pointer to char, and there must be enough room for all the characters in the string, plus a terminating null byte. The usual skip of leading white space is suppressed. The string is to be made up of characters in (or not in) a particular set; the set is defined by the characters between the open bracket [ character and a close bracket ] character. The set excludes those characters if the first character after the open bracket is a circumflex (^). To include a close bracket in the set, make it the first character after the open bracket or the circumflex; any other position will end the set. The hyphen character - is also special; when placed between two other characters, it adds all intervening characters to the set. To include a hyphen, make it the last character before the final close bracket. For instance, [^]0-9-] means the set "everything except close bracket, zero through nine, and hyphen". The string ends with the appearance of a character not in the (or, with a circumflex, in) set or when the field width runs out.

– Linux scanf(3) man page

What does the "%*[: ]%16s" format string specifier do?

  1. "%*[: ]": Reads and discards (because of the "*") at least one input character that belongs to the scan set: ':', ' '. If no ':' or ' ' is found, scanning stops.

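  2. "%16s" has 3 steps: 1) Read and discard any (0 or more) leading white-space characters, e.g. ' ', '\n', '\t'. 2) Read and save into dest at least one but no more than 16 non-white-space characters; otherwise stop scanning. 3) Append a null character to dest. So dest should have room for at least 17 characters: char dest[16+1];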


Advanced

A curious difference between fscanf() and sscanf() is that when sscanf() reads a null character, scanning stops. With fscanf(), scanning continues.

With fscanf(file, "%s", dest) and file data of the 8 characters '\t', '1', '2', '3', '\0', 'x', 'y', 'z', dest[] would get '1', '2', '3', '\0', 'x', 'y', 'z', '\0'. It is uncommon to have a null character in a text file.