Don't ignore whitespaces when using wscanf for UTF-8

Question

I am trying to read wide characters from standard input into an array of wchar_t. However, the negated scanset specifier ([^characters]) for ls doesn't work as expected.

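The goal is to have every whitespace character read into str instead of being ignored. I therefore tried [^\n], but without luck: the program keeps printing garbled text to standard output.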

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <wchar.h>
#include <wctype.h>
#include <locale.h>

int main(void)
{
    wchar_t str[8];

    if (setlocale(LC_ALL, "en_US.UTF-8") == NULL)  {
        fprintf(stderr, "Failed to set locale LC_ALL = en_US.UTF-8.\n");
        exit(EXIT_FAILURE);
    }

    // correct (but not what I want)
    // whitespaces and EOLs are ignored
    // while (wscanf(L"%7ls", str) != EOF)  {
    //     wprintf(L"%ls", str);
    // }

    // incorrect
    // whitespaces (except EOLs) are properly read into str (what I want)
    // input: 不要忽略白空格 (for instance)
    // output: endless loop (garbled text)
    while (wscanf(L"%7[^\n]ls", str) != EOF)  {
        if (ferror(stdin) && errno == EILSEQ)  {
            fprintf(stderr, "Encountered an invalid wide character.\n");
            exit(EXIT_FAILURE);
        }
        wprintf(L"%ls", str);
    }
}

Answer 1

Don't ignore whitespaces ...
... trying to read wide characters into an array of wchar_t

To read a line of text (all characters and whitespace up to and including '\n') into a wide character string, use fgetws():

#define STR_SIZE 8
wchar_t str[STR_SIZE];

while (fgetws(str, STR_SIZE, stdin)) {
  // lop off the potential \n if desired
  size_t len = wcslen(str);
  if (len > 0 && str[len-1] == L'\n') {
    str[--len] = L'\0';
  }
  }
  ...
}
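
For anyone who wants something they can compile, here is a minimal sketch of the same fgetws() approach wrapped in two small functions. The names strip_newline and echo_lines are made up for illustration and are not part of the answer above. The buffer size of 8 is taken from the question. The sketch assumes the caller has already selected a UTF-8 locale with setlocale(), as the question's code does.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

#define STR_SIZE 8

/* Hypothetical helper: remove one trailing L'\n' from s, if present,
   and return the resulting length in wide characters. */
size_t strip_newline(wchar_t *s)
{
    assert(s != NULL);
    size_t len = wcslen(s);
    if (len > 0 && s[len - 1] == L'\n') {
        s[--len] = L'\0';
    }
    return len;
}

/* Hypothetical helper: copy every wide character from in to out,
   whitespace and newlines included. fgetws() stores at most
   STR_SIZE - 1 wide characters per call, so a long line simply
   arrives in several chunks. Returns 0 on success, -1 on a read,
   write, or encoding error. */
int echo_lines(FILE *in, FILE *out)
{
    wchar_t str[STR_SIZE];

    while (fgetws(str, STR_SIZE, in) != NULL) {
        if (fputws(str, out) == EOF) {
            return -1;
        }
    }
    return ferror(in) ? -1 : 0;
}
```

A main() that calls setlocale(LC_ALL, "en_US.UTF-8") and then echo_lines(stdin, stdout) should reproduce the input with its spaces intact. Call strip_newline on a chunk only if the trailing newline is not wanted. Because of the small buffer, only the last chunk of a line ends in L'\n'.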
