Using fgetws after setting a UTF-8 locale?

GCC 4.8, 5.1, and 6.2 and Clang 3.8.1 on Ubuntu 16.10, with -std=c11, -std=c++11, -std=c++14, and -std=c++17, all exhibit this strange behaviour when fgetws(buf, (int) bufsize, stdin) is used after setlocale(LC_ALL, "any_THING.utf8");.

Example program:

#include <locale.h>
#include <wchar.h>
#include <stdlib.h>
#include <stdio.h>

int main(const int argc, const char* const * const argv) {
  (void) argc;

  setlocale(LC_ALL, argv[1]);

  const size_t len = 3;

  wchar_t *buf = (wchar_t *) malloc(sizeof (wchar_t) * len),
         *stat = fgetws(buf, (int) len, stdin);

  wprintf(L"[%ls], [%ls]\n", stat, buf);

  free(buf);

  return EXIT_SUCCESS;
}

The cast on malloc is only there for C++ compatibility.

Compiled like this: cc -std=c11 fg.c -o fg.

Running it with argv[1] = "C" under Valgrind, with 10 bytes echoed to STDIN, we find...

$ python3 -c 'print("5" * 10)' | \
  valgrind --leak-check=full --track-origins=yes --show-leak-kinds=all ./f C
==1775== Memcheck, a memory error detector
==1775== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1775== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==1775== Command: ./f C
==1775== 
[55], [55]
==1775== 
==1775== HEAP SUMMARY:
==1775==     in use at exit: 0 bytes in 0 blocks
==1775==   total heap usage: 5 allocs, 5 frees, 25,612 bytes allocated
==1775== 
==1775== All heap blocks were freed -- no leaks are possible
==1775== 
==1775== For counts of detected and suppressed errors, rerun with: -v
==1775== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

The program runs perfectly, with no memory errors.

If it is run with a UTF-8 locale as argv[1], however, we get the correct output, but also a memory error and a fatal segmentation fault at address 0x18.

$ python3 -c 'print("5" * 10)' | \
  valgrind --leak-check=full --track-origins=yes --show-leak-kinds=all ./f en_US.utf8
==1934== Memcheck, a memory error detector
==1934== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1934== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
==1934== Command: ./f en_US.utf8
==1934== 
[55], [55]
==1934== Invalid read of size 8
==1934==    at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
==1934==    by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
==1934==    by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
==1934==    by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
==1934==    by 0x4EB79B5: _IO_cleanup (genops.c:966)
==1934==    by 0x4E73282: __run_exit_handlers (exit.c:96)
==1934==    by 0x4E73339: exit (exit.c:105)
==1934==    by 0x4E593F7: (below main) (libc-start.c:325)
==1934==  Address 0x18 is not stack'd, malloc'd or (recently) free'd
==1934== 
==1934== 
==1934== Process terminating with default action of signal 11 (SIGSEGV)
==1934==  Access not within mapped region at address 0x18
==1934==    at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
==1934==    by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
==1934==    by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
==1934==    by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
==1934==    by 0x4EB79B5: _IO_cleanup (genops.c:966)
==1934==    by 0x4E73282: __run_exit_handlers (exit.c:96)
==1934==    by 0x4E73339: exit (exit.c:105)
==1934==    by 0x4E593F7: (below main) (libc-start.c:325)
==1934==  If you believe this happened as a result of a stack
==1934==  overflow in your program's main thread (unlikely but
==1934==  possible), you can try to increase the size of the
==1934==  main thread stack using the --main-stacksize= flag.
==1934==  The main thread stack size used in this run was 8388608.
==1934== 
==1934== Process terminating with default action of signal 11 (SIGSEGV)
==1934==  Access not within mapped region at address 0x18
==1934==    at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
==1934==    by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
==1934==    by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
==1934==    by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
==1934==    by 0x4EB79B5: _IO_cleanup (genops.c:966)
==1934==    by 0x4FAA93B: __libc_freeres (in /lib/x86_64-linux-gnu/libc-2.24.so)
==1934==    by 0x4A276EC: _vgnU_freeres (vg_preloaded.c:77)
==1934==    by 0x1101: ???
==1934==    by 0x3805234F: ??? (mc_malloc_wrappers.c:483)
==1934==    by 0x51FA8BF: ??? (in /lib/x86_64-linux-gnu/libc-2.24.so)
==1934==  If you believe this happened as a result of a stack
==1934==  overflow in your program's main thread (unlikely but
==1934==  possible), you can try to increase the size of the
==1934==  main thread stack using the --main-stacksize= flag.
==1934==  The main thread stack size used in this run was 8388608.
==1934== 
==1934== HEAP SUMMARY:
==1934==     in use at exit: 35,007 bytes in 149 blocks
==1934==   total heap usage: 233 allocs, 84 frees, 46,936 bytes allocated
==1934== 
==1934== 11 bytes in 1 blocks are still reachable in loss record 1 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6396B: new_composite_name (setlocale.c:167)
==1934==    by 0x4E63F91: setlocale (setlocale.c:378)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 32 bytes in 1 blocks are still reachable in loss record 2 of 24
==1934==    at 0x4C2EB55: calloc (vg_replace_malloc.c:711)
==1934==    by 0x4EF288B: __wcsmbs_load_conv (wcsmbsload.c:168)
==1934==    by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
==1934==    by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
==1934==    by 0x4EAFC58: _IO_fwide (iofwide.c:124)
==1934==    by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 42 bytes in 1 blocks are still reachable in loss record 3 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 50 bytes in 1 blocks are still reachable in loss record 4 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 56 bytes in 1 blocks are still reachable in loss record 5 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 92 bytes in 2 blocks are still reachable in loss record 6 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 104 bytes in 1 blocks are still reachable in loss record 7 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 132 bytes in 12 blocks are still reachable in loss record 8 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4EC5C49: strndup (strndup.c:43)
==1934==    by 0x4E64AB4: _nl_find_locale (findlocale.c:315)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 132 bytes in 12 blocks are still reachable in loss record 9 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4EC5BF9: strdup (strdup.c:42)
==1934==    by 0x4E63BCE: setlocale (setlocale.c:369)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 144 bytes in 2 blocks are still reachable in loss record 10 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 208 bytes in 1 blocks are still reachable in loss record 11 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E631C9: __gconv_lookup_cache (gconv_cache.c:372)
==1934==    by 0x4E5B34B: __gconv_find_transform (gconv_db.c:752)
==1934==    by 0x4EF296A: __wcsmbs_getfct (wcsmbsload.c:91)
==1934==    by 0x4EF296A: __wcsmbs_load_conv (wcsmbsload.c:186)
==1934==    by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
==1934==    by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
==1934==    by 0x4EAFC58: _IO_fwide (iofwide.c:124)
==1934==    by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 208 bytes in 1 blocks are still reachable in loss record 12 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E630EB: __gconv_lookup_cache (gconv_cache.c:372)
==1934==    by 0x4E5B34B: __gconv_find_transform (gconv_db.c:752)
==1934==    by 0x4EF2A0D: __wcsmbs_getfct (wcsmbsload.c:91)
==1934==    by 0x4EF2A0D: __wcsmbs_load_conv (wcsmbsload.c:189)
==1934==    by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
==1934==    by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
==1934==    by 0x4EAFC58: _IO_fwide (iofwide.c:124)
==1934==    by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 365 bytes in 12 blocks are still reachable in loss record 13 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 461 bytes in 12 blocks are still reachable in loss record 14 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 672 bytes in 12 blocks are still reachable in loss record 15 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 826 bytes in 24 blocks are still reachable in loss record 16 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 1,024 bytes in 1 blocks are still reachable in loss record 17 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4EA7381: _IO_file_doallocate (filedoalloc.c:101)
==1934==    by 0x4EA890C: _IO_wfile_doallocate (wfiledoalloc.c:70)
==1934==    by 0x4EAD159: _IO_wdoallocbuf (wgenops.c:390)
==1934==    by 0x4EAF39C: _IO_wfile_overflow (wfileops.c:441)
==1934==    by 0x4EACA12: __woverflow (wgenops.c:226)
==1934==    by 0x4EACA12: _IO_wdefault_xsputn (wgenops.c:331)
==1934==    by 0x4EAF7A0: _IO_wfile_xsputn (wfileops.c:1033)
==1934==    by 0x4E925EB: vfwprintf (vfprintf.c:1320)
==1934==    by 0x4EABA98: wprintf (wprintf.c:32)
==1934==    by 0x10885D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 1,248 bytes in 12 blocks are still reachable in loss record 18 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 1,600 bytes in 1 blocks are still reachable in loss record 19 of 24
==1934==    at 0x4C2CA6F: malloc (vg_replace_malloc.c:298)
==1934==    by 0x4C2EDEF: realloc (vg_replace_malloc.c:785)
==1934==    by 0x4E6B692: extend_alias_table (localealias.c:397)
==1934==    by 0x4E6B692: read_alias_file (localealias.c:319)
==1934==    by 0x4E6B8B0: _nl_expand_alias (localealias.c:203)
==1934==    by 0x4E648D7: _nl_find_locale (findlocale.c:161)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 1,728 bytes in 24 blocks are still reachable in loss record 20 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 2,048 bytes in 1 blocks are still reachable in loss record 21 of 24
==1934==    at 0x4C2ED5F: realloc (vg_replace_malloc.c:785)
==1934==    by 0x4E6B61C: read_alias_file (localealias.c:331)
==1934==    by 0x4E6B8B0: _nl_expand_alias (localealias.c:203)
==1934==    by 0x4E648D7: _nl_find_locale (findlocale.c:161)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 3,344 bytes in 12 blocks are still reachable in loss record 22 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4E64F09: _nl_intern_locale_data (loadlocale.c:95)
==1934==    by 0x4E64F09: _nl_load_locale (loadlocale.c:266)
==1934==    by 0x4E649B9: _nl_find_locale (findlocale.c:234)
==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 4,096 bytes in 1 blocks are still reachable in loss record 23 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4EA7381: _IO_file_doallocate (filedoalloc.c:101)
==1934==    by 0x4EA890C: _IO_wfile_doallocate (wfiledoalloc.c:70)
==1934==    by 0x4EB6875: _IO_doallocbuf (genops.c:398)
==1934==    by 0x4EAE493: _IO_wfile_underflow (wfileops.c:197)
==1934==    by 0x4EAC431: _IO_wdefault_uflow (wgenops.c:213)
==1934==    by 0x4EAB0E5: _IO_getwline_info (iogetwline.c:65)
==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== 16,384 bytes in 1 blocks are still reachable in loss record 24 of 24
==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
==1934==    by 0x4EA88D8: _IO_wfile_doallocate (wfiledoalloc.c:79)
==1934==    by 0x4EB6875: _IO_doallocbuf (genops.c:398)
==1934==    by 0x4EAE493: _IO_wfile_underflow (wfileops.c:197)
==1934==    by 0x4EAC431: _IO_wdefault_uflow (wgenops.c:213)
==1934==    by 0x4EAB0E5: _IO_getwline_info (iogetwline.c:65)
==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
==1934== 
==1934== LEAK SUMMARY:
==1934==    definitely lost: 0 bytes in 0 blocks
==1934==    indirectly lost: 0 bytes in 0 blocks
==1934==      possibly lost: 0 bytes in 0 blocks
==1934==    still reachable: 35,007 bytes in 149 blocks
==1934==         suppressed: 0 bytes in 0 blocks
==1934== 
==1934== For counts of detected and suppressed errors, rerun with: -v
==1934== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 0 from 0)

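My question boils down to this: is this a bug in libc6 / libstdc++6? Or does fgetws exhibit some kind of undefined behaviour once a UTF-8 locale is set (according to the glibc documentation or the C standard)? Or is there something wrong with my code?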

Note that, judging by the stack trace, this might look like a bug in Valgrind, but the program also segfaults when it is run outside Valgrind, or with AddressSanitizer (libasan) instead.

Fortunately, my system produces the same error and the same backtrace as yours (same files, same line numbers), so I was able to do some investigating.

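Here is what leads to the segmentation fault: after the return from main, all the internal structures are freed, one of them being stdin. The same call chain up to _IO_wfile_sync first goes off for stdout, which does not cause any problems, so it is intended to happen. The difference is that the delta at line 508 is zero for stdout, which causes most of the function's code to be skipped, but is non-zero for stdin. At that point, fp->_wide_data->_IO_read_end points (understandably) to the end of the input string L"5555555555\n", while fp->_wide_data->_IO_read_ptr points to the third character (two have been read), so the difference is -9.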

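Now, storing a negative difference in some type named _IO_ssize_t smells, if you ask me, and it does indeed cause trouble. Line 531 calls the function do_length, which takes a parameter max for the size of a buffer and receives -9 (or, presumably, 2^word_size - 9). The first line of this function is the declaration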

wchar_t to_buf[max];

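which results in the stack pointer being increased rather than decreased, and the data that should have been stored there safely (among it the pointer fp, since it eventually ends up in the register rbx) gets overwritten at the first opportunity.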

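After the return from the function, fp has been overwritten with something meaningless (NULL, in my case), even though the function never touched it, and dereferencing it on line 534 leads to the SIGSEGV, as the backtrace tells us.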

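I haven't read enough of the code to make an educated guess as to whether line 508 should say

delta = fp->_wide_data->_IO_read_end - fp->_wide_data->_IO_read_ptr;

rather than the other way round, or whether -delta should be passed for max, or whether _ptr pointing before _end is itself unexpected, but anything that results in a negative value being used as the size of a variable-length array is certainly not OK. Since both files referenced here are part of glibc, I think it is safe to assume that glibc is the right place to direct a bug report. This is consistent with the negative confirmations from non-glibc systems.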

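P.S. This doesn't happen for non-UTF locales because the call leading to do_length is only executed for variable-length encodings (it is wrapped in an if-else on line 518). If the encoding is 8-bit, or fixed-width 16-bit or 32-bit UCS (presumably), delta is simply multiplied by a constant. If each character can be encoded with a different number of bytes, the corresponding calculation has to either look inside the buffer to determine how many characters it represents, or construct the representation to determine how much space it will occupy.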