How to have warning when casting `int_least8_t` to `char`?
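I'm building a string library that supports both ascii and utf8.
I created two typedefs, t_ascii and t_utf8. ascii is safe to read as utf8, but utf8 is not safe to read as ascii.
Is there any way to get a warning when implicitly converting from t_utf8 to t_ascii, but not when implicitly converting from t_ascii to t_utf8?
Ideally, I would want these warnings (and only these warnings) to be issued: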
#include <stdint.h>
typedef char t_ascii;
typedef uint_least8_t t_utf8;
int main()
{
    t_ascii const* asciistr = "Hello world"; // Ok
    t_utf8 const* utf8str = "你好世界"; // Ok
    asciistr = utf8str; // Warning: utf8 to ascii is not safe
    utf8str = asciistr; // Ok: ascii to utf8 is safe
    t_ascii asciichar = 'A';
    t_utf8 utf8char = 'B';
    asciichar = utf8char; // Warning: utf8 to ascii is not safe
    utf8char = asciichar; // Ok: ascii to utf8 is safe
}
Currently, when building with -Wall (and even with -funsigned-char), I get these warnings:
gcc main.c -Wall -Wextra
main.c: In function ‘main’:
main.c:10:35: warning: pointer targets in initialization of ‘const t_utf8 *’ {aka ‘const unsigned char *’} from ‘char *’ differ in signedness [-Wpointer-sign]
10 | t_utf8 const* utf8str = "你好世界"; // Ok
| ^~~~~~~~~~
main.c:12:18: warning: pointer targets in assignment from ‘const t_utf8 *’ {aka ‘const unsigned char *’} to ‘const t_ascii *’ {aka ‘const char *’} differ in signedness [-Wpointer-sign]
12 | asciistr = utf8str; // Warning: utf8 to ascii is not safe
| ^
main.c:16:17: warning: pointer targets in assignment from ‘const t_ascii *’ {aka ‘const char *’} to ‘const t_utf8 *’ {aka ‘const unsigned char *’} differ in signedness [-Wpointer-sign]
16 | utf8str = asciistr; // Ok: ascii to utf8 is safe
| ^
Compile with -Wall. Always compile with -Wall.
<user>@squall:~/src/p1$ gcc -Wall -c test2.c
test2.c: In function ‘main’:
test2.c:9:31: warning: pointer targets in initialization of ‘const t_utf8 *’ {aka ‘const signed char *’} from ‘char *’ differ in signedness [-Wpointer-sign]
9 | t_utf8 const* utf8str = "你好世界";
| ^~~~~~~~~~~~~~
test2.c:11:13: warning: pointer targets in assignment from ‘const t_ascii *’ {aka ‘const char *’} to ‘const t_utf8 *’ {aka ‘const signed char *’} differ in signedness [-Wpointer-sign]
11 | utf8str = asciistr; // Ok: ascii to utf8 is safe
| ^
test2.c:12:14: warning: pointer targets in assignment from ‘const t_utf8 *’ {aka ‘const signed char *’} to ‘const t_ascii *’ {aka ‘const char *’} differ in signedness [-Wpointer-sign]
12 | asciistr = utf8str; // Should issue warning: utf8 to ascii is not safe
| ^
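You would like the conversion from t_ascii to t_utf8 to be safe, but it is not: the signedness differs.
The warning has nothing to do with the fact that valid utf8 is sometimes not valid ASCII; the compiler knows nothing about that. The warning is about signedness.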
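Since the compiler cannot know whether a given byte sequence is valid ASCII, that property can only be checked at run time. Below is a minimal sketch of such a check; the helper name `utf8_is_ascii` is hypothetical and not part of the question's library, and it assumes NUL-terminated input.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if every byte of the NUL-terminated
   string is 7-bit ASCII (0x00..0x7F). Bytes >= 0x80 only occur in UTF-8
   as part of a multi-byte sequence, so they are rejected. */
static bool utf8_is_ascii(const unsigned char *s)
{
    for (size_t i = 0; s[i] != 0; i++) {
        if (s[i] >= 0x80) {
            return false;
        }
    }
    return true;
}
```

A library could call a check like this before handing utf8 data to code that expects ascii, instead of relying on a compiler diagnostic.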
If you want char to be unsigned, compile with -funsigned-char. But then no warning will be issued at all.
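(By the way, if you think the type int_least8_t can hold a multi-byte character or a complete utf8 code point encoding, it cannot. All int_least8_t objects, and therefore all t_utf8 objects, have exactly the same size within a single compilation unit.)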
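To illustrate the point about size: one code point may span several t_utf8 elements, so counting elements is not the same as counting characters. The sketch below uses a hypothetical helper, `utf8_count_code_points`, and assumes well-formed, NUL-terminated UTF-8 input; it counts code points by skipping continuation bytes.

```c
#include <stddef.h>

/* Hypothetical helper: counts code points in a well-formed,
   NUL-terminated UTF-8 string. Continuation bytes have the bit
   pattern 10xxxxxx and do not start a new code point. */
static size_t utf8_count_code_points(const unsigned char *s)
{
    size_t count = 0;
    for (; *s != 0; s++) {
        if ((*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For example, the two characters 你好 are encoded as the six bytes E4 BD A0 E5 A5 BD, so the string occupies six elements while the helper reports two code points.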
Simply compile with a standard-conforming C compiler; see "What compiler options are recommended for beginners learning C?"
Result:
<source>: In function 'main':
<source>:9:31: error: pointer targets in initialization of 'const t_utf8 *' {aka 'const unsigned char *'} from 'char *' differ in signedness [-Wpointer-sign]
9 | t_utf8 const* utf8str = "你好世界"; // Ok
| ^~~~~~~~~~
<source>:11:14: error: pointer targets in assignment from 'const t_utf8 *' {aka 'const unsigned char *'} to 'const t_ascii *' {aka 'const char *'} differ in signedness [-Wpointer-sign]
11 | asciistr = utf8str; // Warning: utf8 to ascii is not safe
| ^
<source>:12:13: error: pointer targets in assignment from 'const t_ascii *' {aka 'const char *'} to 'const t_utf8 *' {aka 'const unsigned char *'} differ in signedness [-Wpointer-sign]
12 | utf8str = asciistr; // Ok: ascii to utf8 is safe
| ^
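"but not when implicitly casting t_ascii to t_utf8?"
No, you cannot have that in standard C, because it is an invalid pointer conversion. You can silence the compiler with an explicit cast, but if you do, you invoke undefined behavior.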
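Because implicit conversions cannot be made direction-sensitive, one possible workaround is to give up implicit conversion entirely: wrap each pointer in its own struct type and provide a helper only for the safe direction. This is a sketch, not something proposed in the thread; the names `ascii_view`, `utf8_view` and `utf8_from_ascii` are hypothetical. It also assumes t_utf8 is unsigned char, as in the gcc output shown here, which is a character type through which C permits reading the bytes of any object.

```c
/* Hypothetical wrapper types: two distinct struct types are never
   implicitly convertible, so mixing them up is a hard compile error. */
typedef struct { const char *ptr; } ascii_view;
typedef struct { const unsigned char *ptr; } utf8_view;

/* The safe direction (ascii -> utf8) is the only conversion offered.
   The pointer cast is confined to this one function. */
static utf8_view utf8_from_ascii(ascii_view a)
{
    utf8_view u = { (const unsigned char *)a.ptr };
    return u;
}

/* With these types:
     utf8_view  u = utf8_from_ascii(a);   compiles
     ascii_view a2 = u;                   error: incompatible types
*/
```

The trade-off is that the safe direction must now be spelled out with a function call, and no conversion in the unsafe direction exists unless the library adds a checked one.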
Apart from that, you can use C11 _Generic to find out which type uint_least8_t actually boils down to:
#include <stdint.h>
#include <stdio.h>
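/* Prints which character type the type of obj resolves to (C11 _Generic).
   No trailing semicolon in the macro; the caller supplies it. */
#define what_type(obj) printf("%s is same as %s\n", #obj, \
    _Generic ((obj), \
        char: "char", \
        unsigned char: "unsigned char", \
        signed char: "signed char"))
int main (void)
{
    typedef char t_ascii;
    typedef uint_least8_t t_utf8;
    t_ascii ascii;
    t_utf8 utf8;
    what_type(ascii); /* _Generic does not evaluate its operand */
    what_type(utf8);
}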
Output with gcc on x86 Linux:
ascii is same as char
utf8 is same as unsigned char
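The same _Generic selection can also be checked at compile time rather than printed. The following sketch assumes the library wants t_utf8 to be exactly unsigned char; the macro name `T_UTF8_IS_UNSIGNED_CHAR` is hypothetical. On a platform where uint_least8_t is some other type, the build fails with the given message.

```c
#include <stdint.h>

typedef uint_least8_t t_utf8;

/* Evaluates to 1 if t_utf8 is exactly unsigned char, otherwise 0. */
#define T_UTF8_IS_UNSIGNED_CHAR \
    _Generic((t_utf8)0, unsigned char: 1, default: 0)

/* Fails the build if the assumption about t_utf8 does not hold. */
_Static_assert(T_UTF8_IS_UNSIGNED_CHAR,
               "t_utf8 is expected to be unsigned char");
```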