Why are C keywords case sensitive?
C keywords are predefined by the C compiler and are lowercase in C89. Since there are only 32 of them, why couldn't they be defined as case insensitive?
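Because C is case sensitive, and those are keywords. Making them case insensitive would have made the compiler slower, but the real reason is simply that this is how the language was defined.
There was a lot of experimentation with letter case in the 1960s. For a while, BCPL reserved lowercase words as system keywords to distinguish them from user names, which had to be uppercase or a single lowercase letter. But then they switched to uppercase (and later back to lowercase), and whether the language was case sensitive depended on the compiler. The same was true for FORTRAN/Fortran, which generally isn't case sensitive at all, but sometimes is in wildly complicated ways.
So when I say "it would have made the compiler slower," I don't mean "because it's an older language and processor time was more precious." Most modern languages are case sensitive. Many older languages have varying degrees of case sensitivity, depending on their history and implementation. But fundamentally, case sensitivity is simpler for the computer. That's how most of Unix was designed, and C (originally B) was built to be the system language for Unix. But again, this was just a particular design decision in Unix, not some deep "it must be this way" choice.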
But all of that is just talking around the answer. The answer is: because that's how C is defined.
You can, by generating a #define for every other capitalization of each keyword:
#include <ctype.h>
#include <string.h>
#include <stdio.h>
/* All C11 keywords, in the spelling the compiler expects. */
char const * const kw[] = {
    "_Alignas",
    "_Alignof",
    "_Atomic",
    "auto",
    "_Bool",
    "break",
    "case",
    "char",
    "_Complex",
    "const",
    "continue",
    "default",
    "do",
    "double",
    "else",
    "enum",
    "extern",
    "float",
    "for",
    "_Generic",
    "goto",
    "if",
    "_Imaginary",
    "inline",
    "int",
    "long",
    "_Noreturn",
    "register",
    "restrict",
    "return",
    "short",
    "signed",
    "sizeof",
    "static",
    "_Static_assert",
    "struct",
    "switch",
    "_Thread_local",
    "typedef",
    "union",
    "unsigned",
    "void",
    "volatile",
    "while",
};
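/* Recursively try both cases for every position of s, starting at ix,
   and print a #define for each spelling that differs from orig.
   Returns 0 on success, -1 if printf fails. */
static int comb(char *s, int ix, char const *orig)
{
    if (s[ix] == 0) {
        if (0 != strcmp(s, orig))
            return -(0 > printf("#define %s %s\n", s, orig));
        return 0;
    }
    /* Cast to unsigned char: passing a negative char to tolower/toupper is undefined. */
    s[ix] = (char)tolower((unsigned char)s[ix]);
    if (0 > comb(s, ix + 1, orig))
        return -1;
    s[ix] = (char)toupper((unsigned char)s[ix]);
    if (0 > comb(s, ix + 1, orig))
        return -1;
    return 0;
}

/* Copy one keyword into a scratch buffer and emit all of its variants. */
static int mk_defines(char const *s)
{
    char b[20]; /* large enough: the longest keyword, _Static_assert, has 14 characters */
    size_t len = strlen(s);
    memcpy(b, s, len + 1);
    if (0 > comb(b, 0, s))
        return -1;
    return 0;
}

int main(void)
{
    int n = sizeof(kw) / sizeof(kw[0]);
    for (int i = 0; i < n; i++) {
        if (0 > mk_defines(kw[i]))
            return 1; /* report failure with a portable nonzero exit status */
    }
    return 0;
}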
And that's only 29,694 defines (yes!).