Where is the architecture support implemented in GCC, clang, and/or LLVM in terms of machine code?

I'm looking at this:

Architecture characteristic key
-----------------------------------------------------------------------
H       A hardware implementation does not exist.
M       A hardware implementation is not currently being manufactured.
S       A Free simulator does not exist.
L       Integer registers are narrower than 32 bits.
Q       Integer registers are at least 64 bits wide.
N       Memory is not byte addressable, and/or bytes are not eight bits.
F       Floating point arithmetic is not included in the instruction set
I       Architecture does not use IEEE format floating point numbers
C       Architecture does not have a single condition code register.
B       Architecture has delay slots.
D       Architecture has a stack that grows upward.

l       Port cannot use ILP32 mode integer arithmetic.
q       Port can use LP64 mode integer arithmetic.
r       Port can switch between ILP32 and LP64 at runtime.
        (Not necessarily supported by all subtargets.)
c       Port uses cc0.
p       Port uses define_peephole (as opposed to define_peephole2).
b       Port uses '"* ..."' notation for output template code.
f       Port does not define prologue and/or epilogue RTL expanders.
m       Port does not use define_constants.
g       Port does not define TARGET_ASM_FUNCTION_(PRO|EPI)LOGUE.
i       Port generates multiple inheritance thunks using
        TARGET_ASM_OUTPUT_MI(_VCALL)_THUNK.
a       Port uses LRA (by default, i.e. unless overridden by a switch).
t       All insns either produce exactly one assembly instruction, or
        trigger a define_split.
e       <arch>-elf is not a supported target.
s       <arch>-elf is the correct target to use with the simulator
        in /cvs/src.
          |      Characteristics
Target     | HMSLQNFICBD lqrcpbfmgiates
-----------+---------------------------
aarch64    |     Q        q   b  gia  s
alpha      |  ?  Q   C    q     mgi  e
arc        |          B       b  gia
arm        |                  b   ia  s
avr        |    L  FI    l  cp   g
bfin       |       F             gi
c6x        |   S     CB          gi
cr16       |    L  F C      c    g    s
cris       |       F  B          gi   s
csky       |                  b   ia
epiphany   |         C           gi   s
fr30       | ??    FI B      pb mg    s
frv        | ??       B       b   i   s
gcn        |   S     C D  q        a e
h8300      |       FI B          g    s
i386       |     Q        q   b   ia
ia64       |   ? Q   C    qr  b m i
iq2000     | ???   FICB       b  g  t
lm32       |       F             g
m32c       |    L  FI    l    b  g    s
m32r       |       FI         b       s
m68k       |                 pb   i
mcore      |  ?    FI        pb mg    s
mep        |       F C        b  g  t s
microblaze |         CB           i   s
mips       |     Q   CB   qr      ia  s
mmix       | HM  Q   C    q       i  e
mn10300    | ??                  gi   s
moxie      |       F             g  t s
msp430     |    L  FI    l    b  g    s
nds32      |       F C            ia  s
nios2      |         C            ia
nvptx      |   S Q   C    q     mg   e
pa         |     Q   CBD  qr  b   i  e
pdp11      |    L   IC    qr  b      e
powerpcspe |     Q   C    qr pb   ia
pru        |    L  F               a  s
riscv      |     Q   C    qr     gia
rl78       |    L  F     l       g    s
rs6000     |     Q   C    qr pb   ia
rx         |                          s
s390       |     Q        qr     gia e
sh         |     Q   CB   qr p    i
sparc      |     Q   CB   qr  b   ia
stormy16   | ???L  FIC D l    b   i
tilegx     |     Q   C    q      gi  e
tilepro    |   S   F C           gi  e
v850       |                     g a  s
vax        |  M     I         b   i  e
visium     |          B          g  t s
xtensa     |         C
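The table is positional: each mark in a row sits in the column of the key letter in the header line above it. A minimal sketch of decoding one row that way follows. The function name `decode_row` is hypothetical, and reading `?` as "this characteristic is uncertain" is an assumption, since the key above does not define it.

```python
# Header of the "Characteristics" column, copied from the table above.
HEADER = "HMSLQNFICBD lqrcpbfmgiates"


def decode_row(line, header=HEADER):
    """Decode one table row into (target, present_keys, uncertain_keys).

    Hypothetical helper: a key letter counts as present when any mark other
    than '?' appears in its column, and as uncertain when the mark is '?'
    (an assumed reading of '?').
    """
    target, _, cells = line.partition("|")
    # Drop the single space after '|' so that column i of `cells`
    # lines up with column i of `header`.
    if cells.startswith(" "):
        cells = cells[1:]
    present, uncertain = [], []
    for column, mark in enumerate(cells):
        if mark == " " or column >= len(header):
            continue
        key = header[column]
        if key == " ":
            continue  # the gap between the architecture and port groups
        if mark == "?":
            uncertain.append(key)
        else:
            present.append(key)
    return target.strip(), present, uncertain
```

Passing the `aarch64` line from the table to `decode_row` gives the list of key letters set for that port, which can then be looked up in the key at the top.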

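It looks like there are about 50 architectures. Where is all of this implemented in the source code on GitHub? I'm asking about GCC, clang, and/or LLVM (or any other key project that might be of interest when it comes to implementing support for an architecture).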

A brief overview for GCC:

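GCC's .md machine-description files tell it which instructions are available and what they do, using a constraint syntax similar to GNU C inline asm. (GCC doesn't know about machine code, only asm text, which is why it can only output a .s file for as to assemble separately.) There are also some C functions that know the general rules for that architecture, and I'd guess things like register names.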

The GCC Internals manual has a section, 6.3.9 Anatomy of a Target Back End, that documents where the relevant files live in the GCC source tree.
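As a rough way to see this on disk, here is a minimal sketch that lists the back-end directories in a local GCC checkout. It assumes the usual layout in which each port lives in its own directory under gcc/config/ and carries at least one .md file; the function name `list_gcc_backends` and the checkout path are hypothetical.

```python
from pathlib import Path


def list_gcc_backends(gcc_root):
    """Map each directory under gcc/config/ to the .md files it contains.

    Hypothetical helper: directories without any .md file are skipped,
    on the assumption that they are not complete ports.
    """
    config_dir = Path(gcc_root) / "gcc" / "config"
    backends = {}
    for entry in sorted(config_dir.iterdir()):
        if not entry.is_dir():
            continue
        md_files = sorted(p.name for p in entry.glob("*.md"))
        if md_files:
            backends[entry.name] = md_files
    return backends
```

Calling `list_gcc_backends("path/to/gcc")` on a checkout should give names that line up with the Target column of the table in the question.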

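For LLVM (Clang is a front end built on LLVM), you can find the backend code for each architecture in the llvm/lib/Target/ directory. LLVM uses .td target-description files, written in TableGen (a language of its own), to describe the target: instructions, registers, calling conventions, and so on.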

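These directories also contain .cpp files that implement specific functionality. For example, llvm/lib/Target/Mips/MipsDelaySlotFiller.cpp implements a pass that fills delay slots with useful instructions on the Mips architecture.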
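To get a feel for how each backend splits between .td descriptions and hand-written .cpp passes, here is a minimal sketch that walks a local llvm-project checkout. The function name `list_llvm_targets` and the checkout path are hypothetical, and only files directly inside each target directory are counted, not those in its subdirectories.

```python
from pathlib import Path


def list_llvm_targets(llvm_project_root):
    """Map each directory under llvm/lib/Target/ to its .td and .cpp files.

    Hypothetical helper: only top-level files of each target directory
    are listed, and directories with no .td file are skipped.
    """
    target_dir = Path(llvm_project_root) / "llvm" / "lib" / "Target"
    targets = {}
    for entry in sorted(target_dir.iterdir()):
        if not entry.is_dir():
            continue
        td_files = sorted(p.name for p in entry.glob("*.td"))
        cpp_files = sorted(p.name for p in entry.glob("*.cpp"))
        if td_files:
            targets[entry.name] = {"td": td_files, "cpp": cpp_files}
    return targets
```

On a real checkout, the entry for `Mips` should include `MipsDelaySlotFiller.cpp` in its `cpp` list.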

There is also this repository of git patches for adding a RISC-V target, which is a nice example of how to implement a backend incrementally. There is also a lot of documentation; keep in mind that it is outdated, though it still contains useful information.