Why does the Linux kernel contain duplicate code?

Question

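I was looking through the Linux kernel (v2.6.9) and noticed that the boot sector code for the i386 and x86_64 architectures is identical. But isn't the golden rule of programming to keep your code DRY (don't repeat yourself)? Why didn't the Linux maintainers simply reuse the file instead of copy-pasting it? I'm trying to learn what good, clean code looks like, and this confuses me. If two architectures share the same boot code, why not just reuse it?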

In the Linux source tree, these two files are exactly identical:

/arch/i386/boot/bootsect.S and /arch/x86_64/boot/bootsect.S

/*
 *  bootsect.S      Copyright (C) 1991, 1992 Linus Torvalds
 *
 *  modified by Drew Eckhardt
 *  modified by Bruce Evans (bde)
 *  modified by Chris Noe (May 1999) (as86 -> gas)
 *  gutted by H. Peter Anvin (Jan 2003)
 *
 * BIG FAT NOTE: We're in real mode using 64k segments.  Therefore segment
 * addresses must be multiplied by 16 to obtain their respective linear
 * addresses. To avoid confusion, linear addresses are written using leading
 * hex while segment addresses are written as segment:offset.
 *
 */

#include <asm/boot.h>

SETUPSECTS  = 4         /* default nr of setup-sectors */
BOOTSEG     = 0x07C0        /* original address of boot-sector */
INITSEG     = DEF_INITSEG       /* we move boot here - out of the way */
SETUPSEG    = DEF_SETUPSEG      /* setup starts here */
SYSSEG      = DEF_SYSSEG        /* system loaded at 0x10000 (65536) */
SYSSIZE     = DEF_SYSSIZE       /* system size: # of 16-byte clicks */
                    /* to be loaded */
ROOT_DEV    = 0             /* ROOT_DEV is now written by "build" */
SWAP_DEV    = 0         /* SWAP_DEV is now written by "build" */

#ifndef SVGA_MODE
#define SVGA_MODE ASK_VGA
#endif

#ifndef RAMDISK
#define RAMDISK 0
#endif

#ifndef ROOT_RDONLY
#define ROOT_RDONLY 1
#endif

.code16
.text

.global _start
_start:

    # Normalize the start address
    jmpl    $BOOTSEG, $start2

start2:
    movw    %cs, %ax
    movw    %ax, %ds
    movw    %ax, %es
    movw    %ax, %ss
    movw    $0x7c00, %sp
    sti
    cld

    movw    $bugger_off_msg, %si

msg_loop:
    lodsb
    andb    %al, %al
    jz  die
    movb    $0xe, %ah
    movw    $7, %bx
    int $0x10
    jmp msg_loop

die:
    # Allow the user to press a key, then reboot
    xorw    %ax, %ax
    int $0x16
    int $0x19
    
    # int 0x19 should never return.  In case it does anyway,
    # invoke the BIOS reset code...
    ljmp    $0xf000,$0xfff0


bugger_off_msg:
    .ascii  "Direct booting from floppy is no longer supported.\r\n"
    .ascii  "Please use a boot loader program instead.\r\n"
    .ascii  "\n"
    .ascii  "Remove disk and press any key to reboot . . .\r\n"
    .byte   0


    # Kernel attributes; used by setup

    .org 497
setup_sects:    .byte SETUPSECTS
root_flags: .word ROOT_RDONLY
syssize:    .word SYSSIZE
swap_dev:   .word SWAP_DEV
ram_size:   .word RAMDISK
vid_mode:   .word SVGA_MODE
root_dev:   .word ROOT_DEV
boot_flag:  .word 0xAA55

Answer 1

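32-bit x86 (i386) and 64-bit x86 (x86-64) do have some things in common, such as booting into exactly the same environment: the x86 processor starts out in legacy 16-bit real mode.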

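The kernel developers did not overlook the duplication between the two either; see, for example, this LWN article from 2007: "i386 and x86_64: back together?". (2.6.9 appears to date from around 2004, which makes it more than 16 years old by now.)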

@IanAbbott mentioned in the comments that the merge of the two into /arch/x86 finally happened in 2.6.24 (released in January 2008, about 13 years ago).

