Assembly: What is the purpose of movl data_items(,%edi,4), %eax in this program

Question

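This program (from Jonathan Bartlett's Programming From the Ground Up) cycles through all the numbers stored in memory with .long and puts the largest one in the EBX register, so it can be viewed when the program completes.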

.section .data
data_items:
    .long 3, 67, 34, 222, 45, 75, 54, 34, 44, 33, 22, 11, 66, 0

.section .text
.globl _start

_start:
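    movl $0, %edi                      # start at index 0
    movl data_items(,%edi,4), %eax     # load the first item
    movl %eax, %ebx                    # it is the largest so far
start_loop:
    cmpl $0, %eax                      # 0 marks the end of the data
    je loop_exit
    incl %edi                          # next index
    movl data_items(,%edi,4), %eax     # load the next item
    cmpl %ebx, %eax
    jle start_loop                     # not larger: keep looping
    movl %eax, %ebx                    # new largest value
    jmp start_loop
loop_exit:
    movl $1, %eax                      # 1 is the exit() syscall number
    int $0x80                          # %ebx is returned as the exit status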

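I'm not sure about the purpose of (,%edi,4) in this program. I read that the commas are separators, and that the 4 reminds the computer that each number in data_items is 4 bytes long. Since we've already declared that each number is 4 bytes with .long, why do we need to do it again here? Also, could someone explain in more detail what the two commas do in this situation?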

Answer 1

In AT&T syntax, memory operands have the following syntax¹:

displacement(base_register, index_register, scale_factor)

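The base, index and displacement components can be used in any combination, and every component can be omitted; but obviously, if you omit the base register, you must keep the commas, otherwise the assembler could not tell which of the components you left out.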

All of this gets combined to calculate the address you are specifying, using the following formula:

effective_address = displacement + base_register + index_register*scale_factor

(which, incidentally, is almost exactly how you would write it in Intel syntax).
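The formula can be mirrored in C as a quick sanity check. This is only a sketch: the function name effective_address and the address 0x1000 used for data_items in the note below are made up for illustration (the real address is chosen by the linker), and an omitted component is modeled as 0.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the address formula above.
   An omitted base or index register is passed as 0. The arithmetic is done
   in 32 bits, so it wraps around like a 32-bit address calculation. */
uint32_t effective_address(uint32_t displacement, uint32_t base_register,
                           uint32_t index_register, uint32_t scale_factor)
{
    return displacement + base_register + index_register * scale_factor;
}
```

Assuming data_items sat at 0x1000, effective_address(0x1000, 0, 3, 4) would give 0x100C, the address of the fourth .long. On real hardware the scale factor can only be 1, 2, 4 or 8; the C function does not enforce that.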

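So, armed with this knowledge, we can decode your instruction:

movl data_items(,%edi,4), %eax

Matching it against the syntax above, you'll see that:

data_items is the displacement;
base_register is omitted, so it does not enter the formula above;
%edi is the index_register;
4 is the scale_factor.

So, you are telling the CPU to move a long from the location data_items+%edi*4 to the register %eax.

The *4 is necessary because each element of your array is 4 bytes wide, so to go from the index (in %edi) to the offset in bytes from the start of the array you have to multiply by 4.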
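In C terms, this scaling is what the compiler does for you when you write data_items[i]. The sketch below does the multiplication by hand on a byte pointer, then uses it in a loop that follows the same steps as the program in the question. The names load_long and find_max are illustrative and do not come from the book.

```c
#include <stdint.h>
#include <string.h>

/* Same values as the .long line in the question; 0 marks the end. */
const int32_t data_items[] = {3, 67, 34, 222, 45, 75, 54,
                              34, 44, 33, 22, 11, 66, 0};

/* Hypothetical equivalent of: movl data_items(,%edi,4), %eax
   The index is turned into a byte offset by multiplying by 4. */
int32_t load_long(const void *start, uint32_t index)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)start + index * 4u, sizeof value);
    return value;
}

/* Follows the loop in the question: edi is the index, eax the current
   item, ebx the largest item seen so far. */
int32_t find_max(const void *start)
{
    uint32_t edi = 0;
    int32_t eax = load_long(start, edi);
    int32_t ebx = eax;
    while (eax != 0) {
        edi++;
        eax = load_long(start, edi);
        if (eax > ebx)
            ebx = eax;
    }
    return ebx;
}
```

Here load_long(data_items, 3) reads the bytes at offset 12 and returns 222, the same value as data_items[3], and find_max(data_items) returns 222 as well.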

Since we've already declared that each number is 4 bytes with .long, why do we need to do it again here?

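Assemblers are low-level tools that know nothing about types.

.long is not an array declaration; it is just a directive telling the assembler to emit the bytes corresponding to the 32-bit representation of its arguments;
data_items is not an array; it is just a symbol that gets resolved to some memory location, exactly like any other label. The fact that you placed a .long directive after it means nothing special to the assembler.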
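To see that nothing ties the label to a 4-byte element size, the sketch below lays out by hand the bytes that .long 3, 67 would produce on little-endian x86 and reads 32 bits starting at different byte offsets. The emitted_bytes array and the read_le32 helper are illustrative assumptions, not part of any toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes the assembler would emit for ".long 3, 67" on x86 (little-endian). */
const uint8_t emitted_bytes[8] = {0x03, 0x00, 0x00, 0x00,
                                  0x43, 0x00, 0x00, 0x00};

/* Hypothetical helper: read a 32-bit little-endian value at a byte offset,
   which is what a movl from that address would load. */
uint32_t read_le32(const uint8_t *memory, size_t byte_offset)
{
    const uint8_t *p = memory + byte_offset;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

read_le32(emitted_bytes, 1 * 4) yields 67, the second item, while read_le32(emitted_bytes, 1 * 1), that is index 1 with a scale factor of 1, straddles the two items and yields 0x43000000. The assembler would accept either addressing form without complaint, which is why the scale has to be spelled out in the instruction.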

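Notes

¹ Technically there would also be a segment specifier, but given that we are talking about 32-bit code on Linux I'll omit segments entirely, as they would only add confusion.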
