汇编：数组中整数的出现

Question

我正在用 masm 汇编语言编写一个程序来计算 return 数组中整数出现的次数。我目前有以下代码，允许我用随机整数填充数组。我正在苦苦挣扎的是如何实现一个计数器，它将每次出现的整数存储在数组的索引处。例如，如果随机数组是 [3,4,3,3,4,5,7,8]，我希望我的计数数组包含 [3, 2, 1, 1, 1]，因为有（三个 3，两个 4，等等）。

我将随机数的范围固定在 3/8，所以我知道它们会在这个范围内。我目前的想法是在添加时将每个数字与 3-8 进行比较，并分别递增我的计数数组。我主要缺乏理解的是如何增加数组的特定索引。这段代码是我生成随机整数数组的方式，以及我如何开始计算整数出现次数的想法，但我不知道我是否朝着正确的方向前进。有什么建议吗？

push ebp
mov  ebp, esp
mov  esi, [ebp + 16]    ; @ holds array to store count of integer occurances
mov  edi, [ebp + 12]  ; @ holds array to be populated with random ints
mov  ecx, [ebp + 8]   ; value of request in ecx


MakeArray:              
    mov     eax, UPPER          ; upper boundary for random num in array
    sub     eax, LOWER          ; lower boundary for random num in array
    inc     eax             
    call    RandomRange
    add     eax, LOWER
    cmp     eax, 3          ; Where I start to compare the random numbers added
    je      inc_3           ; current thought is it cmp to each num 3-8
    mov     [edi], eax  ; put random number in array
    add     edi, 4      ; holds address of current element, moves to next element
    loop    fillArrLoop

inc_3:          ; if random num == 3
    inc     esi         ; holds address of count_array, increments count_array[0] to 1?
    mov     [edi], eax  ; put random number in array to be displayed
    add     edi, 4      ; holds address of current element, moves to next element
    loop    MakeArray

Answer 1

如果你的范围是固定的 (3-8)，你有一个固定长度的数组可以保存你的计数：

(index0:Count of 3),(index1:Count of 4)..(index5:Count of 8s)

一旦你从随机数组中获得一个元素，你只需取出该元素并将其放入开关即可：

cmp 3, [element]
jne compare4
mov ebx, [countsArrayAddress]     ; Can replace [countsArrayAddress] with  [ebp + 16]
add ebx, 0                        ; First index, can comment out this line
mov ecx, [ebx]
add ecx, 1                        ; Increment count
mov [ebx], ecx                    ; Count at the zeroth offset is now incremented
compare4:
cmp 4, [element]
jne compare5
mov ebx, [countsArrayAddress]
add ebx, 4                         ; Second index (1*4)
mov ecx, [ebx]
add ecx, 1
mov [ebx], ecx
...

这是你的意思吗？我来自使用 fasm 语法，但它看起来非常相似。上面的块有点未优化，但认为这显示了如何构建计数数组。该数组具有固定长度，必须在堆栈上（sub rsp 正确的数量）或在堆上分配，即使用 heapalloc/malloc 调用。（已编辑，看到您使用的是 32 位寄存器）

Answer 2

My current thinking is to compare each number to 3-8 as it is added

不，你把它复杂化了。您不想线性搜索 j（计数中的索引）使得 arr[i] == j，只需使用 j = arr[i].

制作直方图的标准方法是++counts[ arr[i] ]。在您的例子中，您知道可能的值为 3..8，因此您可以使用 arr[i] - 3 将数组值映射到计数桶，因此您将对 counts[0..5] 进行操作。 具有缩放索引寻址模式的内存目标 add 指令可以在一条 x86 指令中执行此操作，给定寄存器中的元素值。

如果可能的值不连续，您通常会使用散列 table 将值映射到计数桶。您可以将这个简单的案例视为允许一个简单的哈希函数。

由于您在生成直方图的同时生成填充 arr[i] 的随机数，因此您可以将这两个任务结合起来，而不是减去 3，只是不要添加它。

; inputs: unsigned len, int *values, int *counts
; outputs: values[0..len-1] filled with random numbers, counts[] incremented
; clobbers: EAX, ECX, EDX  (not the other registers)
fill_array_and_counts:
    push ebp
    mov  ebp, esp

    push esi        ; Save/restore the caller's ESI.
 ;; Irvine32 functions like RandomRange are special and don't clobber EAX, ECX, or EDX except as return values,
 ;; so we can use EDX and ECX even in a loop that makes a function call.

    mov  edi, [ebp + 16]    ; int *counts  ; assumed already zeroed?
    mov  edx, [ebp + 12]    ; int *values  ; output pointers
    mov  ecx, [ebp + 8]     ; size_t length

    MakeArray:                       ; do{
        mov     eax, UPPER - LOWER + 1   ; size of random range, calculated at assemble time
        call    RandomRange                  ; eax = 0 .. eax-1
        add     dword ptr [edi + eax*4], 1   ; ++counts[ randval ]

        add     eax, LOWER               ; map 0..n to LOWER..UPPER
        mov     [edx], eax               ; *values = randval+3
        add     edx, 4                   ; values++

        dec     ecx
        jnz     MakeArray            ; }while(--ecx);

    pop   edi                  ; restore call-preserved regs
    pop   ebp                  ; including tearing down the stack frame
    ret

如果调用者没有为你清零 counts 数组，你应该自己做，也许用 rep stosd 和 EAX=0 作为 ECX 双字元素的 memset , 然后从堆栈参数中重新加载 EDI 和 ECX。

我假设 UPPER 和 LOWER 是 assemble 时间常数，例如 UPPER = 8 或 LOWER equ 3，因为您为它们使用了全大写的名称，但它们不是函数参数。如果是这样，那么就不需要在运行时进行数学运算，只需让 assembler 为您计算 UPPER - LOWER + 1。

我避免了 loop 指令 because it's slow，并且没有做任何其他简单指令不能做的事情。

只有几个桶的直方图的一个标准性能技巧是有多个计数数组并展开它们：Methods to vectorise histogram in SIMD?。当同一计数器需要连续递增多次时，这会隐藏 store/reload 的延迟。不过，您的随机值通常会避免长时间运行相同的值，因此可以避免最坏的性能。

由于只有 6 个可能的存储桶：，对于大型阵列可能会从 AVX2 中获益。（如果需要，您可以使用 AVX2 xorshift128+ PRNG 在 SIMD 向量中生成随机数。）

汇编：数组中整数的出现

Assembly: Occurrence of Integers in Array

arrays

x86

assembly

masm

histogram