Procedure Call Standard for the ARM® Architecture: two separate but related return values

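The function readWord should return two values: a status code and, if the read succeeded, the data that was read.

Does the following example follow the AAPCS? Given the AAPCS, is there a better way to do this?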

// OUT:
//  R0 (status code): 0 if valid, else > 0
//  R1 (data):
//  if R0 indicates a successful read: the data that was read,
//  else: undefined
read32:
    // R4 is callee-saved under the AAPCS, so preserve it along with LR.
    // Pushing two registers also keeps the stack 8-byte aligned for the call.
    PUSH    {R4, LR}

    LDR     R0, =memoryMappedAddressOfDataSource
    LDR     R4, [R0]
    BL      checkValidRead
    // If the read was invalid, directly return the error
    // code set by checkValidRead in R0 and do not change R1.
    CBNZ    R0, read32_return

    // R0 is 0, so the read was valid and the data is returned in R1.
    MOV     R1, R4

read32_return:
    POP     {R4, PC}

// IN: none
// Checks a special status register to determine
// whether the last read was successful.
// OUT:
//  R0: 0 if valid, else > 0
checkValidRead:
    ...

From the AAPCS (page 18):

A double-word sized Fundamental Data Type (e.g., long long, double and 64-bit containerized vectors) is returned in r0 and r1.

A Composite Type larger than 4 bytes, or whose size cannot be determined statically by both caller and callee, is stored in memory at an address passed as an extra argument when the function was called (§5.5, rule A.4). The memory to be used for the result may be modified at any point during the function call.

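However, I don't know whether my return value counts as a 64-bit containerized vector, an aggregate Composite Type, or something else entirely: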

The content of a containerized vector is opaque to most of the procedure call standard: the only defined aspect of its layout is the mapping between the memory format (the way a fundamental type is stored in memory) and different classes of register at a procedure call interface.

A Composite Type is a collection of one or more Fundamental Data Types that are handled as a single entity at the procedure call level. A Composite Type can be any of: An aggregate, where the members are laid out sequentially in memory [...]

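The document you quoted says that any composite type that doesn't fit in a single register is returned via a hidden pointer. That includes a C struct.

Only a single wide integer or FP type can be returned in a register pair.

A register pair is more efficient than a store/reload through a hidden pointer, so unfortunately you have to work around the calling convention instead of simply returning a struct { uint32_t flag, value; }.

To describe the calling convention you want to a C compiler, tell it that the function returns a uint64_t, and split that value into two 32-bit integer variables. The split costs nothing, because the compiler already has the two halves in separate registers.

For example (source + asm on the Godbolt compiler explorer). I used a union, but you could equally use a shift.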

#include <stdint.h>

uint64_t read32(void);

union unpack32 {
    uint64_t u64;
    uint32_t u32[2];
};

void ext(uint32_t);        // something to do with a return value

unsigned read32_wrapper(void) {
    // On a little-endian target, u32[0] is the low word (r0)
    // and u32[1] is the high word (r1).
    union unpack32 ret = { read32() };
    if (ret.u32[0]) {
        ext(ret.u32[1]);
    }
    return ret.u32[0];
}

This compiles to:

    push    {r4, lr}
    bl      read32
    subs    r4, r0, #0          @ set flags and copy the flag to r4

    movne   r0, r1
    blne    ext                 @ the if() body.

    mov     r0, r4              @ always return the status flag
    pop     {r4, lr}
    bx      lr
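
The shift-based alternative mentioned above could look roughly like the sketch below. It assumes a little-endian AAPCS target, where the low 32 bits of a returned uint64_t live in r0 and the high 32 bits in r1. The names read32_status, read32_data and pack_read32 are hypothetical helpers made up for illustration; pack_read32 only exists so the unpacking can be tried on a host machine without the assembly routine.

```c
#include <stdint.h>

/* Hypothetical helper: extract the status code (the value the asm
 * routine left in r0). Assumes a little-endian target, so r0 holds
 * the low 32 bits of the 64-bit return value. */
static inline uint32_t read32_status(uint64_t packed)
{
    return (uint32_t)packed;
}

/* Hypothetical helper: extract the data word (the value the asm
 * routine left in r1), i.e. the high 32 bits on a little-endian target.
 * Only meaningful when read32_status() returned 0. */
static inline uint32_t read32_data(uint64_t packed)
{
    return (uint32_t)(packed >> 32);
}

/* Hypothetical host-side stand-in: builds the 64-bit value that the
 * asm routine would effectively return through the r0/r1 pair. */
static inline uint64_t pack_read32(uint32_t status, uint32_t data)
{
    return ((uint64_t)data << 32) | status;
}
```

A caller would write something like `uint64_t r = read32();` and then test `read32_status(r)` before using `read32_data(r)`. Unlike the union, the shifts do not depend on how the union members overlap in memory, although the mapping of r0 to the low half still depends on the target's endianness.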