What is the meaning of "i32 (...)**" in LLVM IR?

I am reading the LLVM IR that Clang++ generates for the following code:

class Shape {
  public:
  // pure virtual function providing interface framework.
  virtual int getArea(char* me) = 0;
  void setWidth(int w) {
    width = w;
  }
  
  void setHeight(int h) {
    height = h;
  }
  
  protected:
  int width;
  int height;
};

// Derived classes
class Rectangle: public Shape {
  public:
  int getArea(char * me) {
    return (width * height);
  }
};

It produces the following LLVM IR:

%class.Rectangle = type { %class.Shape }
%class.Shape = type { i32 (...)**, i32, i32 }

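What is this "i32 (...)**"? What does it do?

Judging by its appearance, "i32 (...)**" looks like a function pointer, but it is used as the target type when bitcasting the object.

Like this: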

define linkonce_odr dso_local void @_ZN9RectangleC2Ev(%class.Rectangle* %0) unnamed_addr #5 comdat align 2 {
  %2 = alloca %class.Rectangle*, align 8
  store %class.Rectangle* %0, %class.Rectangle** %2, align 8
  %3 = load %class.Rectangle*, %class.Rectangle** %2, align 8
  %4 = bitcast %class.Rectangle* %3 to %class.Shape*
  call void @_ZN5ShapeC2Ev(%class.Shape* %4) #3
  %5 = bitcast %class.Rectangle* %3 to i32 (...)***
  store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV9Rectangle, i32 0, inrange i32 0, i32 2) to i32 (...)**), i32 (...)*** %5, align 8
  ret void
}

Let's look at simpler code:

struct A {
  virtual void f1();
  int width;
};

struct B: public A {
  void f1() {};
};

B a;

After compiling it, you get this:

%struct.B = type { %struct.A.base, [4 x i8] }
%struct.A.base = type <{ i32 (...)**, i32 }>

@a = dso_local local_unnamed_addr global %struct.B { %struct.A.base <{ i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1B, i32 0, inrange i32 0, i32 2) to i32 (...)**), i32 0 }>, [4 x i8] zeroinitializer }, align 8
@_ZTV1B = linkonce_odr dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8*, i8* }* @_ZTI1B to i8*), i8* bitcast (void (%struct.B*)* @_ZN1B2f1Ev to i8*)] }, comdat, align 8

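You can see that the i32 (...)** field is initialized with a pointer into _ZTV1B, which demangles to "vtable for B".

The mysterious value stored there is this constant expression: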

getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV1B, i32 0, inrange i32 0, i32 2)

This GEP yields the address of element 2 of the vtable array, which holds _ZN1B2f1Ev; demangled, that is B::f1().
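The same thing can be observed from the C++ side. The sketch below is only an illustration: read_vptr is a hypothetical helper, and it assumes the vptr sits at offset 0 of the object, which is what the IR above shows for this hierarchy but is not guaranteed by the C++ standard. Objects with the same dynamic type should then carry the same vptr value, and objects of different types a different one.

```cpp
#include <cassert>
#include <cstring>

struct A {
  virtual void f1() {}
  int width = 0;
};

struct B : public A {
  void f1() override {}
};

// Hypothetical helper: copies the first pointer-sized bytes of an object.
// Assumption: the vptr is the first field, as in
// %struct.A.base = type <{ i32 (...)**, i32 }>.
template <class T>
const void* read_vptr(const T& obj) {
  const void* vptr = nullptr;
  std::memcpy(&vptr, static_cast<const void*>(&obj), sizeof vptr);
  return vptr;
}
```

Under that assumption, two B objects return the same pointer from read_vptr (the address inside "vtable for B"), while an A object returns a different one.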

I also tried this example:

auto f(B *a) {
    a->f1();
}

The generated code is:

define dso_local void @_Z1fP1B(%struct.B* %0) local_unnamed_addr #0 {
  %2 = bitcast %struct.B* %0 to void (%struct.B*)***
  %3 = load void (%struct.B*)**, void (%struct.B*)*** %2, align 8, !tbaa !3
  %4 = load void (%struct.B*)*, void (%struct.B*)** %3, align 8
  tail call void %4(%struct.B* nonnull align 8 dereferenceable(12) %0)
  ret void
}

As you can see, it simply loads the vptr from the object, loads the function pointer from the first slot, and calls it.
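The load, load, call sequence can be written out by hand in C++. This is a minimal sketch of the idea, not what Clang actually emits: Obj, Slot, rect_vtable, rect_area and call_slot0 are all hypothetical names, and the "vtable" here is an ordinary array of function pointers.

```cpp
#include <cassert>

struct Obj;                    // hypothetical hand-rolled "class"
using Slot = int (*)(Obj*);    // type of one vtable entry

struct Obj {
  const Slot* vptr;  // plays the role of the i32 (...)** field
  int width;
  int height;
};

// Hand-written counterpart of Rectangle::getArea.
int rect_area(Obj* self) { return self->width * self->height; }

// Hand-written table with a single slot.
const Slot rect_vtable[] = { rect_area };

// Mirrors the IR: load the vptr, load slot 0, call through it.
int call_slot0(Obj* o) {
  const Slot* table = o->vptr;  // first load
  Slot fn = table[0];           // second load
  return fn(o);                 // indirect call
}
```

For example, an Obj initialized as { rect_vtable, 3, 4 } makes call_slot0 return 12, without call_slot0 knowing which function it ends up calling.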

P.S.

We currently use i32 (...)** as the type of the vptr field in the LLVM struct type. LLVM's GlobalOpt prefers any bitcasts to be on the side of the data being stored rather than on the pointer being stored to.
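Read as a type, i32 (...) is a variadic function returning a 32-bit integer, so i32 (...)** is a pointer to a pointer to such a function: a generic placeholder for "pointer to a table of function pointers", which is why it gets bitcast to the real signature at each call. A rough C++ spelling of the same type, assuming int is 32 bits wide (AnyFn, VPtr and vptr_size are hypothetical names used only here):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

using AnyFn = int(...);  // corresponds to the IR function type i32 (...)
using VPtr = AnyFn**;    // corresponds to i32 (...)**, the vptr field type

static_assert(std::is_function<AnyFn>::value,
              "int(...) is a function type");
static_assert(std::is_same<VPtr, int (**)(...)>::value,
              "VPtr is a pointer to a pointer to a variadic function");

// Size of the vptr field; it is a plain data pointer.
constexpr std::size_t vptr_size() { return sizeof(VPtr); }
```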