通过指针与数组中的索引链接结构图

Question

这是关于良好做法

的问题

考虑典型的情况，例如在 3D 引擎、物理引擎、Finite element method or classical molecular dynamics 求解器中：您有各种类型的对象（例如顶点、边、面、有界实体体积），它们相互交叉链接（例如顶点知道哪条边与其连接反之亦然）。为了性能和使用方便，这种引擎对于能够快速浏览这种连接的网络至关重要。

问题是：通过数组中的索引或指针指向链接对象更好吗？ ...尤其是性能方面

typedef index_t uint16_t;

class Vertex{
    Vec3 pos;
    #ifdef BY_POINTER
    Edge*   edges[nMaxEdgesPerVertex];
    Face*   faces[nMaxFacesPerVertex];
    #else
    index_t edges[nMaxEdgesPerVertex];
    index_t faces[nMaxFacesPerVertex];
    #endif
}

class Edge{
    Vec3   direction;
    double length;
    #ifdef BY_POINTER
    Vertex* verts[2];
    Faces*  faces[nMaxFacesPerEdge];  
    #else
    index_t verts[2];
    index_t faces[nMaxFacesPerEdge];
    #endif
}

class Face{
    Vec3   normal;
    double isoVal; // Plane equation: normal.dot(test_point)==isoVal
    #ifdef BY_POINTER
    Vertex* verts[nMaxVertsPerFace];
    Edge*   edges[nMaxEdgesPerFace];
    #else
    index_t verts[nMaxVertsPerFace];
    index_t edges[nMaxEdgesPerFace];
    #endif
}

#ifndef BY_POINTER
// we can use other datastructure here, such as std:vector or even some HashMap 
int nVerts,nEdges,nFaces;
Vertex verts[nMaxVerts];
Edge   edges[nMaxEdges];
Vertex faces[nMaxFaces];
#endif

指数优势：

当我们使用 uint8_t 或 uint16_t 作为索引而不是 32 位或 64 位指针时，使用索引可以提高内存效率
索引可以携带一些额外的信息（例如关于边缘的方向）编码在一些位中；
数组中对象的排序可以携带一些关于结构的信息（例如，立方体的顶点可以排序为 {0b000,0b001,0b010,0b011,0b100,0b101,0b110,0b111}）。此信息在指针中不可见

指针的优点：

我们不需要关心数组（或其他数据结构）来存储对象。这些对象可以通过 new Vertex().
May be faster (?)因为不需要加上数组的基地址（？）。但这在内存延迟方面可能可以忽略不计 (?)

Answer 1

对于性能而言，重要的是您可以以热路径中通常执行的任何遍历顺序读取 "next" 元素的速度。

例如，如果您有一系列代表某条路径的边，您会希望它们按照它们出现的顺序连续存储在内存中（而不是对每个边使用 new）已连接。

对于这种情况（边形成路径），显然不需要指针，也不需要索引。连接由存储位置暗示，因此您只需要一个指向第一个和最后一个边缘的指针（即您可以将整个路径存储在 std::vector<Edge>）。

第二个示例说明了我们可以利用的领域知识：假设我们有一款最多支持 8 名玩家的游戏，并且想要存储 "who has visited each of the edges in a path." 同样，我们不需要指针或索引来引用这 8 名玩家。相反，我们可以简单地在每个 Edge 中存储一个 uint8_t 并将这些位用作每个玩家的标志。是的，这是低级别的 bit banging，但是一旦我们有了 Edge*，它就会为我们提供紧凑的存储和高效的查找。但是如果我们需要从另一个方向进行查找，从玩家到 Edges，最有效的方法是存储例如每个玩家内部的 uint32_t 向量，并在 Edge 数组中进行索引。

但是如果可以在路径中间添加和删除边呢？那么我们可能需要一个链表。在这种情况下，我们应该使用一个 侵入式 链表，并将 Edge 分配到一个池中。完成此操作后，我们可以在每个玩家对象中存储指向 Edge 的指针，并且它们永远不会更改或需要更新。我们使用侵入式链表，理解 Edge 只是单个路径的一部分，因此链表指针的外部存储会很浪费（std::list 需要存储指向每个对象的指针; 侵入式列表不会）。

因此，每个案例都必须单独考虑，并尽可能多地了解我们可以提前发现的领域。指针和索引都不应该是第一种方法。

Answer 2

using index can be more memory efficient when we use uint8_t or uint16_t for index instead of 32-bit or 64-bit pointer

没错。具有较小的表示可以减少结构的总大小，从而减少遍历时的缓存未命中。

index can carry some additional information ( e.g. about orientation of edge ) encoded in some bits;

正确。

We don't need to care about the arrays (or other data-structures) to store the objects. The objects can be simply allocated dynamically on the heap by new Vertex().

这正是你不想做的，说到表演。您要确保 Vertex 全部打包，以避免不必要的缓存丢失。在这种情况下，阵列可以使您免受错误的诱惑。您还希望再次按顺序访问它们，至少尽可能多地访问它们以最大程度地减少缓存未命中。

你的数据结构有多少打包、小和顺序访问，才是真正推动性能的因素。

May be faster (?) because it does not need to add the base address of the array (?). But this is probably negligible with respect to memory latency (?)

可能可以忽略不计。可能取决于特定的硬件和|或编译器。

索引的另一个缺失优势：重新分配时更易于管理。考虑一个可以增长的结构，如下所示：

struct VertexList
{
  std::vector<Vertex> vertices;

  Vertex *start; // you can still access using vector if you prefer; start = &vertices[0];
}

如果您使用指针引用给定的顶点，并且发生了重新分配，您最终会得到一个无效的指针。

通过指针与数组中的索引链接结构图

Linking graphs of structures by pointer vs. index in array

c++

performance

3d-engine

graph-traversal