Accessing the common part of a union from a base class

Question

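I have a Result<T> class template that holds a union of some error_type and T. I would like to expose the common part (the error) in a base class without resorting to virtual functions.

Here is my attempt: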

#include <exception>
#include <new>
#include <stdexcept>
#include <type_traits>

using error_type = std::exception_ptr;

struct ResultBase
{
    error_type error() const
    {
        return *reinterpret_cast<const error_type*>(this);
    }

protected:
    ResultBase() { }
};

template <class T>
struct Result : ResultBase
{
    Result() { new (&mError) error_type(); }

    ~Result() { mError.~error_type(); }

    void setError(error_type error) { mError = error; }

private:
    union { error_type mError; T mValue; };
};

static_assert(std::is_standard_layout<Result<int>>::value, "");

void check(bool condition) { if (!condition) std::terminate(); }

void f(const ResultBase& alias, Result<int>& r)
{
    r.setError(std::make_exception_ptr(std::runtime_error("!")));
    check(alias.error() != nullptr);

    r.setError(std::exception_ptr());
    check(alias.error() == nullptr);
}

int main()
{
    Result<int> r;
    f(r, r);
}

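(This is stripped down; see the extended version if anything is unclear.)

The base class relies on standard layout to find the address of the error field at offset zero. It then casts the pointer to error_type, assuming that this is indeed the current dynamic type of the union.

Am I right to assume this is portable? Or does it break some pointer aliasing rule?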

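EDIT: My question was 'is this portable', but many commenters are puzzled by the use of inheritance here, so I will clarify.

First of all, this is a toy example. Please don't take it too literally or assume that the base class has no use.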

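The design has three goals:

1. Compactness. Error and result are mutually exclusive, so they should live in a union.
2. No runtime overhead. Virtual functions are excluded (besides, holding a vtable pointer conflicts with goal 1). RTTI is excluded as well.
3. Uniformity. The common fields of different Result types should be accessible through homogeneous pointers or wrappers. For example, if instead of Result<T> we were talking about Future<T>, it should be possible to call whenAny(FutureBase& a, FutureBase& b) regardless of the concrete types of a and b.

If one is willing to sacrifice (1), this becomes trivial. Something like: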

struct ResultBase
{
    error_type mError;
};

template <class T>
struct Result : ResultBase
{
    std::aligned_storage_t<sizeof(T), alignof(T)> mValue;
};
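To make the cost of giving up goal (1) concrete, the following sketch compares the two layouts with sizeof. ErrorHandle is a hypothetical stand-in for error_type (a single pointer-sized handle, which is an assumption about how an implementation might represent it), and PackedResult, SeparateResult and extraBytes are illustrative names that do not appear in the question.

```cpp
#include <cstddef>

// Hypothetical stand-in for error_type: one pointer-sized handle.
struct ErrorHandle { void* state; };

// Goal (1) kept: error and value share the same storage.
template <class T>
struct PackedResult {
    union { ErrorHandle error; T value; };
};

// Goal (1) sacrificed: error and value are stored side by side.
template <class T>
struct SeparateResult {
    ErrorHandle error;
    T value;
};

// Extra bytes paid per object for keeping both fields separately.
template <class T>
constexpr std::size_t extraBytes() {
    return sizeof(SeparateResult<T>) - sizeof(PackedResult<T>);
}
```

The exact number returned by extraBytes<T>() depends on the platform's sizes and alignment, so no specific value is claimed here.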

If instead of goal (1) we sacrifice (2), it might look like this:

struct ResultBase
{
    virtual error_type error() const = 0;
};

template <class T>
struct Result : ResultBase
{
    error_type error() const override { ... }

    union { error_type mError; T mValue; };
};

Again, the rationale is not relevant. I just want to make sure that the original example is conforming C++11 code.

Answer 1

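Use an abstract base class with two implementations, one for the error and one for the data, both using multiple inheritance, and either RTTI or an is_valid() member to tell at runtime which one it is.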

Answer 2

union {
    error_type mError;
    T mValue;
};

Type T is not guaranteed to work in a union; for example, it may have a non-trivial constructor. Some information about unions and constructors: Initializing a union with a non-trivial constructor
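As a hedged illustration of this point: since C++11 a union may contain a member with a non-trivial constructor, but the corresponding special member functions are then implicitly deleted, so the enclosing class has to construct and destroy the active member by hand. The sketch below assumes a hypothetical TextOrNumber wrapper with a hand-maintained holdsText flag; none of these names come from the question.

```cpp
#include <new>
#include <string>

// Hypothetical tagged wrapper, used only for illustration.
struct TextOrNumber {
    using Text = std::string;

    explicit TextOrNumber(int n) : holdsText(false), number(n) {}

    explicit TextOrNumber(const Text& s) : holdsText(true) {
        new (&text) Text(s);  // start the lifetime of the string member
    }

    // Copying would need to inspect the flag; omitted to keep the sketch short.
    TextOrNumber(const TextOrNumber&) = delete;
    TextOrNumber& operator=(const TextOrNumber&) = delete;

    ~TextOrNumber() {
        if (holdsText) text.~Text();  // destroy only the active non-trivial member
    }

    bool holdsText;  // records which union member is active
    union {
        Text text;
        int number;
    };
};
```

Without the user-provided constructors and destructor, the class would not be default-constructible or destructible, which is the practical form of the restriction this answer refers to.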

Answer 3

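To answer the question: is this portable?

No, it is not even possible.

Details:

This is not possible without at least type erasure (it does not require RTTI/dynamic_cast, but it does require at least one virtual function). A working type-erasure solution already exists (Boost.Any).

Here is why:

You want to instantiate the class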

Result<int> r;

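Instantiating a class template means that the compiler must be able to determine the size of the member variables, so that it can allocate the object on the stack.

But in your implementation: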

private: union { error_type mError; T mValue; };

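You have a variable of type error_type that you seem to want to use polymorphically. However, once you fix the type at template instantiation, you cannot change it later (different types may have different sizes! You could also force yourself to fix the size of the object, but don't do that. It is ugly and hackish).

So you have 2 solutions: use virtual functions, or use error codes.

It is possible to do what you want, but you cannot do it like this: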

Result<int> r; r.setError(...);

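with the exact interface you want.

There are many possible solutions as long as you allow virtual functions and error codes, so why don't you want virtual functions here? If performance matters, keep in mind that the cost of "setting" an error is the same as setting a pointer to a virtual class (if there is no error, the vtable does not need to be resolved, and in any case the vtable in templated code is likely to be optimized away most of the time).

Also, if you don't want to "allocate" error codes, you can pre-allocate them.

You can do something like this: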

template <typename Rtype>
class Result {
    // ... your detail here

    ~Result() {
        if (error)
            delete resultOrError.errorInstance;
        else
            delete resultOrError.resultValue;
    }

private:
    union {
        bool error;
        std::max_align_t mAligner;
    };
    union uif {
        Rtype* resultValue;
        PointerToVirtualErrorHandler errorInstance;
    } resultOrError;
};

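You have either 1 result type, or 1 pointer to a virtual class that carries the desired error. You check the boolean to see whether you currently hold an error or a result, and then you read the corresponding value from the union. You pay the virtual cost only when there is an error; for a regular result the only penalty is the boolean check.

Of course, in the solution above I used a pointer to the result because it allows generic results. If you are only interested in results of basic data types, or POD structs made of basic data types, you can avoid the pointer for the result as well.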
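A minimal sketch of that last variant, assuming the result type is trivial so it can be stored by value next to the error pointer. ErrorHandler, ParseError and PodResult are hypothetical names chosen for illustration; they are not defined anywhere in this answer.

```cpp
#include <string>
#include <type_traits>

// Hypothetical virtual error interface (an assumption, not the answer's type).
struct ErrorHandler {
    virtual ~ErrorHandler() {}
    virtual std::string message() const = 0;
};

struct ParseError : ErrorHandler {
    std::string message() const override { return "parse error"; }
};

template <typename R>
class PodResult {
    static_assert(std::is_trivial<R>::value,
                  "this sketch only covers trivial result types");
public:
    explicit PodResult(R value) : mIsError(false) { mData.value = value; }

    // Takes ownership of a heap-allocated error object.
    explicit PodResult(ErrorHandler* error) : mIsError(true) { mData.error = error; }

    PodResult(const PodResult&) = delete;
    PodResult& operator=(const PodResult&) = delete;

    ~PodResult() { if (mIsError) delete mData.error; }

    bool isError() const { return mIsError; }

    // Only meaningful when isError() is false.
    R value() const { return mData.value; }

    // The virtual call happens on the error path only.
    std::string message() const {
        return mIsError ? mData.error->message() : std::string();
    }

private:
    bool mIsError;
    union Data { R value; ErrorHandler* error; } mData;
};
```

On the success path the caller pays for one boolean check and a plain read from the union; the virtual dispatch is reached only through message() when an error is stored.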

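Note that in your case std::exception_ptr already performs type erasure, but you lose some type information. To recover the missing type information, you could implement something similar to std::exception_ptr yourself, but with enough virtual methods to allow a safe cast to the correct exception type.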
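For comparison, the standard way to get type information back out of a std::exception_ptr is to rethrow it and catch by type. The sketch below shows that approach; describe is a hypothetical helper name, and the set of catch clauses is only an example.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: recovers the dynamic type of a stored exception
// by rethrowing it and catching the types of interest.
std::string describe(const std::exception_ptr& error) {
    if (!error) return "no error";
    try {
        std::rethrow_exception(error);
    } catch (const std::runtime_error& e) {
        return std::string("runtime_error: ") + e.what();
    } catch (const std::exception& e) {
        return std::string("exception: ") + e.what();
    } catch (...) {
        return "unknown exception type";
    }
}
```

This costs a throw and a catch per inspection, which is the overhead a custom error type with virtual methods would be trying to avoid.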

Answer 4

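A common mistake among C++ programmers is to believe that virtual functions lead to higher CPU and memory usage. I call it a mistake even though I know that virtual functions do cost memory and CPU. In most cases, however, a hand-written replacement for the virtual function mechanism is worse.

You have already said how to achieve the goal with virtual functions. Just to repeat: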

class ResultBase
{
public:
    virtual ~ResultBase() {}

    virtual bool hasError() const = 0;

    virtual std::exception_ptr error() const = 0;

protected:
    ResultBase() {}
};

And its implementation:

template <class T>
class Result : public ResultBase
{
public:
    Result(error_type error) { this->construct(error); }
    Result(T value) { this->construct(value); }

    ~Result(); // this does not change
    bool hasError() const override { return mHasError; }
    std::exception_ptr error() const override { return mData.mError; }

    void setError(error_type error); // similar to your original approach
    void setValue(T value); // similar to your original approach
private:
    bool mHasError;
    union Data
    {
        Data() {} // in this way you can use also Non-POD types
        ~Data() {}

        error_type mError;
        T mValue;
    } mData;

    void construct(error_type error)
    {
        mHasError = true;
        new (&mData.mError) error_type(error);
    }
    void construct(T value)
    {
        mHasError = false;
        new (&mData.mValue) T(value);
    }
};

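See the full example here. As you can see, the version with virtual functions is 3 times smaller and 7(!) times faster, so it is not that bad...

Another benefit is that you may end up with a "cleaner" design and no "aliasing"/"aligning" problems.

If you really have some very good reason called compactness (I have no idea what it is), then with this very simple example you can implement virtual functions by hand (but why???!!!). Here you are: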

class ResultBase;
struct ResultBaseVtable
{
    bool (*hasError)(const ResultBase&);
    error_type (*error)(const ResultBase&);
};

class ResultBase
{
public:
    bool hasError() const { return vtable->hasError(*this); }

    std::exception_ptr error() const { return vtable->error(*this); }

protected:
    ResultBase(ResultBaseVtable* vtable) : vtable(vtable) {}
private:
    ResultBaseVtable* vtable;
};

The implementation is the same as in the previous version, with the following differences:

template <class T>
class Result : public ResultBase
{
public:
    Result(error_type error) : ResultBase(&Result<T>::vtable)
    {
        this->construct(error);
    }
    Result(T value) : ResultBase(&Result<T>::vtable)
    {
        this->construct(value);
    }

private:
    static bool hasErrorVTable(const ResultBase& result)
    {
        return static_cast<const Result&>(result).hasError();
    }
    static error_type errorVTable(const ResultBase& result)
    {
        return static_cast<const Result&>(result).error();
    }
    static ResultBaseVtable vtable;
};

template <typename T>
ResultBaseVtable Result<T>::vtable{
    &Result<T>::hasErrorVTable, 
    &Result<T>::errorVTable,    
};

The version above is identical to the "virtual" implementation in terms of CPU/memory usage (surprise)...

Answer 5

Here is my own attempt at an answer, focusing strictly on portability.

Standard layout is defined in §9 [class]/7:

A standard-layout class is a class that:

has no non-static data members of type non-standard-layout class (or array of such types) or reference,

has no virtual functions (10.3) and no virtual base classes (10.1),

has the same access control (Clause 11) for all non-static data members,

has no non-standard-layout base classes,

either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and

has no base classes of the same type as the first non-static data member.

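By this definition, Result<T> is standard layout provided that:

Both error_type and T are standard layout. Note that this is not guaranteed for std::exception_ptr, though it is likely in practice.
T is not ResultBase.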
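These two preconditions can be checked at compile time. The following is a minimal sketch, assuming a hypothetical helper meetsLayoutPreconditions and two illustrative test types (ErrorCode and Polymorphic) that are not part of the question; ResultBase is redeclared here only so that the block is self-contained.

```cpp
#include <type_traits>

struct ResultBase {};  // empty base, standing in for the question's ResultBase

// True when the formal preconditions hold for error type E and value type T.
template <class E, class T>
constexpr bool meetsLayoutPreconditions() {
    return std::is_standard_layout<E>::value
        && std::is_standard_layout<T>::value
        && !std::is_same<typename std::remove_cv<T>::type, ResultBase>::value;
}

// Hypothetical types used to exercise the check.
struct ErrorCode { int code; };                    // standard layout
struct Polymorphic { virtual ~Polymorphic() {} };  // not standard layout
```

In the question's code, a static_assert on such a helper inside Result<T> would reject unsuitable T at instantiation time rather than relying on the single assertion for Result<int>.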

§9.2 [class.mem]/20 states:

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. —end note ]

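This implies that the empty base class optimization is mandatory for standard-layout types. Assuming Result<T> does have standard layout, this in ResultBase is guaranteed to point to the first field in Result<T>.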
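The address relationship can be observed directly. This is a small sketch with hypothetical types EmptyBase and Derived; the first function reflects the quoted paragraph, while the second relies on the additional assumption made in this answer that the empty base subobject shares the object's address.

```cpp
#include <type_traits>

struct EmptyBase {};
struct Derived : EmptyBase { int first; double second; };

static_assert(std::is_standard_layout<Derived>::value,
              "Derived must be standard layout");

// The object and its initial member share an address ([class.mem]/20).
bool initialMemberAtObjectAddress(const Derived& d) {
    return reinterpret_cast<const int*>(&d) == &d.first;
}

// Assumption from this answer: the empty base subobject is placed at the
// same address as the complete object.
bool emptyBaseAtObjectAddress(const Derived& d) {
    const EmptyBase* base = &d;
    return static_cast<const void*>(base) == static_cast<const void*>(&d);
}
```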

§9.5 [class.union]/1 states:

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...] Each non-static data member is allocated as if it were the sole member of a struct.

Also, §3.10 [basic.lval]/10:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined

the dynamic type of the object,

a cv-qualified version of the dynamic type of the object,

a type similar (as defined in 4.4) to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to the dynamic type of the object,

a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

a char or unsigned char type.

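This guarantees that reinterpret_cast<const error_type*>(this) will yield a valid pointer to the mError field.

All controversy aside, this technique looks portable. Just keep the formal limitations in mind: error_type and T must be standard layout, and T may not be of type ResultBase.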
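A compact sketch of the technique with those limitations enforced, under the reading of the standard given in this answer. It swaps std::exception_ptr for a hypothetical ErrorCode struct that is guaranteed to be standard layout; SlotBase and Slot are illustrative names, and the error member is assumed to be the active union member at all times.

```cpp
#include <new>
#include <type_traits>

struct ErrorCode { int code; };  // hypothetical standard-layout error type

struct SlotBase {
    // Reads the error through the address of the complete object,
    // relying on the error being the initial member of the derived class.
    int errorCode() const {
        return reinterpret_cast<const ErrorCode*>(this)->code;
    }
protected:
    SlotBase() {}
};

template <class T>
struct Slot : SlotBase {
    static_assert(std::is_standard_layout<T>::value, "T must be standard layout");
    static_assert(!std::is_same<T, SlotBase>::value, "T must not be the base type");

    Slot() { new (&mError) ErrorCode{0}; }  // mError is the active member
    void setError(int code) { mError.code = code; }

private:
    union { ErrorCode mError; T mValue; };
};

static_assert(std::is_standard_layout<Slot<double>>::value,
              "Slot<double> must be standard layout");
```

The sketch never activates mValue, so it says nothing about what the base class may read while the value member is active; that case still needs a separate discriminator.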

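Side note: on most compilers (at least GCC, Clang and MSVC), non-standard-layout types will work as well. As long as Result<T> has a predictable layout, the error and result types do not matter.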


c++

strict-aliasing

c++11