How does boost::serialization allocate memory when deserializing through a pointer?

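In short, I'd like to know how boost::serialization allocates memory for an object when deserializing through a pointer. Below is an example that illustrates my question, together with the accompanying code. The code should be fully functional and compile cleanly; there are no errors as such, just a question about how the code actually works.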

#include <cstddef> // NULL
#include <iomanip>
#include <iostream>
#include <fstream>
#include <string>

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>

class non_default_constructor; // Forward declaration for boost serialization namespacing below


// In order to "teach" boost how to save and load your class with a non-default-constructor, you must override these functions
// in the boost::serialization namespace. Prototype them here.
namespace boost { namespace serialization {
    template<class Archive>
    inline void save_construct_data(Archive& ar, const non_default_constructor* ndc, const unsigned int version);
    template<class Archive>
    inline void load_construct_data(Archive& ar, non_default_constructor* ndc, const unsigned int version);
}}

// Here is the actual class definition with no default constructor
class non_default_constructor
{
public:
    explicit non_default_constructor(std::string initial)
    : some_initial_value{initial}, state{0}
    {

    }

    std::string get_initial_value() const { return some_initial_value; } // For save_construct_data

private:
    std::string some_initial_value;
    int state;

    // Notice that we only serialize state here, not the
    // some_initial_value passed into the ctor
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive& ar, const unsigned int version)
    {
        std::cout << "serialize called" << std::endl;
        ar & state;
    }
};

// Define the save and load overrides here.
namespace boost { namespace serialization {
    template<class Archive>
    inline void save_construct_data(Archive& ar, const non_default_constructor* ndc, const unsigned int version)
    {
        std::cout << "save_construct_data called." << std::endl;
        ar << ndc->get_initial_value();
    }
    template<class Archive>
    inline void load_construct_data(Archive& ar, non_default_constructor* ndc, const unsigned int version)
    {
        std::cout << "load_construct_data called." << std::endl;
        std::string some_initial_value;
        ar >> some_initial_value;

        // Use placement new to construct a non_default_constructor class at the address of ndc
        ::new(ndc)non_default_constructor(some_initial_value);
    }
}}


int main(int argc, char *argv[])
{

    // Now let's say that we want to save and load a non_default_constructor class through a pointer.

    non_default_constructor* my_non_default_constructor = new non_default_constructor{"initial value"};

    std::ofstream outputStream("non_default_constructor.dat");
    boost::archive::text_oarchive outputArchive(outputStream);
    outputArchive << my_non_default_constructor;

    outputStream.close();

    // The above is all fine and dandy. We've serialized an object through a pointer.
    // non_default_constructor will call save_construct_data then will call serialize()

    // The output archive file will look exactly like this:

    /*
        22 serialization::archive 17 0 1 0
        0 13 initial value 0
    */


    /*If I want to load that class back into an object at a later time
    I'd declare a pointer to a non_default_constructor */
    non_default_constructor* load_from_archive;

    // Notice load_from_archive was not initialized with any value. It doesn't make
    // sense to initialize it with a value, because we're trying to load from
    // a file, not create a whole new object with "new".

    std::ifstream inputStream("non_default_constructor.dat");
    boost::archive::text_iarchive inputArchive(inputStream);

    // <><><> HERE IS WHERE I'M CONFUSED <><><>
    inputArchive >> load_from_archive;

    // The above should call load_construct_data which will attempt to
    // construct a non_default_constructor object at the address of
    // load_from_archive, but HOW DOES IT KNOW HOW MUCH MEMORY A NON_DEFAULT_CONSTRUCTOR
    // class uses?? Placement new just constructs at the address, assuming
    // memory at the passed address has been allocated for construction.

    // So my question is this:
    // I want to verify that *something* is (or isn't) allocating memory for a non_default_constructor
    // class to be constructed at the address of load_from_archive.

    std::cout << load_from_archive->get_initial_value() << std::endl; // This works.

    return 0;

}

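According to the boost::serialization documentation, to (de)serialize a class with a non-default constructor you use load/save_construct_data. However, I don't actually see where memory is allocated for the object being loaded, only where placement new constructs the object at a memory address. So what allocated the memory at that address?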

It may be that I misunderstand how this line works:

::new(ndc)non_default_constructor(some_initial_value);

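But I'd like to know where my misunderstanding lies. This is my first question, so I apologize if I've made a mistake in the way I asked it. Thank you for your time.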

That's a nice example program, with very apt comments. Let's dig in.

// In order to "teach" boost how to save and load your class with a
// non-default-constructor, you must override these functions in the
// boost::serialization namespace. Prototype them here.

You don't have to. Apart from the in-class option, any overload (not override) that is reachable via ADL (argument-dependent lookup) suffices.
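As a rough illustration of why an ADL-reachable overload is enough, here is a minimal sketch that does not use Boost at all. The namespaces `shapes` and `toy_archive` and the function `construct_hint` are hypothetical names invented for this example; the only point is that an unqualified call inside a library template also searches the namespace of the argument's type.

```cpp
#include <cassert>
#include <string>

namespace shapes { // hypothetical user namespace
    struct circle { int radius; };

    // A plain overload declared next to the type it serves.
    inline std::string construct_hint(const circle* c) {
        return "circle r=" + std::to_string(c->radius);
    }
}

namespace toy_archive { // hypothetical stand-in for a serialization library
    // Fallback used when no better overload is found.
    template <class T>
    std::string construct_hint(const T*) { return "default"; }

    template <class T>
    std::string describe(const T* p) {
        assert(p != nullptr);
        // Unqualified call: ADL adds overloads from T's namespace to the
        // candidate set, and the non-template shapes:: overload wins.
        return construct_hint(p);
    }
}
```

Calling `toy_archive::describe` with a `shapes::circle*` picks the overload in `shapes`, while a pointer to a built-in type such as `int` falls back to the generic template, because built-in types have no associated namespace.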

Skipping straight to the point:

// So my question is this: I want to verify that *something* is (or isn't)
// allocating memory for a non_default_constructor
// class to be constructed at the address of load_from_archive.

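Yes, something is allocating. The documentation states this. It is a bit trickier than that, though, because the allocation is conditional. The reason is object tracking: if we serialize multiple pointers to the same object, the object itself is serialized only once.

On deserialization, the object is represented in the archive stream by its object tracking ID. Only the first occurrence of that ID leads to an allocation.

See the documentation.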
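To make the "only the first instance allocates" idea concrete, here is a heavily simplified model of tracking on load. It is not Boost's actual data structure: `toy_tracker`, `load_pointer` and the use of `std::string` as the payload type are assumptions made purely for illustration, and unlike Boost this sketch keeps ownership of the objects itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical model: maps a tracking ID to the address restored for it.
struct toy_tracker {
    std::map<unsigned, std::string*> seen;              // ID -> already restored object
    std::vector<std::unique_ptr<std::string>> storage;  // owns the objects (sketch only)
    std::size_t allocations = 0;

    std::string* load_pointer(unsigned object_id, const std::string& payload) {
        auto it = seen.find(object_id);
        if (it != seen.end())
            return it->second;  // repeated ID: reuse the address, no allocation

        // First time this ID is seen: allocate and construct exactly once.
        storage.push_back(std::make_unique<std::string>(payload));
        ++allocations;
        std::string* fresh = storage.back().get();
        assert(fresh != nullptr);
        seen[object_id] = fresh;
        return fresh;
    }
};
```

Loading the same ID ten times yields ten equal pointers and a single allocation, which mirrors the behaviour described above.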


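Here's a simplified counter-example that:

  • demonstrates ADL
  • demonstrates object tracking
  • removes all forward declarations (they are unnecessary because of the template point of instantiation, POI)

It serializes a vector holding 10 copies of the pointer. I used unique_ptr to avoid leaking the instances (both the one created manually in main and the one created by deserialization).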

Live On Coliru

#include <cassert>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/vector.hpp>

namespace mylib {
    // Here is the actual class definition with no default constructor
    class non_default_constructor {
      public:
        explicit non_default_constructor(std::string initial)
                : some_initial_value{ initial }, state{ 0 } {}

        std::string get_initial_value() const {
            return some_initial_value;
        } // For save_construct_data

      private:
        std::string some_initial_value;
        int state;

        // Notice that we only serialize state here, not the some_initial_value
        // passed into the ctor
        friend class boost::serialization::access;
        template <class Archive> void serialize(Archive& ar, unsigned) {
            std::cout << "serialize called" << std::endl;
            ar& state;
        }
    };

    // Define the save and load overloads here.
    template<class Archive>
    inline void save_construct_data(Archive& ar, const non_default_constructor* ndc, unsigned)
    {
        std::cout << "save_construct_data called." << std::endl;
        ar << ndc->get_initial_value();
    }
    template<class Archive>
    inline void load_construct_data(Archive& ar, non_default_constructor* ndc, unsigned)
    {
        std::cout << "load_construct_data called." << std::endl;
        std::string some_initial_value;
        ar >> some_initial_value;

        // Use placement new to construct a non_default_constructor class at the address of ndc
        ::new(ndc)non_default_constructor(some_initial_value);
    }
}

int main() {
    using NDC = mylib::non_default_constructor;
    auto owned = std::make_unique<NDC>("initial value");

    {
        std::ofstream outputStream("vector.dat");
        boost::archive::text_oarchive outputArchive(outputStream);

        // serialize 10 copies, for fun
        std::vector v(10, owned.get());
        outputArchive << v;
    }

    /*
        22 serialization::archive 17 0 0 10 0 1 1 0
        0 13 initial value 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
    */

    std::vector<NDC*> restore;

    {
        std::ifstream inputStream("vector.dat");
        boost::archive::text_iarchive inputArchive(inputStream);

        inputArchive >> restore;
    }

    std::unique_ptr<NDC> take_ownership(restore.front());
    for (auto& el : restore) {
        assert(el == take_ownership.get());
    }

    std::cout << "restored: " << restore.size() << " copies with " << 
        std::quoted(take_ownership->get_initial_value()) << "\n";
}

Prints

save_construct_data called.
serialize called
load_construct_data called.
serialize called
restored: 10 copies with "initial value"

The vector.dat file contains:

22 serialization::archive 17 0 0 10 0 1 1 0
0 13 initial value 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

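Library Internals

You shouldn't really need to care, but you can of course read the source code. Predictably, it is far more involved than you would naively expect; after all, this is C++.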

The library handles types that overload operator new. In that case it calls T::operator new instead of the global operator new. As you correctly surmised, it always passes sizeof(T).
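The two-step pattern (obtain raw storage of `sizeof(T)`, then construct into it with placement new) can be sketched without Boost as follows. The type `pooled`, the counter `pooled_new_calls` and the helper `allocate_then_construct` are hypothetical names for this example only. The sketch also shows one reason the `::new(ndc)` spelling matters: once a class declares its own operator new, an unqualified placement new no longer finds the global placement form.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

static int pooled_new_calls = 0; // counts calls to the class-specific operator new

struct pooled { // hypothetical type with its own allocation functions
    std::string name;
    explicit pooled(std::string n) : name(std::move(n)) {}

    static void* operator new(std::size_t size) {
        ++pooled_new_calls;
        return ::operator new(size);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

// Step 1: obtain raw storage of sizeof(pooled) from the class's operator new.
// Step 2: construct in that storage with global placement new.
inline pooled* allocate_then_construct(std::string name) {
    void* raw = pooled::operator new(sizeof(pooled)); // no constructor runs here
    assert(raw != nullptr);
    try {
        // `::new` is required: pooled declares operator new, which hides the
        // global placement form from an unqualified `new (raw) pooled(...)`.
        return ::new (raw) pooled(std::move(name));
    } catch (...) {
        pooled::operator delete(raw); // construction failed: give storage back
        throw;
    }
}
```

In the question's program, load_construct_data plays the role of step 2 only; step 1 has already happened inside the library by the time it is called.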

The code lives in an exception-safe wrapper in detail/iserializer.hpp:

struct heap_allocation {
    explicit heap_allocation() { m_p = invoke_new(); }
    ~heap_allocation() {
        if (0 != m_p)
            invoke_delete(m_p);
    }
    T* get() const { return m_p; }

    T* release() {
        T* p = m_p;
        m_p = 0;
        return p;
    }

  private:
    T* m_p;
};

Yes, this code could be simplified with C++11 or later. Also, the NULL guard in the destructor is redundant for conforming implementations of operator delete.
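As one possible shape such a simplification could take, here is a sketch built on std::unique_ptr with a custom deleter. This is not Boost code: `raw_storage_deleter`, `construct_guarded` and the demo type `non_empty_name` are hypothetical, and the sketch only uses the global operator new, ignoring class-specific overloads.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <stdexcept>
#include <string>
#include <utility>

// Frees raw storage without running a destructor, because the guard only
// ever owns memory in which no object has been fully constructed.
struct raw_storage_deleter {
    void operator()(void* p) const noexcept { ::operator delete(p); }
};

// Hypothetical C++11 counterpart of heap_allocation (global operator new only).
template <class T, class... Args>
T* construct_guarded(Args&&... args) {
    std::unique_ptr<void, raw_storage_deleter> guard(::operator new(sizeof(T)));
    // If the constructor throws, guard's destructor releases the storage.
    T* obj = ::new (guard.get()) T(std::forward<Args>(args)...);
    (void)guard.release(); // success: ownership passes to the caller
    return obj;
}

// Demo type whose constructor can fail.
struct non_empty_name {
    std::string value;
    explicit non_empty_name(std::string v) : value(std::move(v)) {
        if (value.empty()) throw std::invalid_argument("empty name");
    }
};
```

The guard mirrors get() and release() from heap_allocation: the storage is freed automatically if construction throws, and handed to the caller otherwise. No explicit null check is needed, since unique_ptr skips the deleter for a null pointer.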

Then, of course, there are invoke_new and invoke_delete. In condensed form:

    static T* invoke_new() {
        typedef typename mpl::eval_if<boost::has_new_operator<T>,
                mpl::identity<has_new_operator>,
                mpl::identity<doesnt_have_new_operator>>::type typex;
        return typex::invoke_new();
    }
    static void invoke_delete(T* t) {
        typedef typename mpl::eval_if<boost::has_new_operator<T>,
                mpl::identity<has_new_operator>,
                mpl::identity<doesnt_have_new_operator>>::type typex;
        typex::invoke_delete(t);
    }
    struct has_new_operator {
        static T* invoke_new() { return static_cast<T*>((T::operator new)(sizeof(T))); }
        static void invoke_delete(T* t) { (operator delete)(t); }
    };
    struct doesnt_have_new_operator {
        static T* invoke_new() { return static_cast<T*>(operator new(sizeof(T))); }
        static void invoke_delete(T* t) { (operator delete)(t); }
    };

There is some conditional compilation and there are lengthy comments, so consult the source code if you want the full picture.
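For readers unfamiliar with the mpl::eval_if dispatch, the same compile-time choice between a class-specific and the global operator new can be approximated in C++11 with a detection trait and tag dispatch. Everything here (`has_class_new`, `allocate_impl`, `allocate_for`, `plain_type`, `arena_type`) is a hypothetical sketch, not the library's implementation, and it makes no claim about how boost::has_new_operator is actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical detection trait: true when T::operator new(std::size_t) is callable.
template <class T, class = void>
struct has_class_new : std::false_type {};

template <class T>
struct has_class_new<T, decltype(static_cast<void>(T::operator new(sizeof(T))))>
    : std::true_type {};

// Tag dispatch: one overload per outcome of the trait.
template <class T>
void* allocate_impl(std::true_type) { return T::operator new(sizeof(T)); }

template <class T>
void* allocate_impl(std::false_type) { return ::operator new(sizeof(T)); }

template <class T>
void* allocate_for() { return allocate_impl<T>(has_class_new<T>()); }

static int arena_new_calls = 0; // counts calls to arena_type::operator new

struct plain_type { int x; }; // no class-specific operator new

struct arena_type { // declares its own allocation functions
    int x;
    static void* operator new(std::size_t n) {
        ++arena_new_calls;
        return ::operator new(n);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

static_assert(!has_class_new<plain_type>::value, "plain_type uses global new");
static_assert(has_class_new<arena_type>::value, "arena_type has its own new");
```

Either way the size argument is sizeof(T), which is how the library knows how much memory the object needs before load_construct_data is invoked.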