Recursive insert of a chain into memory fails

This might be a long question, but I hope someone can help me figure out what is going wrong.

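I am inserting a JSON object into allocated memory using my own datatype, which basically holds a union with the data and a ptrdiff_t offset to the next datatype, counted in 8-bit steps (bytes).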

template <typename T>
class BaseType
{
public:
    BaseType();
    explicit BaseType(T& t);
    explicit BaseType(const T& t);

    ~BaseType();
    inline void setNext(const std::ptrdiff_t& next);
    inline std::ptrdiff_t getNext();
    inline void setData(T& t);
    inline void setData(const T& t);
    inline T getData() const;

protected:
    union DataUnion
    {
        T data;
        ::std::ptrdiff_t size;

        DataUnion()
        {
            memset(this, 0, sizeof(DataUnion));
        } //init with 0
        explicit DataUnion(T& t);
        explicit DataUnion(const T& t);
    } m_data;

    long long m_next;
};

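The implementation is straightforward, so there is nothing special about it: it just sets and gets the defined values. (I will skip the implementation here.)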
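To make the layout easier to picture, here is a minimal sketch of such an offset-linked chain. `Node`, `follow` and `secondValueDemo` are made-up names for illustration, and `dist` is only an assumption about what the real helper does (the byte distance between two positions in the same page); the actual implementation is not shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for BaseType<T>: a payload plus a byte offset to the next node.
template <typename T>
struct Node
{
    explicit Node(const T& t) : data(t), next(0) {}
    T data;
    std::ptrdiff_t next; // distance in bytes to the next node, 0 marks the end of the chain
};

// Assumed behaviour of dist(): byte distance between two positions in the same buffer.
inline std::ptrdiff_t dist(const void* from, const void* to)
{
    assert(from != nullptr && to != nullptr);
    return static_cast<const char*>(to) - static_cast<const char*>(from);
}

// Follow the offset stored in a node; returns nullptr at the end of the chain.
template <typename T>
void* follow(Node<T>* node)
{
    return node->next == 0 ? nullptr : reinterpret_cast<char*>(node) + node->next;
}

// Builds a two-node chain in a local buffer and reads the second value back
// through the stored offset.
inline double secondValueDemo()
{
    alignas(std::max_align_t) unsigned char page[64];
    auto* first = new (page) Node<long long>(25);
    auto* second = new (page + sizeof(Node<long long>)) Node<double>(23.23);
    first->next = dist(first, second);
    return static_cast<Node<double>*>(follow(first))->data;
}
```

Because only relative offsets are stored, the whole page can be moved or mapped at a different address without invalidating the chain, which is presumably the reason for using `ptrdiff_t` instead of raw pointers.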

So here is the code where the trouble starts:

std::pair<void*, void*> Page::insertObject(const rapidjson::GenericValue<rapidjson::UTF8<>>& value,
         BaseType<size_t>* last)
 {
     //return ptr to the first element
     void* l_ret = nullptr;
     //prev element ptr
     BaseType<size_t>* l_prev = last;

     //position pointer
     void* l_pos = nullptr;
     //get the members
     for (auto it = value.MemberBegin(); it != value.MemberEnd(); ++it)
     {
         switch (it->value.GetType())
         {
             case rapidjson::kNullType:
                 LOG_WARN << "null type: " << it->name.GetString();
                 continue;

             case rapidjson::kFalseType:
             case rapidjson::kTrueType:
                 {
                     l_pos = find(sizeof(BaseType<bool>));

                     void* l_new = new (l_pos) BaseType<bool>(it->value.GetBool());

                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;
             case rapidjson::kObjectType:
                 {
                     //pos for the obj id
                     //and insert the ID of the obj
                     l_pos = find(sizeof(BaseType<size_t>));
                     std::string name = it->name.GetString();
                     void* l_new = new (l_pos) BaseType<size_t>(common::FNVHash()(name));

                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                     //TODO something strange happens here!

                     // pass the objid Object to the insertobj!
                     // now recursive insert the obj
                     // the second contains the last element inserted
                     // l_pos current contains the last inserted element and get set to the
                     // last element of the obj we insert
                     l_pos = (insertObject(it->value, reinterpret_cast<BaseType<size_t>*>(l_new)).second);
                 }
                 break;

             case rapidjson::kArrayType:
                 {//skip this at the moment till the bug is fixed
                 }
                 break;

             case rapidjson::kStringType:
                 {
                     // find pos where the string fits
                     // somehow we get here sometimes and it does not fit!
                     // which cant be since we lock the whole page
                     l_pos = find(sizeof(StringType) + strlen(it->value.GetString()));

                     //add the String Type at the pos of the FreeType
                     auto* l_new = new (l_pos) StringType(it->value.GetString());
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;

             case rapidjson::kNumberType:
                 {
                     //long long and double have the same size on x64, so one size fits both
                     //find a position where the number fits
                     l_pos = find(sizeof(BaseType<long long>));

                     void* l_new;
                     if (it->value.IsInt())
                     {
                         //insert INT
                         l_new = new (l_pos) BaseType<long long>(it->value.GetInt64());
                     }
                     else
                     {
                         //INSERT DOUBLE
                         l_new = new (l_pos) BaseType<double>(it->value.GetDouble());
                     }
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;
             default:
                 LOG_WARN << "Unknown member Type: " << it->name.GetString() << ":" << it->value.GetType();
                 continue;
         }
         //so first element is set now, store it to return it.
         if(l_ret == nullptr)
         {
             l_ret = l_pos;
         }
         //prev is the l_pos now so cast it to this;
         l_prev = reinterpret_cast<BaseType<size_t>*>(l_pos);
     }
     //if we get here its in!
     return{ l_ret, l_pos };
 }

I start the insert like this:

auto firstElementPos = insertObject(value.MemberBegin()->value, nullptr).first;

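Here value.MemberBegin()->value is the object to be inserted and ->name holds the name of the object. In the examples below the name is Person and the value is everything between the {}.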

The problem is this: if I insert a JSON object that contains a single inner object, like this:

"Person":
{
    "age":25,
    "double": 23.23,
    "boolean": true,
    "double2": 23.23,
    "firstInnerObj":{
        "innerDoub": 12.12
    }   
}

it works fine and I can reproduce the object. But if I have more inner objects, like this:

"Person":
{
    "age":25,
    "double": 23.23,
    "boolean": true,
    "double2": 23.23,
    "firstInnerObj":{
        "innerDoub": 12.12
    },
    "secondInnerObj":{
        "secInnerDoub": 12.12
    }
}

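it fails and I lose data, so I think my recursion goes wrong somewhere, but I do not see why. If you need any more information, let me know. Take a look here, and at the client here.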

The test.json needs to contain a JSON object like the ones above, and the find only needs to contain {"oid__":2} to get the second inserted object.


I could track the problem down to the point in the code where I recursively rebuild the object. Some next pointers seem to be incorrect:

    void* Page::buildObject(const size_t& hash, void* start, rapidjson::Value& l_obj,
                            rapidjson::MemoryPoolAllocator<>& aloc)
    {
        //get the meta information of the object type
        //to build it
        auto& l_metaIdx = meta::MetaIndex::getInstance();
        //get the meta dataset
        auto& l_meta = l_metaIdx[hash];

        //now we are already in an object here with l_obj!
        auto l_ptr = start;
        for (auto it = l_meta->begin(); it != l_meta->end(); ++it)
        {
            //create the name value
            rapidjson::Value l_name(it->name.c_str(), it->name.length(), aloc);
            //create the value we are going to add
            rapidjson::Value l_value;
            //now start building it up again
            switch (it->type)
            {
                case meta::OBJECT:
                    {
                        auto l_data = static_cast<BaseType<size_t>*>(l_ptr);
                        //get the hash to obtain the metadata
                        auto l_hash = l_data->getData();
                        //set to object and create the inner object
                        l_value.SetObject();

                        //get the start pointer which is the "next" element
                        //and call recursive
                        l_ptr = static_cast<BaseType<size_t>*>(buildObject(l_hash,
                                                               (reinterpret_cast<char*>(l_data) + l_data->getNext()), l_value, aloc));
                    }
                    break;
                case meta::ARRAY:
                    {
                        l_value.SetArray();
                        auto l_data = static_cast<ArrayType*>(l_ptr);
                        //get the number of elements stored in the array
                        auto l_size = l_data->size();
                        l_ptr = buildArray(l_size, static_cast<char*>(l_ptr) + l_data->getNext(), l_value, aloc);
                    }
                    break;
                case meta::INT:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<long long>*>(l_ptr);
                        //with length attribute it's faster ;)
                        l_value = l_data->getData();
                    }
                    break;
                case meta::DOUBLE:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<double>*>(l_ptr);
                        //with length attribute it's faster ;)
                        l_value = l_data->getData();
                    }
                    break;
                case meta::STRING:
                    {
                        //create the data
                        auto l_data = static_cast<StringType*>(l_ptr);
                        //with length attribute it's faster
                        l_value.SetString(l_data->getString()->c_str(), l_data->getString()->length(), aloc);
                    }
                    break;
                case meta::BOOL:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<bool>*>(l_ptr);
                        l_value = l_data->getData();
                    }
                    break;
                default:
                    break;
            }
            l_obj.AddMember(l_name, l_value, aloc);
            //update the lptr
            l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();
        }
        //return l_ptr, which currently points to the next element (see the line above)
        return l_ptr;
    }
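One way to check whether the next offsets are plausible is to walk the chain and verify that every offset stays inside the page. The following is only a sketch of that idea: `countReachable` and `demoCount` are hypothetical helpers, and the assumed node layout (a `std::ptrdiff_t` next offset at a fixed position inside each node, with 0 terminating the chain) is a simplification rather than the layout of the real classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: walk a chain of byte offsets inside a page buffer and
// count the reachable nodes. Returns -1 if a node or an offset leaves the page
// or if the chain appears to loop.
// Assumption: every node stores its next offset as std::ptrdiff_t at
// nextFieldOffset bytes from the node start, and 0 terminates the chain.
inline long countReachable(const unsigned char* page, std::size_t pageSize,
                           std::size_t start, std::size_t nextFieldOffset)
{
    assert(page != nullptr);
    long count = 0;
    std::size_t pos = start;
    while (true)
    {
        if (pos + nextFieldOffset + sizeof(std::ptrdiff_t) > pageSize)
            return -1; // the node header does not fit into the page
        ++count;
        if (count > static_cast<long>(pageSize))
            return -1; // more nodes than bytes: the chain must contain a loop
        std::ptrdiff_t next = 0;
        std::memcpy(&next, page + pos + nextFieldOffset, sizeof(next));
        if (next == 0)
            return count; // regular end of the chain
        const std::ptrdiff_t target = static_cast<std::ptrdiff_t>(pos) + next;
        if (target < 0 || static_cast<std::size_t>(target) >= pageSize)
            return -1; // the offset points outside the page
        pos = static_cast<std::size_t>(target);
    }
}

// Builds three nodes of the form [payload][next] and optionally corrupts the
// offset stored in the second node.
inline long demoCount(bool corrupt)
{
    const std::size_t slot = 2 * sizeof(std::ptrdiff_t);
    unsigned char page[6 * sizeof(std::ptrdiff_t)] = {};
    std::ptrdiff_t step = static_cast<std::ptrdiff_t>(slot);
    std::memcpy(page + sizeof(std::ptrdiff_t), &step, sizeof(step)); // node 0 -> node 1
    if (corrupt)
        step = 1000; // deliberately out of bounds
    std::memcpy(page + slot + sizeof(std::ptrdiff_t), &step, sizeof(step)); // node 1 -> node 2
    return countReachable(page, sizeof(page), 0, sizeof(std::ptrdiff_t));
}
```

A check like this only catches offsets that leave the page or loop. An offset that lands on a valid but wrong element stays undetected, so comparing the node count against the expected number of members is still useful.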

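After hours of debugging I found the small issue that caused this. The method that rebuilds the object after the insert returns a pointer to the ->next of the element that was actually inserted last, and after the switch case I called ->next again. That causes the data loss, because it skips one element of the singly linked list.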

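The fix is to apply the line

l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();

only in the switch cases that are not an object or an array. Fix Commit. This actually also fixed the problem with inserting arrays.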
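The effect of the extra advance can be reproduced with a much smaller model. The sketch below is not the real `buildObject`: `Element`, `build` and `visitPerson` are hypothetical names, the chain is modelled as a flat vector in insertion order, and indices stand in for the byte offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of the stored chain: elements are laid out in
// insertion order and an Object element is followed directly by its members.
enum class Kind { Value, Object };

struct Element
{
    Kind kind;
    std::size_t members; // number of direct members, only used for Kind::Object
};

// Visits `count` members starting at index `pos` and returns the index of the
// element that follows the last visited one. `visited` counts every element read.
// With advanceAfterContainer == true the loop repeats the original bug.
inline std::size_t build(const std::vector<Element>& chain, std::size_t pos,
                         std::size_t count, bool advanceAfterContainer,
                         std::size_t& visited)
{
    assert(pos <= chain.size());
    for (std::size_t i = 0; i < count && pos < chain.size(); ++i)
    {
        ++visited;
        if (chain[pos].kind == Kind::Object)
        {
            // The recursive call already returns the position after the inner object.
            pos = build(chain, pos + 1, chain[pos].members, advanceAfterContainer, visited);
            if (advanceAfterContainer)
                ++pos; // bug: skips one element of the chain
        }
        else
        {
            ++pos; // plain values advance exactly once
        }
    }
    return pos;
}

// Models the Person examples: four plain values followed by `innerObjects`
// objects with one member each. Returns how many elements were visited.
inline std::size_t visitPerson(std::size_t innerObjects, bool advanceAfterContainer)
{
    std::vector<Element> chain(4, Element{Kind::Value, 0});
    for (std::size_t i = 0; i < innerObjects; ++i)
    {
        chain.push_back(Element{Kind::Object, 1});
        chain.push_back(Element{Kind::Value, 0});
    }
    std::size_t visited = 0;
    build(chain, 0, 4 + innerObjects, advanceAfterContainer, visited);
    return visited;
}
```

In this model, one inner object at the end of the chain is read completely with or without the extra advance, because nothing follows it. With two inner objects the buggy variant visits only 7 of the 8 elements, which matches the behaviour described in the question.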

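Of course, nobody here could have known the real problem without digging deep into the code, but I still wanted to show the fix. Thanks to @sehe for helping to figure out what was going wrong.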