Major Performance Issues with yaml-cpp
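I am using yaml-cpp so that I can use YAML for my game data files in C++, but I am running into some major performance issues.
I wanted to test a somewhat large file, so I created some dummy data to write out: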
Player newPlayer = Player();
newPlayer.name = "new player";
newPlayer.maximumHealth = 1000;
newPlayer.currentHealth = 1;

Inventory newInventory;
newInventory.maximumWeight = 10.9f;

for (int z = 0; z < 10000; z++) {
    InventoryItem* newItem = new InventoryItem();
    newItem->name = "Stone";
    newItem->baseValue = 1;
    newItem->weight = 0.1f;
    newInventory.items.push_back(newItem);
}

YAML::Node newSavedGame;
newSavedGame["player"] = newPlayer;
newSavedGame["inventory"] = newInventory;
Then I wrote this function to take the data and write it to a file:
void YamlUtility::saveAsFile(YAML::Node node, std::string filePath) {
    std::ofstream myfile;
    myfile.open(filePath);
    myfile << node << std::endl;
    myfile.close();
}
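Before I added this code, my game's memory usage was about 22MB. After I added newPlayer, newInventory, and the InventoryItems, it went up to about 23MB. Then, when I added the YAML::Node newSavedGame, memory went up to 108MB. The file that gets written out is only 570KB, so I can't think of why this would increase memory by 85MB.
The second issue is that this code takes about 8 seconds to write the file. That seemed a bit off to me.
I decided to rewrite the save function using YAML::Emitter. That code looks like this: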
static void buildYamlManually(std::ofstream& file, YAML::Node node) {
    YAML::Emitter out;
    out << YAML::BeginMap << YAML::Key << "player" << YAML::Value << YAML::BeginMap << YAML::Key << "name" << YAML::Value
        << node["player"]["name"].as<std::string>() << YAML::Key << "maximumHealth" << YAML::Value
        << node["player"]["maximumHealth"].as<int>() << YAML::Key << "currentHealth" << YAML::Value
        << node["player"]["currentHealth"].as<int>() << YAML::EndMap;

    out << YAML::BeginSeq;

    std::vector<InventoryItem*> items = node["inventory"]["items"].as<std::vector<InventoryItem*>>();

    for (InventoryItem* const value : items) {
        out << YAML::BeginMap << YAML::Key << "name" << YAML::Value << value->name << YAML::Key << "baseValue"
            << YAML::Value << value->baseValue << YAML::Key << "weight" << YAML::Value << value->weight << YAML::EndMap;
    }

    out << YAML::EndSeq;
    out << YAML::EndMap;

    file << out.c_str() << std::endl;
}
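That seemed to have a small effect on performance, but saving the file still took close to 7 seconds (instead of 8).
Then I decided to see what it would look like if I wrote the file by hand, without yaml-cpp at all. That code looks like this: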
static void buildYamlManually(std::ofstream& file, SavedGame savedGame) {
    // Spaces inside the string literals are the YAML indentation.
    file << "player: \n"
         << "  name: " << savedGame.player.name << "\n  maximumHealth: " << savedGame.player.maximumHealth
         << "\n  currentHealth: " << savedGame.player.currentHealth << "\ninventory:"
         << "\n  maximumWeight: " << savedGame.inventory.maximumWeight << "\n  items:";

    for (InventoryItem* const value : savedGame.inventory.items) {
        file << "\n    - name: " << value->name << "\n      baseValue: " << value->baseValue
             << "\n      weight: " << value->weight;
    }
}
With this code in place and all yaml-cpp code removed, memory went from 23MB to 24MB, and writing the file took about 0.15 seconds.
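While I understand that using yaml-cpp adds some overhead compared with handling the file manually as plain text, this kind of performance difference just seems wrong.
I would like to say I am doing something wrong, but based on the yaml-cpp documentation I can't see what that might be.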
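You need to provide a complete example that actually demonstrates the problem. I had been meaning to try yaml-cpp, so this morning I tried to reproduce your problem and was unable to. Using the code below, which is very similar to the snippets you provided, writing the file took about 0.06 seconds in my VM. It looks like the problem is not inherent to yaml-cpp but is somewhere in your code.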
#include <string>
#include <vector>
#include <iostream>
#include <yaml-cpp/yaml.h>
#include <fstream>
#include <chrono>

class Player
{
public:
    Player(const std::string& name, int maxHealth, int curHealth) :
        m_name(name),
        m_maxHealth(maxHealth),
        m_currentHealth(curHealth)
    {
    }

    const std::string& name() const { return m_name;}
    int maxHealth() const { return m_maxHealth; }
    int currentHealth() const { return m_currentHealth; }

private:
    const std::string m_name;
    int m_maxHealth;
    int m_currentHealth;
};

class Item
{
public:
    Item(const std::string& name, int value, double weight) :
        m_name(name),
        m_value(value),
        m_weight(weight)
    {
    }

    const std::string& name() const { return m_name; }
    int value() const { return m_value; }
    double maxWeight() const { return m_weight; }

private:
    const std::string m_name;
    int m_value;
    double m_weight;
};

class Inventory
{
public:
    Inventory(double maxWeight) :
        m_maxWeight(maxWeight)
    {
        m_items.reserve(10'000);
    }

    std::vector<Item>& items() { return m_items;}
    const std::vector<Item>& items() const { return m_items;}
    double maxWeight() const { return m_maxWeight; }

private:
    double m_maxWeight;
    std::vector<Item> m_items;
};
namespace YAML
{
    template<>
    struct convert<Inventory>
    {
        static Node encode(const Inventory& rhs)
        {
            Node node;
            node.push_back(rhs.maxWeight());
            for(const auto& item : rhs.items())
            {
                node.push_back(item.name());
                node.push_back(item.value());
                node.push_back(item.maxWeight());
            }
            return node;
        }
        // TODO decode Inventory
    };

    template<>
    struct convert<Player>
    {
        static Node encode(const Player& rhs)
        {
            Node node;
            node.push_back(rhs.name());
            node.push_back(rhs.maxHealth());
            node.push_back(rhs.currentHealth());
            return node;
        }
        //TODO Decode Player
    };
}

void saveAsFile(const YAML::Node& node, const std::string& filePath)
{
    std::ofstream myFile(filePath);
    myFile << node << std::endl;
}

int main(int arg, char **argv)
{
    Player newPlayer("new player", 1'000, 1);
    Inventory newInventory(10.9f);
    for(int z = 0; z < 10'000; z++)
    {
        newInventory.items().emplace_back("Stone", 1, 0.1f);
    }

    std::cout << "Inventory has " << newInventory.items().size() << " items\n";

    YAML::Node newSavedGame;
    newSavedGame["player"] = newPlayer;
    newSavedGame["inventory"] = newInventory;

    //Measure it
    auto start = std::chrono::high_resolution_clock::now();
    saveAsFile(newSavedGame, "/tmp/save.yaml");
    auto end = std::chrono::high_resolution_clock::now();

    std::cout << "Wrote to file in "
              << std::chrono::duration_cast<std::chrono::duration<double>>(end - start).count()
              << " seconds\n";
    return 0;
}
Output:
user@mintvm ~/Desktop/yaml $ g++ -std=c++14 -o test main.cpp -lyaml-cpp
user@mintvm ~/Desktop/yaml $ ./test
Inventory has 10000 items
Wrote to file in 0.0628495 seconds
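Update edit (by Michael Goldshteyn):
I wanted to run this natively rather than in a VM, to show that the code above actually runs even faster with appropriate optimizations, proper timing, and a native run (i.e., not in a VM):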
$ # yaml-cpp built from source commit: * c90c08c Thu Aug 9 10:05:07 2018 -0500
$ # (HEAD -> master, origin/master, origin/HEAD)
$ # Revert "Improvements to CMake buildsystem (#563)"
$ # - Lib was built Release with flags: -std=c++17 -O3 -march=native -mtune=native
$ # Benchmark hardware info
$ # -----------------------
$ # CPU: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz
$ # Kernel: 4.4.0-131-generic #157-Ubuntu SMP
$ # gcc: gcc (Debian 8.1.0-9) 8.1.0
$
$ # And away we go:
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
$ g++ -std=c++17 -O3 -march=native -mtune=native -o yamltest yamltest.cpp -lyaml-cpp
$ ./yamltest
Inventory has 10000 items
After 100 saveAsFile() iterations, the average execution time
per iteration was 0.0521697 seconds.
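The update reports an average over 100 saveAsFile() iterations but does not show the timing loop itself. A minimal sketch of how such an average could be taken is below. The helper name averageSeconds is hypothetical and not part of yaml-cpp, and the choice of std::chrono::steady_clock is my assumption, made because it is monotonic.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` `iterations` times and returns the
// mean wall-clock time per run, in seconds. Returns 0.0 if there is
// nothing to run.
double averageSeconds(const std::function<void()>& work, int iterations) {
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    if (iterations <= 0) {
        return 0.0;
    }
    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count() / iterations;
}
```

With the program above it could be called as averageSeconds([&] { saveAsFile(newSavedGame, "/tmp/save.yaml"); }, 100). Averaging many runs smooths out one-off costs such as a cold file cache on the first write.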