应用记忆会使 golom 序列变慢
Applying memoization makes golom sequence slower
我正在尝试使用 C++ 进行记忆化,我正在尝试使用“golom 序列”做一个示例
int main(int argc, char* argv[])
{
std::unordered_map<int, int> hashTable;
int value = 7;
auto start = std::chrono::high_resolution_clock::now();
std::cout << golomS(4, hashTable) << std::endl;
auto stop = std::chrono::high_resolution_clock::now();
auto start1 = std::chrono::high_resolution_clock::now();
std::cout << golom(4) << std::endl;;
auto stop1 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
auto duration1 = std::chrono::duration_cast<std::chrono::microseconds>(stop1 - start1);
std::cout << "Time taken by function 1: "
<< duration.count() << " microseconds" << std::endl;
std::cout << "Time taken by function 2: "
<< duration1.count() << " microseconds" << std::endl;
return 0;
}
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int> golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
作为输出我得到这个:
4
4
函数 1 所用时间:687 微秒 //This is Golom S
函数 2 耗时:242 微秒 //This is Golom
GolomS 不应该通过记忆更快吗?我试着绕过调试器,但我不确定有效的“慢”是从哪里来的。
我的问题是:如何改变我制作 golomS 的方法,使其比 golom 更快。
谢谢你。 -亚当
int golomS(int n, std::unordered_map<int, int> golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
在您的 goloms
函数中,您将 golomb
作为值而不是引用传递。这意味着每次调用 goloms
时,编译器都会制作 golomb
的副本并在超出范围时销毁该副本,而不更改实际的 golomb
值。
您应该改为通过引用传递。
int golomS(int n, std::unordered_map<int, int>& golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
您的代码中有两个问题。第一个主要问题是您按值传递无序映射。您必须通过引用传递它以使其更快,以避免在每次函数调用时复制整个地图。此外,如果您按值传递映射,则 return 不会保留新计算的值,但如果您按引用传递它,则新值将为调用函数带来好处。
第二个重要问题是您使用非常小的值 4
作为函数的输入来测量时间,4
需要很少的时间来计算,您必须使用更大的值,例如50
,因此需要更多的时间来计算,从而给出更准确的时间测量。
与第二个函数相比,您的代码的固定版本已经大大改进了第一个函数的计时(对于输入 50
):
13
13
Time taken by function 1: 110 microseconds
Time taken by function 2: 118345 microseconds
固定码:
#include <unordered_map>
#include <iostream>
#include <chrono>
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int> & golomb)
{
if(n == 1)
return 1;
if(golomb.find(n) == golomb.end())
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
return golomb[n];
}
int main(int argc, char* argv[])
{
std::unordered_map<int, int> hashTable;
auto start = std::chrono::high_resolution_clock::now();
std::cout << golomS(50, hashTable) << std::endl;
auto stop = std::chrono::high_resolution_clock::now();
auto start1 = std::chrono::high_resolution_clock::now();
std::cout << golom(50) << std::endl;;
auto stop1 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
auto duration1 = std::chrono::duration_cast<std::chrono::microseconds>(stop1 - start1);
std::cout << "Time taken by function 1: "
<< duration.count() << " microseconds" << std::endl;
std::cout << "Time taken by function 2: "
<< duration1.count() << " microseconds" << std::endl;
return 0;
}
除了其他答案之外,我想补充一点,这真的可以从适当的基准测试中受益。
为了获得可靠的结果,您希望 运行 多次测试,并采取措施确保内存缓存和其他系统级恶作剧不会干扰结果。
值得庆幸的是,有一些库可以为我们处理大部分这些复杂问题。例如,下面是使用 google 的 Benchmark 库的更好的代码基准测试:
N.B。已整合其他答案提出的修复。
#include <chrono>
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int>& golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
static void TestGolomb(benchmark::State& state) {
for (auto _ : state) {
benchmark::DoNotOptimize(golom(state.range(0)));
}
}
BENCHMARK(TestGolomb)->DenseRange(1, 17, 2);
static void TestGolombS(benchmark::State& state) {
for (auto _ : state) {
// Make sure we always start from a fresh map
std::unordered_map<int, int> golomb;
// Ignore construction and destruction of the map
auto start = std::chrono::high_resolution_clock::now();
benchmark::DoNotOptimize(golomS(state.range(0), golomb));
auto end = std::chrono::high_resolution_clock::now();
auto elapsed_seconds =
std::chrono::duration_cast<std::chrono::duration<double>>(
end - start);
state.SetIterationTime(elapsed_seconds.count());
}
}
BENCHMARK(TestGolombS)->DenseRange(1, 17, 2);
这为我们提供了以下配置文件,表明盈亏平衡点在 14-15 左右:
很好。请尝试并继续了解记忆。
我再给一个方法。您还可以在 DP table 中“记忆”值。 DP喜欢动态规划。
如果您追求速度并且能够承受大内存消耗,那么这是最快和最有效的解决方案。对该序列使用递归基本上是一个坏主意,会导致代码非常缓慢和潜在的堆栈溢出。
在 DP table 中将递归转换为带有记忆的迭代解决方案,会给你一个非常快的结果。即使是大数字。启用编译器优化后,您可以计算完整的序列,例如高达 100000 ultrafast。
见下文:
#include <iostream>
#include <array>
#include <cstdint>
// We will calculate a Golomb sequence up to the below specified element
constexpr size_t MaxValues = 100000;
using MyDataType = uint_fast16_t;
using GolombSequence = std::array<MyDataType, MaxValues>;
// This is the Golomb Sequence
GolombSequence golombSequence{};
// Iterative Function to calculate a Golomb sequence using a DP table
void createGolombSequence() {
golombSequence[1] = 1;
for (int i = 2; i < MaxValues; ++i) {
golombSequence[i] = 1 + golombSequence[i - golombSequence[golombSequence[i - 1]]];
}
}
// Test
int main() {
createGolombSequence();
std::cout << golombSequence[777] << '\n';
std::cout << golombSequence[7777] << '\n';
std::cout << golombSequence[77777] << '\n';
}
我们甚至可以更进一步,使用“编译时记忆”,也就是说,我们会在编译时计算出完整的序列。所以,让编译器来做吧。
这当然会导致最快的解决方案,但缺点是内存消耗高,并且受编译器能力的限制。
请看下面。所有 Golomb 值计算都在编译时完成。在运行时,我们只使用超快索引操作:
#include <iostream>
#include <array>
#include <cstdint>
// We will calculate a Golomb sequence up to the below specified element
constexpr size_t MaxValues = 100000u;
using MyDataType = uint_fast16_t;
using GolombSequence = std::array<MyDataType, MaxValues>;
// Iterative Function to calculate a Golomb sequence using a DP table
constexpr GolombSequence createGolombSequence() {
GolombSequence golombSequence{};
golombSequence[1] = 1;
for (int i = 2; i < MaxValues; ++i) {
golombSequence[i] = 1 + golombSequence[i - golombSequence[golombSequence[i - 1]]];
}
return golombSequence;
}
// This is the Golomb Sequence
constexpr GolombSequence golombSequence = createGolombSequence();
// Test
int main() {
std::cout << golombSequence[777] << '\n';
std::cout << golombSequence[7777] << '\n';
std::cout << golombSequence[77777] << '\n';
}
我将我的代码添加到 Franks Benchmark。结果大大优于其他一切。 0ms处的绿线用于上述算法。
使用基准代码
#include <chrono>
#include <iostream>
#include <array>
#include <cstdint>
// We will calculate a Golomb sequence up to the below specified element
constexpr size_t MaxValues = 10000u;
using MyDataType = uint_fast16_t;
using GolombSequence = std::array<MyDataType, MaxValues>;
// Iterative Function to calculate a Golomb sequence using a DP table
constexpr GolombSequence createGolombSequence() {
GolombSequence golombSequence{};
golombSequence[1] = 1;
for (int i = 2; i < MaxValues; ++i) {
golombSequence[i] = 1 + golombSequence[i - golombSequence[golombSequence[i - 1]]];
}
return golombSequence;
}
// This is the Golomb Sequence
constexpr GolombSequence golombSequence = createGolombSequence();
inline int golombFast(size_t i) { return golombSequence[i]; };
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int>& golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
static void TestGolombFast(benchmark::State& state) {
for (auto _ : state) {
benchmark::DoNotOptimize(golombFast(state.range(0)));
}
}
BENCHMARK(TestGolombFast)->DenseRange(1, 17, 2);
static void TestGolomb(benchmark::State& state) {
for (auto _ : state) {
benchmark::DoNotOptimize(golom(state.range(0)));
}
}
BENCHMARK(TestGolomb)->DenseRange(1, 17, 2);
static void TestGolombS(benchmark::State& state) {
for (auto _ : state) {
std::unordered_map<int, int> golomb;
// Be extra-generous, ignore construction and destruction of the set
auto start = std::chrono::high_resolution_clock::now();
benchmark::DoNotOptimize(golomS(state.range(0), golomb));
auto end = std::chrono::high_resolution_clock::now();
auto elapsed_seconds =
std::chrono::duration_cast<std::chrono::duration<double>>(
end - start);
state.SetIterationTime(elapsed_seconds.count());
}
}
BENCHMARK(TestGolombS)->DenseRange(1, 17, 2);
我正在尝试使用 C++ 进行记忆化,我正在尝试使用“golom 序列”做一个示例
int main(int argc, char* argv[])
{
std::unordered_map<int, int> hashTable;
int value = 7;
auto start = std::chrono::high_resolution_clock::now();
std::cout << golomS(4, hashTable) << std::endl;
auto stop = std::chrono::high_resolution_clock::now();
auto start1 = std::chrono::high_resolution_clock::now();
std::cout << golom(4) << std::endl;;
auto stop1 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
auto duration1 = std::chrono::duration_cast<std::chrono::microseconds>(stop1 - start1);
std::cout << "Time taken by function 1: "
<< duration.count() << " microseconds" << std::endl;
std::cout << "Time taken by function 2: "
<< duration1.count() << " microseconds" << std::endl;
return 0;
}
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int> golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
作为输出我得到这个:
4
4
函数 1 所用时间:687 微秒 //This is Golom S
函数 2 耗时:242 微秒 //This is Golom
GolomS 不应该通过记忆更快吗?我试着绕过调试器,但我不确定有效的“慢”是从哪里来的。
我的问题是:如何改变我制作 golomS 的方法,使其比 golom 更快。 谢谢你。 -亚当
int golomS(int n, std::unordered_map<int, int> golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
在您的 goloms
函数中,您将 golomb
作为值而不是引用传递。这意味着每次调用 goloms
时,编译器都会制作 golomb
的副本并在超出范围时销毁该副本,而不更改实际的 golomb
值。
您应该改为通过引用传递。
int golomS(int n, std::unordered_map<int, int>& golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
您的代码中有两个问题。第一个主要问题是您按值传递无序映射。您必须通过引用传递它以使其更快,以避免在每次函数调用时复制整个地图。此外,如果您按值传递映射,则 return 不会保留新计算的值,但如果您按引用传递它,则新值将为调用函数带来好处。
第二个重要问题是您使用非常小的值 4
作为函数的输入来测量时间,4
需要很少的时间来计算,您必须使用更大的值,例如50
,因此需要更多的时间来计算,从而给出更准确的时间测量。
与第二个函数相比,您的代码的固定版本已经大大改进了第一个函数的计时(对于输入 50
):
13
13
Time taken by function 1: 110 microseconds
Time taken by function 2: 118345 microseconds
固定码:
#include <unordered_map>
#include <iostream>
#include <chrono>
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int> & golomb)
{
if(n == 1)
return 1;
if(golomb.find(n) == golomb.end())
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
return golomb[n];
}
int main(int argc, char* argv[])
{
std::unordered_map<int, int> hashTable;
auto start = std::chrono::high_resolution_clock::now();
std::cout << golomS(50, hashTable) << std::endl;
auto stop = std::chrono::high_resolution_clock::now();
auto start1 = std::chrono::high_resolution_clock::now();
std::cout << golom(50) << std::endl;;
auto stop1 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
auto duration1 = std::chrono::duration_cast<std::chrono::microseconds>(stop1 - start1);
std::cout << "Time taken by function 1: "
<< duration.count() << " microseconds" << std::endl;
std::cout << "Time taken by function 2: "
<< duration1.count() << " microseconds" << std::endl;
return 0;
}
除了其他答案之外,我想补充一点,这真的可以从适当的基准测试中受益。
为了获得可靠的结果,您希望 运行 多次测试,并采取措施确保内存缓存和其他系统级恶作剧不会干扰结果。
值得庆幸的是,有一些库可以为我们处理大部分这些复杂问题。例如,下面是使用 google 的 Benchmark 库的更好的代码基准测试:
N.B。已整合其他答案提出的修复。
#include <chrono>
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int>& golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
static void TestGolomb(benchmark::State& state) {
for (auto _ : state) {
benchmark::DoNotOptimize(golom(state.range(0)));
}
}
BENCHMARK(TestGolomb)->DenseRange(1, 17, 2);
static void TestGolombS(benchmark::State& state) {
for (auto _ : state) {
// Make sure we always start from a fresh map
std::unordered_map<int, int> golomb;
// Ignore construction and destruction of the map
auto start = std::chrono::high_resolution_clock::now();
benchmark::DoNotOptimize(golomS(state.range(0), golomb));
auto end = std::chrono::high_resolution_clock::now();
auto elapsed_seconds =
std::chrono::duration_cast<std::chrono::duration<double>>(
end - start);
state.SetIterationTime(elapsed_seconds.count());
}
}
BENCHMARK(TestGolombS)->DenseRange(1, 17, 2);
这为我们提供了以下配置文件,表明盈亏平衡点在 14-15 左右:
很好。请尝试并继续了解记忆。
我再给一个方法。您还可以在 DP table 中“记忆”值。 DP喜欢动态规划。
如果您追求速度并且能够承受大内存消耗,那么这是最快和最有效的解决方案。对该序列使用递归基本上是一个坏主意,会导致代码非常缓慢和潜在的堆栈溢出。
在 DP table 中将递归转换为带有记忆的迭代解决方案,会给你一个非常快的结果。即使是大数字。启用编译器优化后,您可以计算完整的序列,例如高达 100000 ultrafast。
见下文:
#include <iostream>
#include <array>
#include <cstdint>
// We will calculate a Golomb sequence up to the below specified element
constexpr size_t MaxValues = 100000;
using MyDataType = uint_fast16_t;
using GolombSequence = std::array<MyDataType, MaxValues>;
// This is the Golomb Sequence
GolombSequence golombSequence{};
// Iterative Function to calculate a Golomb sequence using a DP table
void createGolombSequence() {
golombSequence[1] = 1;
for (int i = 2; i < MaxValues; ++i) {
golombSequence[i] = 1 + golombSequence[i - golombSequence[golombSequence[i - 1]]];
}
}
// Test
int main() {
createGolombSequence();
std::cout << golombSequence[777] << '\n';
std::cout << golombSequence[7777] << '\n';
std::cout << golombSequence[77777] << '\n';
}
我们甚至可以更进一步,使用“编译时记忆”,也就是说,我们会在编译时计算出完整的序列。所以,让编译器来做吧。
这当然会导致最快的解决方案,但缺点是内存消耗高,并且受编译器能力的限制。
请看下面。所有 Golomb 值计算都在编译时完成。在运行时,我们只使用超快索引操作:
#include <iostream>
#include <array>
#include <cstdint>
// We will calculate a Golomb sequence up to the below specified element
constexpr size_t MaxValues = 100000u;
using MyDataType = uint_fast16_t;
using GolombSequence = std::array<MyDataType, MaxValues>;
// Iterative Function to calculate a Golomb sequence using a DP table
constexpr GolombSequence createGolombSequence() {
GolombSequence golombSequence{};
golombSequence[1] = 1;
for (int i = 2; i < MaxValues; ++i) {
golombSequence[i] = 1 + golombSequence[i - golombSequence[golombSequence[i - 1]]];
}
return golombSequence;
}
// This is the Golomb Sequence
constexpr GolombSequence golombSequence = createGolombSequence();
// Test
int main() {
std::cout << golombSequence[777] << '\n';
std::cout << golombSequence[7777] << '\n';
std::cout << golombSequence[77777] << '\n';
}
我将我的代码添加到 Franks Benchmark。结果大大优于其他一切。 0ms处的绿线用于上述算法。
使用基准代码
#include <chrono>
#include <iostream>
#include <array>
#include <cstdint>
// We will calculate a Golomb sequence up to the below specified element
constexpr size_t MaxValues = 10000u;
using MyDataType = uint_fast16_t;
using GolombSequence = std::array<MyDataType, MaxValues>;
// Iterative Function to calculate a Golomb sequence using a DP table
constexpr GolombSequence createGolombSequence() {
GolombSequence golombSequence{};
golombSequence[1] = 1;
for (int i = 2; i < MaxValues; ++i) {
golombSequence[i] = 1 + golombSequence[i - golombSequence[golombSequence[i - 1]]];
}
return golombSequence;
}
// This is the Golomb Sequence
constexpr GolombSequence golombSequence = createGolombSequence();
inline int golombFast(size_t i) { return golombSequence[i]; };
int golom(int n)
{
if (n == 1) {return 1;}
return 1 + golom(n - golom(golom(n - 1)));
}
int golomS(int n, std::unordered_map<int, int>& golomb)
{
if(n == 1)
{
return 1;
}
if(!golomb[n])
{
golomb[n] = 1 + golomS(n - golomS(golomS(n - 1, golomb), golomb), golomb);
}
return golomb[n];
}
static void TestGolombFast(benchmark::State& state) {
for (auto _ : state) {
benchmark::DoNotOptimize(golombFast(state.range(0)));
}
}
BENCHMARK(TestGolombFast)->DenseRange(1, 17, 2);
static void TestGolomb(benchmark::State& state) {
for (auto _ : state) {
benchmark::DoNotOptimize(golom(state.range(0)));
}
}
BENCHMARK(TestGolomb)->DenseRange(1, 17, 2);
static void TestGolombS(benchmark::State& state) {
for (auto _ : state) {
std::unordered_map<int, int> golomb;
// Be extra-generous, ignore construction and destruction of the set
auto start = std::chrono::high_resolution_clock::now();
benchmark::DoNotOptimize(golomS(state.range(0), golomb));
auto end = std::chrono::high_resolution_clock::now();
auto elapsed_seconds =
std::chrono::duration_cast<std::chrono::duration<double>>(
end - start);
state.SetIterationTime(elapsed_seconds.count());
}
}
BENCHMARK(TestGolombS)->DenseRange(1, 17, 2);