Why do independent Tcl interpreters executing Tcl_ExprDouble in parallel require a mutex?
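I wrote a simple class that wraps a callback in Tcl. It manages its own Tcl interpreter and stores the Tcl command as a string. The go method feeds the string to the interpreter and returns the result.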
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>
#include <future>
#include <tcl.h>

class Tcl_callback {
    std::string callback;
    Tcl_Interp *local_interp;
public:
    Tcl_callback(std::string c):
        callback(std::move(c)),
        local_interp(Tcl_CreateInterp())
    {}
    ~Tcl_callback() {Tcl_DeleteInterp(local_interp);}
    Tcl_callback(const Tcl_callback & c):
        callback(c.callback),
        local_interp(Tcl_CreateInterp()) {}
    double go() {
        std::cout << "going..." << std::endl;
        double resultValue;
        int resultCode;
        resultCode = Tcl_ExprDouble(local_interp, callback.c_str(), &resultValue);
        if (resultCode != TCL_OK) {
            throw std::runtime_error("ERROR: failed evaluation of the expression: \"" + callback + "\"\n " + Tcl_GetStringResult(local_interp));
        }
        return resultValue;
    }
};
I tested it with a simple main that lets me switch between parallel and serial execution:
#define PARALLEL
int main() {
    const int n_callbacks = 100;
    const int n_iter = 10;
    std::vector<Tcl_callback> cs(n_callbacks, Tcl_callback("sqrt(123)"));
    for (int i = 0; i < n_iter; i++) {
#ifdef PARALLEL
        std::vector<std::future<double>> fs;
        for (auto & c: cs) {
            fs.push_back( std::async(std::launch::async,[&](){return c.go();}) );
        }
        for (auto & f: fs) {
            std::cout << f.get() << std::endl;
        }
#else
        for (auto & c: cs) {
            std::cout << c.go() << std::endl;
        }
#endif
    }
    std::cout << "done" << std::endl;
    return 0;
}
Although all the Tcl_callback objects appear to be independent, I cannot get a stable parallel version unless I protect the go method with a global mutex:
std::mutex m; //at global scope
std::lock_guard<std::mutex> lk(m); //inside the go method
I would like to understand what causes this and how the code could be improved.
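The problem is that a Tcl_Interp may only be accessed (for example, via Tcl_ExprDouble or Tcl_DeleteInterp) from the thread that created it; the implementation of the Tcl interpreter makes heavy use of thread-local variables to avoid holding global locks. Unfortunately, you create the interpreters before launching all the threads, so the interpreters end up being used across threads, which does not work.
From the documentation: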
The token returned by Tcl_CreateInterp may only be passed to Tcl routines called from the same thread as the original Tcl_CreateInterp call. It is not safe for multiple threads to pass the same token to Tcl's routines.
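To make the thread-affinity point concrete, here is a toy model of a library that keeps its state in thread-local variables. It is not Tcl's actual implementation: Handle, create_handle, handle_usable_here and the other names are made up for illustration. It only shows why a handle created on one thread can find its supporting state missing on another.

```cpp
#include <cassert>
#include <future>

// Toy per-thread state, standing in for the thread-local data a library
// might keep instead of taking a global lock. Hypothetical, not Tcl's.
thread_local bool thread_initialized = false;

// Hypothetical opaque handle, playing the role of Tcl_Interp.
struct Handle {};

// Sets up state for the *calling* thread only, then hands out a handle.
Handle *create_handle() {
    thread_initialized = true;
    return new Handle;
}

void destroy_handle(Handle *h) { delete h; }

// True if the calling thread has the state this handle relies on.
bool handle_usable_here(const Handle *) { return thread_initialized; }

// Asks the same question from a freshly started thread.
bool usable_on_other_thread(const Handle *h) {
    return std::async(std::launch::async,
                      [h] { return handle_usable_here(h); }).get();
}
```

Calling handle_usable_here on the creating thread finds the state in place, while usable_on_other_thread does not, because the new thread gets its own fresh copy of every thread_local variable.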
Change the code to this (where the Tcl_Interp is scoped to the go method):
class Tcl_callback {
    std::string callback;
public:
    Tcl_callback(std::string c):
        callback(std::move(c))
    {}
    ~Tcl_callback() {}
    Tcl_callback(const Tcl_callback & c):
        callback(c.callback) {}
    double go() {
        std::cout << "going..." << std::endl;
        double resultValue;
        int resultCode;
        Tcl_Interp *interp = Tcl_CreateInterp();
        resultCode = Tcl_ExprDouble(interp, callback.c_str(), &resultValue);
        if (resultCode != TCL_OK) {
            throw std::runtime_error("ERROR: failed evaluation of the expression: \"" + callback + "\"\n " + Tcl_GetStringResult(interp));
        }
        Tcl_DeleteInterp(interp);
        return resultValue;
    }
};
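This is inefficient, but it works (or at least it did when I tried it). I'll leave it to you to figure out a clever way to avoid creating quite so many interpreters! (I'll also leave it to you to clean up the potential resource leak when the exception is thrown. This is just proof-of-concept code.)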
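One possible direction, as a sketch only: start a small, fixed number of worker threads, let each one create a single interpreter, and have it evaluate a slice of the expressions. Note that std::async with std::launch::async runs each task as if on a new thread, so caching an interpreter per task would save nothing; the saving comes from giving each thread many expressions. To keep the sketch buildable without Tcl, FakeInterp, create_interp, delete_interp and eval_double are stand-ins I made up for Tcl_Interp, Tcl_CreateInterp, Tcl_DeleteInterp and Tcl_ExprDouble, and eval_all is a hypothetical helper. The sketch assumes n_workers is at least 1.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <string>
#include <vector>

// Stand-in for Tcl_Interp so that the sketch builds without Tcl.
struct FakeInterp {};

// Counts interpreter creations, only to make the saving observable.
std::atomic<int> interps_created{0};

// Stand-ins for Tcl_CreateInterp, Tcl_DeleteInterp and Tcl_ExprDouble.
FakeInterp *create_interp() { ++interps_created; return new FakeInterp; }
void delete_interp(FakeInterp *p) { delete p; }
double eval_double(FakeInterp *, const std::string &expr) {
    return std::stod(expr);  // the real call would evaluate a Tcl expression
}

// RAII owner: the interpreter is released even if evaluation throws.
struct InterpDeleter {
    void operator()(FakeInterp *p) const { delete_interp(p); }
};
using InterpPtr = std::unique_ptr<FakeInterp, InterpDeleter>;

// Evaluates all expressions on n_workers threads. Each worker creates one
// interpreter, uses it only on its own thread, and deletes it there.
std::vector<double> eval_all(const std::vector<std::string> &exprs,
                             std::size_t n_workers) {
    std::vector<double> results(exprs.size());
    std::vector<std::future<void>> workers;
    for (std::size_t w = 0; w < n_workers; ++w) {
        workers.push_back(std::async(std::launch::async, [&, w] {
            InterpPtr interp(create_interp());
            // Worker w handles indices w, w + n_workers, w + 2 * n_workers, ...
            for (std::size_t i = w; i < exprs.size(); i += n_workers)
                results[i] = eval_double(interp.get(), exprs[i]);
        }));
    }
    for (auto &f : workers) f.get();  // rethrows an exception from a worker
    return results;
}
```

With 100 expressions and 4 workers this creates 4 interpreters instead of 100, and the unique_ptr with a custom deleter also takes care of the leak on the exception path, since the interpreter is destroyed on the worker thread when the lambda unwinds.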