何时使用内存 return 截止值进行 alpha-beta 搜索?
When does alpha-beta search with memory return cutoff values?
我已经实现了带换位的 alpha beta 搜索 table。
关于在 table 中存储截止值,我的想法是否正确?
具体来说,我在 table 命中时返回截止值的方案是否正确?(同样,存储它们。)我的实现似乎与 this one,但直觉上对我来说似乎是正确的。
此外,我的算法从不存储带有 at_most 标志的条目。我应该什么时候存储这些条目?
这是我的(简化的)代码,展示了主要思想:
int ab(board *b, int alpha, int beta, int ply) {
evaluation *stored = tt_get(b);
if (entryExists(stored) && stored->depth >= ply) {
if (stored->type == at_least) { // lower-bound
if (stored->score >= beta) return beta;
} else if (stored->type == at_most) { // upper bound
if (stored->score <= alpha) return alpha;
} else { // exact
if (stored->score >= beta) return beta; // respect fail-hard cutoff
if (stored->score < alpha) return alpha; // alpha cutoff
return stored->score;
}
}
if (ply == 0) return quiesce(b, alpha, beta, ply);
int num_children = 0;
move chosen_move = no_move;
move *moves = board_moves(b, &num_children);
int localbest = NEG_INFINITY;
for (int i = 0; i < num_children; i++) {
apply(b, moves[i]);
int score = -ab(b, -beta, -alpha, ply - 1);
unapply(b, moves[i]);
if (score >= beta) {
tt_put(b, (evaluation){moves[i], score, at_least, ply});
return beta; // fail-hard
}
if (score >= localbest) {
localbest = score;
chosen_move = moves[i];
if (score > alpha) alpha = score;
}
}
tt_put(b, (evaluation){chosen_move, alpha, exact, ply});
return alpha;
}
My implementation seems to conflict with this one
转置代码 table 查找对我来说似乎是正确的。它大致相当于 wikipedia.
// Code on Wikipedia rewritten using your notation / variable names
if (entryExists(stored) && stored->depth >= ply)
{
if (stored->type == at_least)
alpha = max(alpha, stored->score);
else if (stored->type == at_most)
beta = min(beta, stored->score);
else if (stored->type == exact)
return stored->score;
if (alpha >= beta)
return stored->score;
}
这相当于(检查if (alpha >= beta)
已移入每个节点类型):
if (entryExists(stored) && stored->depth >= ply)
{
if (stored->type == at_least)
{
alpha = max(alpha, stored->score);
if (alpha >= beta) return stored->score;
}
else if (stored->type == at_most)
{
beta = min(beta, stored->score);
if (alpha >= beta) return stored->score;
}
else if (stored->type == exact)
return stored->score;
}
这可以在以下时间更改:
if (entryExists(stored) && stored->depth >= ply)
{
if (stored->type == at_least)
{
// if (max(alpha, stored->score) >= beta) ...
if (stored->score >= beta) return stored->score;
}
else if (stored->type == at_most)
{
// if (min(beta, stored->score) <= alpha) ...
if (stored->score <= alpha) return stored->score;
}
else if (stored->type == exact)
return stored->score;
}
剩下的区别是维基百科使用 fail-soft optimization, while your code is the classic alpha-beta pruning (fail-hard)。 Fail-soft 是一个小的改进,但不会改变算法的关键点。
my algorithm never stores an entry with the at_most flag. When should I be storing these entries?
存储 exact
/ at_most
节点类型的方式存在错误。在这里你假设节点总是类型 exact
:
tt_put(b, (evaluation){chosen_move, alpha, exact, ply});
实际上它可以是一个at_most
节点:
if (alpha <= initial_alpha)
{
// Here we haven't a best move.
tt_put(b, (evaluation){no_move, initial_alpha, at_most, ply});
}
else
tt_put(b, (evaluation){chosen_move, alpha, exact, ply});
我已经实现了带换位的 alpha beta 搜索 table。
关于在 table 中存储截止值,我的想法是否正确?
具体来说,我在 table 命中时返回截止值的方案是否正确?(同样,存储它们。)我的实现似乎与 this one,但直觉上对我来说似乎是正确的。
此外,我的算法从不存储带有 at_most 标志的条目。我应该什么时候存储这些条目?
这是我的(简化的)代码,展示了主要思想:
int ab(board *b, int alpha, int beta, int ply) {
evaluation *stored = tt_get(b);
if (entryExists(stored) && stored->depth >= ply) {
if (stored->type == at_least) { // lower-bound
if (stored->score >= beta) return beta;
} else if (stored->type == at_most) { // upper bound
if (stored->score <= alpha) return alpha;
} else { // exact
if (stored->score >= beta) return beta; // respect fail-hard cutoff
if (stored->score < alpha) return alpha; // alpha cutoff
return stored->score;
}
}
if (ply == 0) return quiesce(b, alpha, beta, ply);
int num_children = 0;
move chosen_move = no_move;
move *moves = board_moves(b, &num_children);
int localbest = NEG_INFINITY;
for (int i = 0; i < num_children; i++) {
apply(b, moves[i]);
int score = -ab(b, -beta, -alpha, ply - 1);
unapply(b, moves[i]);
if (score >= beta) {
tt_put(b, (evaluation){moves[i], score, at_least, ply});
return beta; // fail-hard
}
if (score >= localbest) {
localbest = score;
chosen_move = moves[i];
if (score > alpha) alpha = score;
}
}
tt_put(b, (evaluation){chosen_move, alpha, exact, ply});
return alpha;
}
My implementation seems to conflict with this one
转置代码 table 查找对我来说似乎是正确的。它大致相当于 wikipedia.
// Code on Wikipedia rewritten using your notation / variable names
if (entryExists(stored) && stored->depth >= ply)
{
if (stored->type == at_least)
alpha = max(alpha, stored->score);
else if (stored->type == at_most)
beta = min(beta, stored->score);
else if (stored->type == exact)
return stored->score;
if (alpha >= beta)
return stored->score;
}
这相当于(检查if (alpha >= beta)
已移入每个节点类型):
if (entryExists(stored) && stored->depth >= ply)
{
if (stored->type == at_least)
{
alpha = max(alpha, stored->score);
if (alpha >= beta) return stored->score;
}
else if (stored->type == at_most)
{
beta = min(beta, stored->score);
if (alpha >= beta) return stored->score;
}
else if (stored->type == exact)
return stored->score;
}
这可以在以下时间更改:
if (entryExists(stored) && stored->depth >= ply)
{
if (stored->type == at_least)
{
// if (max(alpha, stored->score) >= beta) ...
if (stored->score >= beta) return stored->score;
}
else if (stored->type == at_most)
{
// if (min(beta, stored->score) <= alpha) ...
if (stored->score <= alpha) return stored->score;
}
else if (stored->type == exact)
return stored->score;
}
剩下的区别是维基百科使用 fail-soft optimization, while your code is the classic alpha-beta pruning (fail-hard)。 Fail-soft 是一个小的改进,但不会改变算法的关键点。
my algorithm never stores an entry with the at_most flag. When should I be storing these entries?
存储 exact
/ at_most
节点类型的方式存在错误。在这里你假设节点总是类型 exact
:
tt_put(b, (evaluation){chosen_move, alpha, exact, ply});
实际上它可以是一个at_most
节点:
if (alpha <= initial_alpha)
{
// Here we haven't a best move.
tt_put(b, (evaluation){no_move, initial_alpha, at_most, ply});
}
else
tt_put(b, (evaluation){chosen_move, alpha, exact, ply});