理解 Minimax 算法

Question

我正在尝试为两人 8x8 棋盘游戏创建 AI 对手。经过研究，我发现 Minimax 算法足以胜任这项工作。我创建的 AI 对手将与其他 AI 对手或人类对战。

对极小极大算法的理解有疑问

我试图只创建一个 AI 对手，但在网上找到的解释说我需要为两个玩家（最小玩家和最大玩家）编写代码，正如我从下面的伪代码中理解的那样。

MinMax (GamePosition game) {
  return MaxMove (game);
}

MaxMove (GamePosition game) {
  if (GameEnded(game)) {
    return EvalGameState(game);
  }
  else {
    best_move < - {};
    moves <- GenerateMoves(game);
    ForEach moves {
       move <- MinMove(ApplyMove(game));
       if (Value(move) > Value(best_move)) {
          best_move < - move;
       }
    }
    return best_move;
  }
}

MinMove (GamePosition game) {
  best_move <- {};
  moves <- GenerateMoves(game);
  ForEach moves {
     move <- MaxMove(ApplyMove(game));
     if (Value(move) > Value(best_move)) {
        best_move < - move;
     }
  }

  return best_move;
}

我可以进一步理解，最大玩家将是我要开发的AI，最小玩家是对手。

我的问题是为什么我必须为最小玩家和最大玩家编写代码以 return 最佳移动？

下面给出的伪代码是基于C#的。

提前致谢。

Answer 1

你只需要在最坏的情况下为两个玩家搜索最佳解决方案，这就是为什么它被称为 minmax，你不需要更多：

function minimax( node, depth )     
   if node is a terminal node or depth <= 0:        
       return the heuristic value of node  

   α = -∞    
   foreach child in node:          
      α = max( a, -minimax( child, depth-1 ) )  

   return α

节点是一个游戏位置，节点中的 child 是下一个动作（来自所有可用动作的列表），深度是两个玩家一起搜索的最大移动。

您可能无法运行 8x8 棋盘上所有可能的走法（取决于您下一步有多少选择）例如，如果每个人都有 8 种不同的可能动作，并且游戏在 40 次动作后结束（应该是最坏的情况），那么您将获得 8^40 个位置。计算机需要几十年甚至更长时间才能解决它，这就是为什么你需要 depth 参数和使用 heuristic 函数（例如 random forest tree) 不用检查所有选项就知道游戏位置有多好。

更好一点的 minmax 算法是 Alpha-Beta p运行ing，一旦找到目标（β 参数）就完成搜索：

function negamax( node, depth, α, β, color )  
   if node is a terminal node or depth = 0     
       return color * the heuristic value of node  

   foreach child of node          
       value = -negamax( child, depth-1, -β, -α, -color )  

   if value ≥ β                
      return value /** Alpha-Beta cut-off */  

  if value ≥ α       
     α = value              

  return α

最好先使用没有很多位置的游戏（例如井字游戏）。

理解 Minimax 算法

Understanding the Minimax Algorithm

c#

algorithm

artificial-intelligence

minimax

alpha-beta-pruning