Minmax algorithm doesn't return direct child of root (returns illegal move)
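I am trying to create an "AI" for Nine Men's Morris, but I am stuck on the minMax algorithm. To sum up, I have spent more than 10 hours looking for the problem without success. (Debugging this recursion is nasty, or I am doing it badly, or both.)
Since I have started doubting everything I wrote, I decided to post my problem so that someone else can spot what is wrong in my version of minMax. I realise this is a tough task without the whole application, so any suggestions about where I should triple-check my code are also very welcome.
Here is a link to the video explaining minMax on which I based my implementation: https://www.youtube.com/watch?v=l-hh51ncgDI (it is the first video that pops up on YouTube when searching for minmax, in case you want to watch it without clicking the link).
My minMax without alpha-beta pruning: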
//turn - tells which player is going to move
//gameStage - what action can be done in this move, where possible actions are: put pawn, move pawn, take opponent's pawn
//depth - tells how far down the game tree should minMax go
//spots - game board
private int minMax(int depth, Turn turn, GameStage gameStage, Spot[] spots){
    if(depth==0){
        return evaluateBoard(spots);
    }
    //in my scenario I am playing as WHITE and "AI" is playing as BLACK
    //since heuristic (evaluateBoard) returns number equal to black pawns - white pawns
    //I have decided that in my minMax algorithm every white turn will try to minimize and black turn will try to maximize
    //I don't know if this is the correct approach but it seems logical to me, so let me know if this is wrong
    boolean isMaximizing = turn.equals(Turn.BLACK);
    //get all possible (legal) actions based on circumstances
    ArrayList<Action> children = gameManager.getAllPossibleActions(spots,turn,gameStage);
    //this object will hold information about game circumstances after applying child move
    //and this information will be passed in recursive call
    ActionResult result;
    //placeholder for value returned by minMax()
    int eval;
    //scenario for maximizing player
    if(isMaximizing){
        int maxEval = NEGATIVE_INF;
        for (Action child : children){
            //applying possible action (child) and passing its result to recursive call
            result = gameManager.applyMove(child,turn,spots);
            //evaluate child move
            eval = minMax(depth-1,result.getTurn(),result.getGameStage(),result.getSpots());
            //resets board (which is array of Spots) so that board is not changed after minMax algorithm
            //because I am working on the original board to avoid time consuming copies
            gameManager.unapplyMove(child,turn,spots,result);
            if(maxEval<eval){
                maxEval = eval;
                //assign child with the biggest value to global static reference
                Instances.theBestAction = child;
            }
        }
        return maxEval;
    }
    //scenario for minimizing player - the same logic as for maximizing player but for minimizing
    else{
        int minEval = POSITIVE_INF;
        for (Action child : children){
            result = engine.getGameManager().applyMove(child,turn,spots);
            eval = minMax(depth-1,result.getTurn(),result.getGameStage(),result.getSpots());
            engine.getGameManager().unapplyMove(child,turn,spots,result);
            if(minEval>eval){
                minEval=eval;
                Instances.theBestAction = child;
            }
        }
        return minEval;
    }
}
Simple heuristic used for evaluation:
//calculates the difference between black pawns on board
//and white pawns on board
public int evaluateBoard(Spot[] spots) {
    int value = 0;
    for (Spot spot : spots) {
        if (spot.getTurn().equals(Turn.BLACK)) {
            value++;
        }else if(spot.getTurn().equals(Turn.WHITE)){
            value--;
        }
    }
    return value;
}
My problem:
//the same parameters as in minMax() function
public void checkMove(GameStage gameStage, Turn turn, Spot[] spots) {
    //one of these must be returned by minMax() function
    //because these are the only legal actions that can be done in this turn
    ArrayList<Action> possibleActions = gameManager.getAllPossibleActions(spots,turn,gameStage);
    //I ignore int returned by minMax() because,
    //after execution of this function, action chosen by minMax() is assigned
    //to global static reference
    minMax(1,turn,gameStage,spots);
    //getting action chosen by minMax() from global
    //static reference
    Action aiAction = Instances.theBestAction;
    //flag to check if aiAction is in possibleActions
    boolean wasFound = false;
    //find the same action returned by minMax() in possibleActions
    //change the flag upon finding one
    for(Action possibleAction : possibleActions){
        if(possibleAction.getStartSpotId() == aiAction.getStartSpotId() &&
           possibleAction.getEndSpotId() == aiAction.getEndSpotId() &&
           possibleAction.getActionType().equals(aiAction.getActionType())){
            wasFound = true;
            break;
        }
    }
    //when depth is equal to 1 it always is true
    //because there is no other choice, but
    //when depth>1 it really soon is false
    //so direct child of root is not chosen
    System.out.println("wasFound?: "+wasFound);
}
Is my idea of how to implement the minMax algorithm correct?
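I think the bug may be that you update Instances.theBestAction at every level of the recursion, including while child moves are being evaluated.
For example, suppose 'Move 4' is the true best move that should eventually be returned, but while evaluating 'Move 5', theBestAction gets set to the best child action of 'Move 5'. From that point on, nothing sets theBestAction back to 'Move 4'. This also fits your observation that depth 1 always works: at depth 1 the recursive calls return from the depth==0 check before they can touch theBestAction.
Perhaps a simple condition that only sets theBestAction when depth == originalDepth would be enough?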
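A minimal sketch of that depth check, under stated assumptions: the Node class, the sample tree, and the extra rootDepth parameter are hypothetical stand-ins, not the Nine Men's Morris classes from the question, and the toy game simply alternates players, whereas the question takes the next turn from result.getTurn().

```java
import java.util.List;

// Hypothetical toy game tree; not the Nine Men's Morris classes from the question.
class RootGuardSketch {
    static final class Node {
        final String name;
        final int value;            // heuristic value, only read at leaves
        final List<Node> children;

        Node(String name, int value, List<Node> children) {
            this.name = name;
            this.value = value;
            this.children = children;
        }
    }

    // Stands in for Instances.theBestAction from the question.
    static Node theBestAction;

    // rootDepth is an assumed extra parameter holding the depth of the first call.
    static int minMax(Node node, int depth, int rootDepth, boolean isMaximizing) {
        if (depth == 0 || node.children.isEmpty()) {
            return node.value;
        }
        int best = isMaximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Node child : node.children) {
            int eval = minMax(child, depth - 1, rootDepth, !isMaximizing);
            boolean better = isMaximizing ? eval > best : eval < best;
            if (better) {
                best = eval;
                // Only the root call records a move; deeper calls just report scores.
                if (depth == rootDepth) {
                    theBestAction = child;
                }
            }
        }
        return best;
    }

    static Node leaf(String name, int value) {
        return new Node(name, value, List.of());
    }

    static Node sampleTree() {
        Node a = new Node("A", 0, List.of(leaf("A1", 3), leaf("A2", 5)));
        Node b = new Node("B", 0, List.of(leaf("B1", 2), leaf("B2", 9)));
        return new Node("root", 0, List.of(a, b));
    }

    public static void main(String[] args) {
        int score = minMax(sampleTree(), 2, 2, true);
        System.out.println(score + " via " + theBestAction.name); // prints "3 via A"
    }
}
```

Without the depth == rootDepth check, the same tree would leave theBestAction pointing at leaf B1, a grandchild of the root, which is the kind of illegal move described in the question.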
Rather than using a global, you could also consider returning a struct/object that contains both the best score and the move that earned that score.
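One way the result-object idea could look is sketched below. ScoredMove, Node, and the sample tree are hypothetical names made up for illustration; a real version would wrap the question's Action and keep its applyMove/unapplyMove calls. The toy game also assumes strictly alternating turns, which the question does not.

```java
import java.util.List;

// Hypothetical toy game tree; not the Nine Men's Morris classes from the question.
class ScoredMoveSketch {
    static final class Node {
        final String name;
        final int value;            // heuristic value, only read at leaves
        final List<Node> children;

        Node(String name, int value, List<Node> children) {
            this.name = name;
            this.value = value;
            this.children = children;
        }
    }

    // Pairs a score with the move that earned it; move is null at a leaf.
    static final class ScoredMove {
        final int score;
        final Node move;

        ScoredMove(int score, Node move) {
            this.score = score;
            this.move = move;
        }
    }

    static ScoredMove minMax(Node node, int depth, boolean isMaximizing) {
        if (depth == 0 || node.children.isEmpty()) {
            return new ScoredMove(node.value, null);
        }
        Node bestMove = null;
        int bestScore = 0;
        for (Node child : node.children) {
            // Only the child's score is used; the move it picked stays local to that call.
            int score = minMax(child, depth - 1, !isMaximizing).score;
            boolean better = bestMove == null
                    || (isMaximizing ? score > bestScore : score < bestScore);
            if (better) {
                bestScore = score;
                bestMove = child;
            }
        }
        return new ScoredMove(bestScore, bestMove);
    }

    static Node leaf(String name, int value) {
        return new Node(name, value, List.of());
    }

    static Node sampleTree() {
        Node a = new Node("A", 0, List.of(leaf("A1", 3), leaf("A2", 5)));
        Node b = new Node("B", 0, List.of(leaf("B1", 2), leaf("B2", 9)));
        return new Node("root", 0, List.of(a, b));
    }

    public static void main(String[] args) {
        ScoredMove best = minMax(sampleTree(), 2, true);
        System.out.println(best.score + " via " + best.move.name); // prints "3 via A"
    }
}
```

Because each call returns its own move, the caller at the root always receives one of its direct children, and no shared state can be overwritten by deeper calls.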