TicTacToe 的 Minimax 算法（但每个玩家只能拥有 3 个 tacs）

Question

我目前正在制作一款 TicTacToe 游戏，该游戏使用 Minimax 算法进行玩家对战。截至目前，代码在 PvE 上只有困难模式。这是一个使用 Minimax 算法无法战胜的工作机器人。但是，我需要找到一种方法来合并特殊规则...

每位玩家一次只能在棋盘上放置 3 个 'tacs'。一旦达到 3，他们就必须将 tac 从一个位置移动到另一个位置。我不太确定如何实现这个，有人可以给我一些想法吗？

"""
TicTacToe - 3 symbols max
By Gage Gunn (???), Trevor Purta (Player vs. Computer), Toby Howie (Player vs. Player), Rachel Powers (Computer vs. Computer)
"""

# -------- Modules --------

import os
import time
import random

# -------- Global Variables --------

human = 'O'
bot = 'X'
amount_of_x = 0
amount_of_o = 0
ID = '' #Detecting which game mode (PvP, PvE, EvE)
ID2 = '' #Detecting which difficulty in PvE
ID3 = '' #Detecting which difficulty in EvE

# -------- Functions --------

def printBoard(board):
  print(board[1] + ' | ' + board[2] + ' | ' + board[3] + '        1 | 2 | 3')
  print(board[4] + ' | ' + board[5] + ' | ' + board[6] + '        4 | 5 | 6')
  print(board[7] + ' | ' + board[8] + ' | ' + board[9] + '        7 | 8 | 9')
  print('\n')


def spaceIsFree(position):
  if board[position] == ' ':
    return True
  else:
    return False


def insertLetter(letter, position):
  if spaceIsFree(position):
    board[position] = letter
    printBoard(board)
    if (checkDraw()):
      print("Draw!")
      exit()
    if checkForWin():
        if letter == 'X':
          print("Bot wins!")
          exit()
        else:
          print("Player wins!")
          exit()

    return


  else:
    print("Can't insert there!")
    position = int(input("Please enter new position:  "))
    insertLetter(letter, position)
    return


def checkForWin():
  if (board[1] == board[2] and board[1] == board[3] and board[1] != ' '):
    return True
  elif (board[4] == board[5] and board[4] == board[6] and board[4] != ' '):
    return True
  elif (board[7] == board[8] and board[7] == board[9] and board[7] != ' '):
    return True
  elif (board[1] == board[4] and board[1] == board[7] and board[1] != ' '):
    return True
  elif (board[2] == board[5] and board[2] == board[8] and board[2] != ' '):
    return True
  elif (board[3] == board[6] and board[3] == board[9] and board[3] != ' '):
    return True
  elif (board[1] == board[5] and board[1] == board[9] and board[1] != ' '):
    return True
  elif (board[7] == board[5] and board[7] == board[3] and board[7] != ' '):
    return True
  else:
    return False


def checkWhichMarkWon(mark):
  if board[1] == board[2] and board[1] == board[3] and board[1] == mark:
    return True
  elif (board[4] == board[5] and board[4] == board[6] and board[4] == mark):
    return True 
  elif (board[7] == board[8] and board[7] == board[9] and board[7] == mark):
    return True
  elif (board[1] == board[4] and board[1] == board[7] and board[1] == mark):
    return True
  elif (board[2] == board[5] and board[2] == board[8] and board[2] == mark):
    return True
  elif (board[3] == board[6] and board[3] == board[9] and board[3] == mark):
    return True
  elif (board[1] == board[5] and board[1] == board[9] and board[1] == mark):
    return True
  elif (board[7] == board[5] and board[7] == board[3] and board[7] == mark):
    return True
  else:
    return False


def checkDraw():
  for key in board.keys():
    if (board[key] == ' '):
      return False
  return True


def playerMove():
  position = int(input("Enter the position for 'O':  "))
  insertLetter(human, position)
  return


def compMove():
  bestScore = -800
  bestMove = 0
  for key in board.keys():
    if (board[key] == ' '):
      board[key] = bot
      score = minimax(board, 0, False)
      board[key] = ' '
      if (score > bestScore):
        bestScore = score
        bestMove = key

  insertLetter(bot, bestMove)
  return


def minimax(board, depth, isMaximizing):
  if (checkWhichMarkWon(bot)):
    return 1
  elif (checkWhichMarkWon(human)):
    return -1
  elif (checkDraw()):
    return 0

  if (isMaximizing):
    bestScore = -800
    for key in board.keys():
      if (board[key] == ' '):
        board[key] = bot
        score = minimax(board, depth + 1, False)
        board[key] = ' '
        if (score > bestScore):
          bestScore = score
    return bestScore

  else:
    bestScore = 800
    for key in board.keys():
      if (board[key] == ' '):
        board[key] = human
        score = minimax(board, depth + 1, True)
        board[key] = ' '
        if (score < bestScore):
          bestScore = score
    return bestScore


board = {1: ' ', 2: ' ', 3: ' ',
        4: ' ', 5: ' ', 6: ' ',
        7: ' ', 8: ' ', 9: ' '}

def pick_game_type(): #Pick PvP, PvE, EvE
  global ID
  print("""
First, Pick which game mode you would like to play!
(Player vs. Player [1], Player vs. Computer [2], Computer vs. Computer [3])
  """)
  type_choice = input('> ')
  if type_choice == "1": #Set global ID for [1]
    ID = "1"
  elif type_choice == "2": #Set global ID for [2]
    ID = "2"
  elif type_choice == "3": #Set global ID for [3]
    ID = "3"

# -------- Script --------

os.system('cls') #Clear console
print("""
Welcome to Wacky Tacky Toe, Tac, Tic!
By Gage Gunn, Trevor Purta, Toby Howie, and Rachel Power with an s
""")

pick_game_type() #Choose gamemode

if ID == '1': #PvP Chosen
  pass

elif ID == '2': #PvE Chosen
  os.system('cls')

  print('''
Welcome to Player vs Computer! In this version of Tic, Tac, Toe, each player can have a maximum of 3 tacs on the board at a time!
Decide which difficuly you would like to play? (Easy [1] or Hard [2])''')
  ID2 = input('> ')

  valid_input = False
  while not valid_input: #Checks validity of ID2
    if ID2 in ['1', '2']:
      valid_input = True
    else:
      ID2 = input('> ')
  
  if ID2 == '1': #Easy was chosen
    pass

  elif ID2 == '2':
    print('You have chosen hard! You are X, the computer is O. The computer will go first')
    while not checkForWin():
      compMove()
      playerMove()

elif ID == '3': #PvE Chosen
  pass

Answer 1

您需要将游戏玩法的组件隔离成可重复使用的函数，这些函数可以组合起来模拟 minimax 计算的动作。

因为你需要模拟，所以最好有一个独立的数据结构来表示游戏（棋盘）的状态，包括轮到谁玩（我使用一个 10 个字符的字符串，其中第一个角色是当前玩家，其他 9 个角色是棋盘的当前状态。

您需要实现的第一件事是模拟移动（返回新的游戏状态）的函数：

def playMove(game,toPos,fromPos):
    played =  [game[0] if i==toPos else "." if i==fromPos else p
               for i,p in enumerate(game)]
    return "".join(played)

你还需要一个切换当前播放器的功能。这需要分开，以便您可以在模拟对手的动作之前轻松评估当前玩家的结果：

def switchPlayer(game):
    return "XO"[game[0]=="X"]+game[1:]  # returns a new game state

为了递归地模拟走法，您需要一个函数来给出可能走法的列表。根据您的游戏规则，移动被定义为目标位置 (toPos) 和可选的原点位置 (fromPos)。对于前 3 个动作，将没有原点（让 fromPos = toPos for those）

def getMoves(game):
    player,*board = game
    moves = [(i,i) for i,p in enumerate(board,1) if p=="."]  # 1st 3 moves
    if board.count(player)==3:                               # moves 4+
        origins = [r for r,p in enumerate(board,1) if p==player ]
        moves   = [ (i,r) for i,_ in moves for r in origins] # add origins
    return moves

对于模拟和实际游戏，您需要一个函数来确定游戏是否获胜：

winPatterns = [(1,2,3),(4,5,6),(7,8,9),(1,4,7),(2,5,8),(3,6,9),(1,5,9),(3,5,7)]
def isWin(game):
    return any(all(game[i]==game[0] for i in p) for p in winPatterns)

对一步棋的结果进行评级可能是一些复杂的公式，但为了简单起见，我们只说获胜是 1，而不是获胜是零。 minimax 函数将递归地模拟动作，根据“最近的”决定性动作给它们一个评级（1 表示获胜，-1 表示失败）。

maxDepth = 6       # maximum depth = computer's skill
memo = dict()      # memoization (for performance)
def minimax(game,depth=maxDepth):
    if (game,depth) in memo:      # known rating at this depth 
        return memo[game,depth]   
    if isWin(game): return 1      # no moves after a win (base condition)
    if not depth: return 0        # limit depth (avoids infinite loops)
    oGame  = switchPlayer(game)   # will simulate opponent's moves
    oRatings = []                 # opponent's minimax ratings
    for move in getMoves(oGame):   
        played = playMove(oGame,*move)           # simulate
        oRatings.append(minimax(played,depth-1)) # get rating    
    rating = -max(oRatings or [1])               # worst countermove
    for d in range(1,depth+1):
        memo[game,d]=rating    # memorize for this depth and lower
    return rating

因为游戏数据结构是可散列的（即字符串），它可以用于记忆，大大提高了 minimax 函数的性能。

使用 minimax 函数，您可以确定计算机玩家（具有最高 minimax 评级的玩家）的最佳着法：

def bestMove(game):
    return max(getMoves(game),key=lambda m:minimax(playMove(game,*m)))

您可以在获得 max() 之前随机选择移动，当不止一个移动具有最高评分时随机选择最佳移动

将它们放在一个游戏循环中：

def printBoard(game):
    for r in range(1,10,3):
          print("("+") (".join(game[r:r+3])+")",list(range(r,r+3)))

from random import sample
while True:
    player,computer = sample("XOXO",2) # random starter, random computer
    game = player + "."*9
    while True:
        printBoard(game)
        if isWin(game): break
        game = switchPlayer(game)
        player = game[0]
        if player == computer:
            toPos,fromPos = bestMove(game)
            print(f"Computer plays {player} at {toPos}",end=" ")
            print(f"from {fromPos}"*(fromPos!=toPos))
        else:
            strMoves = [(str(t),str(f)) for t,f in getMoves(game)]
            while True:
                fromPos = toPos = input(f"Position to play for {player}: ")
                if not toPos in (f for f,_ in strMoves):
                    print("invalid position"); continue
                if game[1:].count(player)<3: break
                fromPos = input(f"Position to remove {player} from: ")
                if (toPos,fromPos) not in strMoves:
                    print("invalid move")
                else: break
        game = playMove(game,int(toPos),int(fromPos))
    print( f"{player}'s WIN !!!\n")
    if input("play again (y/n)") != "y": break
    print()

样本运行：

(.) (.) (.) [1, 2, 3]
(.) (.) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Computer plays O at 5 
(.) (.) (.) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Position to play for X: 1
(X) (.) (.) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Computer plays O at 3 
(X) (.) (O) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Position to play for X: 7
(X) (.) (O) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Computer plays O at 4 
(X) (.) (O) [1, 2, 3]
(O) (O) (.) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Position to play for X: 6
(X) (.) (O) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Computer plays O at 9 from 3
(X) (.) (.) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (O) [7, 8, 9]
Position to play for X: 3
Position to remove X from: 7
(X) (.) (X) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(.) (.) (O) [7, 8, 9]
Computer plays O at 2 from 4
(X) (O) (X) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(.) (.) (O) [7, 8, 9]
Position to play for X: 8
Position to remove X from: 3
(X) (O) (.) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(.) (X) (O) [7, 8, 9]
Computer plays O at 3 from 2
(X) (.) (O) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(.) (X) (O) [7, 8, 9]
Position to play for X: 7
Position to remove X from: 8
(X) (.) (O) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(X) (.) (O) [7, 8, 9]
Computer plays O at 4 from 9
(X) (.) (O) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Position to play for X: 
Position to play for X: 9
Position to remove X from: 1
(.) (.) (O) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (X) [7, 8, 9]
Computer plays O at 8 from 4
(.) (.) (O) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(X) (O) (X) [7, 8, 9]
Position to play for X: 2
Position to remove X from: 6
(.) (X) (O) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(X) (O) (X) [7, 8, 9]
Computer plays O at 1 from 3
(O) (X) (.) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(X) (O) (X) [7, 8, 9]
Position to play for X: 3
Position to remove X from: 7
(O) (X) (X) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (O) (X) [7, 8, 9]
Computer plays O at 6 from 8
(O) (X) (X) [1, 2, 3]
(.) (O) (O) [4, 5, 6]
(.) (.) (X) [7, 8, 9]
Position to play for X: 7
Position to remove X from: 2
(O) (.) (X) [1, 2, 3]
(.) (O) (O) [4, 5, 6]
(X) (.) (X) [7, 8, 9]
Computer plays O at 4 from 1
(.) (.) (X) [1, 2, 3]
(O) (O) (O) [4, 5, 6]
(X) (.) (X) [7, 8, 9]
O's WIN !!!

play again (y/n) ? n

TicTacToe 的 Minimax 算法（但每个玩家只能拥有 3 个 tacs）

Minimax Algorithm with TicTacToe (But each player can only have 3 tacs on board)

python

tic-tac-toe

minimax