Tic Tac Toe 的 Minimax 算法 [python]

Question

我正在尝试在我的井字游戏中实现极小极大算法。我看了几个视频，用 minimax 算法分析了多个程序，我想我现在知道它是如何工作的了。我的程序正在运行，但算法似乎不知道他在做什么。它在板上输出焊盘，但不会阻止我或试图获胜。就像它是随机的。如果有人可以看看我的 minimax 算法并告诉我哪里出了问题，那就太好了！也很高兴告诉我我的解释有什么问题，不要只是投反对票。

from copy import deepcopy


class Board:
    def __init__(self, board=None):
        self.winning_combos = (
            [0, 1, 2], [3, 4, 5], [6, 7, 8],
            [0, 3, 6], [1, 4, 7], [2, 5, 8],
            [0, 4, 8], [2, 4, 6])

        if board is not None:
            self.board = board
        else:
            self.board = [None for i in range(9)]

    def check_combos(self):
        """ checks every combo if its used """
        for symbol in ['X', 'O']:
            for win_comb in self.winning_combos:
                sum = 0
                for field in win_comb:
                    if self.board[field] == symbol:
                        sum += 1

                if sum == 3:
                    return symbol

        return None

    def complete(self):
        """ check if the game is complete, caused by win or draw """
        cc = self.check_combos()
        if cc is not None:
            return cc

        if len(self.empty_pads()) <= 0:
            return "DRAW"

        return False

    def show(self):
        """ print board """
        print(str(self.board[0:3]) + "\n" +
              str(self.board[3:6]) + "\n" +
              str(self.board[6:9]))

    def empty_pads(self):
        """ returns list with indexes of every unused/empty field/pad """
        list = []
        for pad in range(len(self.board)):
            if self.board[pad] is None:
                list.append(pad)

        return list

    def set(self, position, player):
        """ sets the players symbol on the given position """
        self.board[position] = player

    def copy(self):
        return deepcopy(self)


def get_enemy_player(player):
    if player == 'X':
        return 'O'
    return 'X'


def get_player_value(player):
    """ X = max, O = min """
    if player == 'X':
        return 1
    else:
        return -1


def get_player_by_value(value):

    if value == -1:
        return "O"
    elif value == 1:
        return "X"
    else:
        return "NONE"


def max_v(node):
    if node.depth == 0 or node.board.complete():
        return get_player_value(node.board.complete())

    bestVal = -100
    for child in node.children:
        v = minimax(child)
        if v >= bestVal:
            bestVal = v
            node.bestmove = child.move

    return bestVal


def min_v(node):
    if node.depth == 0 or node.board.complete():
        return get_player_value(node.board.complete())

    bestVal = 100
    for child in node.children:
        v = minimax(child)
        if v <= bestVal:
            bestVal = v
            node.bestmove = child.move

    return bestVal


def minimax(node):
    if node.depth == 0 or node.board.complete():
        return get_player_value(node.board.complete())

    if get_player_value(node.player) == 1:
        return max_v(node)
    elif get_player_value(node.player) == -1:
        return min_v(node)


class Node:
    def __init__(self, depth, player, board, pad):
        self.depth = depth
        self.player = player
        self.board = board
        self.move = pad
        self.board.set(pad, self.player)
        self.bestmove = int

        self.children = []
        self.CreateChildren()

    def CreateChildren(self):
        if self.depth > 0 and not self.board.complete():
            for index in self.board.empty_pads():
                board = self.board.copy()
                self.children.append(Node(self.depth - 1, get_enemy_player(self.player), board, index))

if __name__ == "__main__":
    board = Board()
    board.show()

    while not board.complete():
        player = 'X'
        player_move = int(input('Move: ')) - 1
        if player_move not in board.empty_pads():
            continue
        board.set(player_move, player)
        board.show()

        if board.complete():
            break
        player = get_enemy_player(player)
        node = Node(9, player, board.copy(), player_move)
        minmax = minimax(node)
        print(node.bestmove+1)
        for child in node.children:
            print("move: " + str(child.move + 1) + " --> " + get_player_by_value(minmax) + " win")

        board.set(node.bestmove, player)
        board.show()

    print(board.complete())

PS：我知道为什么 "moves: " 输出总是一样，但这不是重点。

Answer 1

我发现您的程序存在多个问题。

至于你的实际问题：你的程序表现得好像计算机不区分输和平。在您的代码中，我找不到您为平局分配 0 的值，而您似乎为赢分配了 1，为输分配了 -1。您的代码应该宁愿平局也不愿输，但它认为没有区别。这就是它看起来 "Like it's random" 的原因。我这里的分析可能不对，但是下面的问题解释了为什么我很难说。

您的风格应该改进，以提高可读性和易维护性并避免错误。你的代码对我来说太难理解了，部分原因是...

您的评论太少，除了您以外，其他人无法理解代码的作用。过几个月你就记不住了，所以在代码里记下来。
你的空行太多，违反了PEP8，让屏幕上的代码更难看清。
你没有足够认真地对待你的输出，正如你所说 "ou[t]put is always the same, but that's not the point." 任何人，包括你，都很难说出没有良好输出的代码中发生了什么。处理它，并添加一些临时打印或日志记录语句，告诉您更多关于内部发生的事情。
您的一些例程 return 不同类型的值。 complete() 函数有时 return 是字符串 "DRAW"，有时是布尔值 False，有时是来自 self.check_combos() 的值，无论是什么类型。您的例程 max_v() 和 min_v() 有时 return 来自 get_player_value() 的字符串值，有时来自变量 bestVal.

Tic Tac Toe 的 Minimax 算法 [python]

Minimax Alogrithm for TicTacToe [python]

python

algorithm

tic-tac-toe

minimax

minmax