为什么我的 Alpha-Beta 修剪会扩展比必要更多的节点？

Question

我目前正在为极小极大函数实现 alpha-beta 剪枝算法。此练习对应于伯克利大学 PacMan 项目的多智能体部分。

我的实现：

class AlphaBetaAgent(MultiAgentSearchAgent):
    """
    Your minimax agent with alpha-beta pruning (question 3)
    """

    def getAction(self, gameState):
        """
        Returns the minimax action using self.depth and self.evaluationFunction
        """
        "*** YOUR CODE HERE ***"
        def maxValue(state, depth: int, alpha: int, beta: int):
            depth += 1
            if state.isWin() or state.isLose() or depth == self.depth:
                return self.evaluationFunction(state)
            else:
                value: int = -sys.maxsize - 1
                legalActions: list = state.getLegalActions(0)
                for action in legalActions:
                    value = max(value, minValue(
                        state.generateSuccessor(0, action), 1, depth, alpha, beta))
                    if beta <= value:
                        return value
                    alpha = max(alpha,value)
            return value

        def minValue(state, ghostIndex: int, depth: int, alpha: int, beta: int):
            if state.isWin() or state.isLose():
                return self.evaluationFunction(state)
            else:
                value: int = sys.maxsize * 2 + 1
                legalActions: list = state.getLegalActions(ghostIndex)
                if ghostIndex == state.getNumAgents() - 1:
                    for action in legalActions:
                        value = min(value, maxValue(
                            state.generateSuccessor(ghostIndex, action), depth, alpha, beta))
                else:
                    for action in legalActions:
                        value = min(value, minValue(state.generateSuccessor(
                            ghostIndex, action), ghostIndex + 1, depth, alpha, beta))
                        if alpha >= value:
                            return value
                        beta = min (beta, value)
                return value

        pacManMoves: list = gameState.getLegalActions(0)
        value: int = -sys.maxsize - 1
        alpha: int = -sys.maxsize - 1
        beta: int = sys.maxsize*2+1
        move: str = Directions.STOP
        for currentMove in pacManMoves:
            currentValue = minValue(
                gameState.generateSuccessor(0, currentMove), 1, 0, alpha, beta)
            if (currentValue > value):
                value = currentValue
                move = currentMove
        return move

输出：

Question q3
===========

*** PASS: test_cases\q3[=11=]-eval-function-lose-states-1.test
*** PASS: test_cases\q3[=11=]-eval-function-lose-states-2.test
*** PASS: test_cases\q3[=11=]-eval-function-win-states-1.test
*** PASS: test_cases\q3[=11=]-eval-function-win-states-2.test
*** FAIL: test_cases\q3[=11=]-lecture-6-tree.test
***     Incorrect generated nodes for depth=2
***         Student generated nodes: A B C D E F G H I max min1 min2 min3
***         Correct generated nodes: A B C D E F G H max min1 min2 min3
***     Tree:
***                max
***           /-/   |   \--\
***          /      |       \
***         /       |        \
***      min1      min2      min3
***       /|\      /|\       /|\
***      / | \    / | \     / | \
***     A  B  C  D  E  F   G  H  I
***     3 12  8  5  4  6  14  1  11
*** FAIL: test_cases\q3[=11=]-small-tree.test
***     Incorrect generated nodes for depth=3
***         Student generated nodes: A B C D deeper minLeft minRight root
***         Correct generated nodes: A B C minLeft minRight root
***     Tree:
***             root
***            /    \
***      minLeft  minRight
***       / \       /    \
***      A   B     C   deeper
***      4   3     2     |
***                      D
***                    1000
*** FAIL: test_cases\q3-1-minmax.test
***     Incorrect generated nodes for depth=3
***         Student generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***         Correct generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \                 |
***        c1        c2             cx
***       /  \      /  \             |
***      d1   d2  d3   d4           dx
***     -3    -9  10    6         -3.01
***
***     a - max
***     b - min
***     c - max
***
***     Note that the minimax value of b1 is -3.
*** FAIL: test_cases\q3-2-minmax.test
***     Incorrect generated nodes for depth=3
***         Student generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***         Correct generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \                 |
***        c1        c2             cx
***       /  \      /  \             |
***      d1   d2  d3   d4           dx
***     -3    -9  10    6         -2.99
***
***     a - max
***     b - min
***     c - max
***
***     Note that the minimax value of b1 is -3.
*** FAIL: test_cases\q3-3-minmax.test
***     Incorrect generated nodes for depth=3
***         Student generated nodes: a b1 b2 c3 c4 cx d5 d6 d7 d8 dx
***         Correct generated nodes: a b1 b2 c3 cx d5 d6 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***              |                /    \
***             cx             c3        c4
***              |            /   \     /   \
***             dx           d5   d6   d7   d8
***            4.01          4    -7   0    5
***
***     a - max
***     b - min
***     c - max
***
***     Note that the minimax value of b2 is 4.
*** PASS: test_cases\q3-4-minmax.test
*** FAIL: test_cases\q3-5-minmax.test
***     Incorrect generated nodes for depth=4
***         Student generated nodes: A B C D E F G H Z a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***         Correct generated nodes: A B C D E F G Z a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***     Tree:
***                  /-----a------\
***                 /              \
***                /                \
***              b1                  b2
***            /    \                |
***         c1        c2             cx
***        /  \      /  \            |
***       d1   d2  d3   d4           dx
***      / \  / \  / \  / \          |
***      A B  C D  E F  G H          Z
***     -3 13 5 9 10 3 -6 8        3.01
***
***     a - max
***     b - min
***     c - max
***     d - min
***
***     Note the minimax value of b1 is 3.
*** FAIL: test_cases\q3-6-minmax.test
***     Incorrect generated nodes for depth=4
***         Student generated nodes: A B C D E F G H Z a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***         Correct generated nodes: A B C D E F G Z a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***     Tree:
***                  /-----a------\
***                 /              \
***                /                \
***              b1                  b2
***            /    \                |
***         c1        c2             cx
***        /  \      /  \            |
***       d1   d2  d3   d4           dx
***      / \  / \  / \  / \          |
***      A B  C D  E F  G H          Z
***     -3 13 5 9 10 3 -6 8        2.99
***
***     a - max
***     b - min
***     c - max
***     d - min
***
***     Note the minimax value of b1 is 3.
*** FAIL: test_cases\q3-7-minmax.test
***     Incorrect generated nodes for depth=4
***         Student generated nodes: I J K L M O P Z a b1 b2 c3 c4 cx d5 d6 d7 d8 dx
***         Correct generated nodes: I J K M O P Z a b1 b2 c3 c4 cx d5 d6 d7 d8 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***             |                 /    \
***             cx             c3        c4
***             |             /   \     /   \
***             dx           d5   d6   d7   d8
***             |           / \  / \  / \  / \
***             Z           I J  K L  M N  O P
***          -1.99        -1 -9  4 7  2 5 -3 -2
***
***     a - max
***     b - min
***     c - min
***     d - max
***
***     Note that the minimax value of b2 is -2
*** FAIL: test_cases\q3-8-minmax.test
***     Incorrect generated nodes for depth=4
***         Student generated nodes: I J K L M O P Z a b1 b2 c3 c4 cx d5 d6 d7 d8 dx
***         Correct generated nodes: I J K M O P Z a b1 b2 c3 c4 cx d5 d6 d7 d8 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***             |                 /    \
***             cx             c3        c4
***             |             /   \     /   \
***             dx           d5   d6   d7   d8
***             |           / \  / \  / \  / \
***             Z           I J  K L  M N  O P
***          -2.01        -1 -9  4 7  2 5 -3 -2
***
***     a - max
***     b - min
***     c - min
***     d - max
***
***     Note that the minimax value of b2 is -2.01
*** PASS: test_cases\q3-1a-vary-depth.test
*** FAIL: test_cases\q3-1b-vary-depth.test
***     Incorrect generated nodes for depth=2
***         Student generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***         Correct generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \                 |
***     -4 c1        c2 9           cx -4.01
***       /  \      /  \             |
***      d1   d2  d3   d4           dx
***     -3    -9  10    6         -4.01
***
***     a - max
***     b - min
***     c - max
***
***     Note that the minimax value of b1 is -3, but the depth=1 limited value is -4.
***     The values next to c1, c2, and cx are the values of the evaluation function, not
***     necessarily the correct minimax backup.
*** PASS: test_cases\q3-2a-vary-depth.test
*** FAIL: test_cases\q3-2b-vary-depth.test
***     Incorrect generated nodes for depth=2
***         Student generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 d4 dx
***         Correct generated nodes: a b1 b2 c1 c2 cx d1 d2 d3 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \                 |
***     -4 c1        c2 9           cx -3.99
***       /  \      /  \             |
***      d1   d2  d3   d4           dx
***     -3    -9  10    6         -3.99
***
***     a - max
***     b - min
***     c - max
***
***     Note that the minimax value of b1 is -3, but the depth=1 limited value is -4.
***     The values next to c1, c2, and cx are the values of the evaluation function, not
***     necessarily the correct minimax backup.
*** PASS: test_cases\q3-3a-vary-depth.test
*** FAIL: test_cases\q3-3b-vary-depth.test
***     Incorrect generated nodes for depth=2
***         Student generated nodes: a b1 b2 c3 c4 cx d5 d6 d7 d8 dx
***         Correct generated nodes: a b1 b2 c3 cx d5 d6 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***              |                /    \
***      5.01   cx          8  c3        c4   5
***              |            /   \     /   \
***             dx           d5   d6   d7   d8
***            5.01          4    -7   0    5
***
***     a - max
***     b - min
***     c - max
***
***     Note that the minimax value of b1 is 4, but the depth=1 limited value is 5.
***     The values next to c3, c4, and cx are the values of the evaluation function, not
***     necessarily the correct minimax backup.
*** PASS: test_cases\q3-4a-vary-depth.test
*** FAIL: test_cases\q3-4b-vary-depth.test
***     Incorrect generated nodes for depth=2
***         Student generated nodes: a b1 b2 c3 c4 cx d5 d6 d7 d8 dx
***         Correct generated nodes: a b1 b2 c3 cx d5 d6 dx
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***              |                /    \
***      4.99   cx          8  c3        c4   5
***              |            /   \     /   \
***             dx           d5   d6   d7   d8
***            4.99          4    -7   0    5
***
***     a - max
***     b - min
***     c - max
***
***     Note that the minimax value of b1 is 4, but the depth=1 limited value is 5.
***     The values next to c3, c4, and cx are the values of the evaluation function, not
***     necessarily the correct minimax backup.
*** FAIL: test_cases\q3-one-ghost-3level.test
***     Incorrect generated nodes for depth=3
***         Student generated nodes: a b1 b2 c1 c2 c3 c4 d1 d2 d3 d4 d5 d6 d7 d8
***         Correct generated nodes: a b1 b2 c1 c2 c3 d1 d2 d3 d5 d6
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \              /     \
***        c1        c2        c3        c4
***       /  \      /  \      /   \     /   \
***      d1   d2  d3   d4    d5   d6   d7   d8
***      3    9   10   6     4    7    0    5
***
***     a - max
***     b - min
***     c - max
*** FAIL: test_cases\q3-one-ghost-4level.test
***     Incorrect generated nodes for depth=4
***         Student generated nodes: A B C D E F G H I J K L M N O P a b1 b2 c1 c2 c3 c4 d1 d2 d3 d4 d5 d6 d7 d8
***         Correct generated nodes: A B C D E F I K a b1 b2 c1 c2 c3 d1 d2 d3 d5 d6
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \              /     \
***        c1        c2        c3        c4
***       /  \      /  \      /   \     /   \
***      d1   d2  d3   d4    d5   d6   d7   d8
***     / \  / \  / \  / \   / \  / \  / \  / \
***     A B  C D  E F  G H   I J  K L  M N  O P
***     3 13 5 9 10 11 6 8   1 0  4 7 12 15 2 14
***
***     a - max
***     b - min
***     c - max
***     d - min
*** FAIL: test_cases\q3-two-ghosts-3level.test
***     Incorrect generated nodes for depth=3
***         Student generated nodes: a b1 b2 c1 c2 c3 c4 d1 d2 d3 d4 d5 d6 d7 d8
***         Correct generated nodes: a b1 b2 c1 c2 c3 c4 d1 d2 d3 d4 d5 d6 d7
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \              /     \
***        c1        c2        c3        c4
***       /  \      /  \      /   \     /   \
***      d1   d2  d3   d4    d5   d6   d7   d8
***      3    9   10   6     4    7    0    5
***
***     a - max
***     b - min
***     c - min
*** FAIL: test_cases\q3-two-ghosts-4level.test
***     Incorrect generated nodes for depth=4
***         Student generated nodes: A B C D E G H I J K L M O a b1 b2 c1 c2 c3 c4 d1 d2 d3 d4 d5 d6 d7 d8
***         Correct generated nodes: A B C D E G H I J a b1 b2 c1 c2 c3 d1 d2 d3 d4 d5
***     Tree:
***                 /-----a------\
***                /              \
***               /                \
***             b1                  b2
***           /    \              /     \
***        c1        c2        c3        c4
***       /  \      /  \      /   \     /   \
***      d1   d2  d3   d4    d5   d6   d7   d8
***     / \  / \  / \  / \   / \  / \  / \  / \
***     A B  C D  E F  G H   I J  K L  M N  O P
***     3 13 5 9 10 11 6 8   1 0  4 7 12 15 2 14
***
***     a - max
***     b - min
***     c - min
***     d - max
*** PASS: test_cases\q3-tied-root.test
*** PASS: test_cases\q3-1a-check-depth-one-ghost.test
*** PASS: test_cases\q3-1b-check-depth-one-ghost.test
*** PASS: test_cases\q3-1c-check-depth-one-ghost.test
*** PASS: test_cases\q3-2a-check-depth-two-ghosts.test
*** PASS: test_cases\q3-2b-check-depth-two-ghosts.test
*** PASS: test_cases\q3-2c-check-depth-two-ghosts.test
*** Running AlphaBetaAgent on smallClassic 1 time(s).
Pacman died! Score: 84
Average Score: 84.0
Scores:        84.0
Win Rate:      0/1 (0.00)
Record:        Loss
*** Finished running AlphaBetaAgent on smallClassic after 15 seconds.
*** Won 0 out of 1 games. Average score: 84.000000 ***
*** FAIL: test_cases\q3-pacman-game.test
***     Bug: Wrong number of states expanded.
*** Tests failed.

练习中有很多功能已经给出，我没有修改。为练习目的而修改的唯一函数是上面提到的函数。

源代码： https://github.com/MaAlonsoA/AI-Class/tree/feature/AlphaBetaAgent

运行自动评分器 python autograder.py -q q3

运行在 python 3.9.4

Answer 1

解法：

我之前的实现有两个bug：

之前在minValue函数中计算了所有'legal actions'的值，如果只剩下一个幽灵的话。这是不正确的，你只需要计算一次。

        legalActions: list = state.getLegalActions(ghostIndex)
        if ghostIndex == state.getNumAgents() - 1:
            for action in legalActions:
                value = min(value, maxValue(
                    state.generateSuccessor(ghostIndex, action), depth, alpha, beta))
        else:

固定：

        legalActions: list = state.getLegalActions(ghostIndex)
        for action in legalActions:
            if ghostIndex == state.getNumAgents() - 1:
                value = min(value, maxValue(
                        state.generateSuccessor(ghostIndex, action), depth, alpha, beta))
                if alpha > value:
                    return value
                beta = min (beta, value)
            else:

需要检查currentValue是否大于beta。

    for currentMove in pacManMoves:
    currentValue = minValue(
        gameState.generateSuccessor(0, currentMove), 1, 0, alpha, beta)
    if (currentValue > value):
        value = currentValue
        move = currentMove
return move

固定：

        for currentMove in pacManMoves:
        currentValue = minValue(
            gameState.generateSuccessor(0, currentMove), 1, 0, alpha, beta)
        if currentValue > value:
            value = currentValue
            move = currentMove
        if currentValue > beta:
            return move
        alpha = max(alpha, value)
    return move

为什么我的 Alpha-Beta 修剪会扩展比必要更多的节点？

Why does my Alpha-Beta pruning expand more nodes than necessary?

python

pacman

minimax

alpha-beta-pruning