DEV Community

Cover image for My experience in cs50 ai part 3 (Adversarial search)

My experience in cs50 ai part 3 (Adversarial search)

Debojyoti Chakraborty
I am a 3rd year cs undergraduate in India with a love in machine learning,be passionate in web dev and cyber security.Now keep exploring the computational neuroscience.
・4 min read

Previous part I wrote about the Bfs,dfs and other searches where in game there was only one player but we have a opponent you know that,so for 2 people game there exists some conflicts for that reason we need to implement adversarial search to make computer capable of playing such games with us.In this case let's take a game as example and we choose tic-tac-toe as example.

in adversarial search the algorithm faces an opponent that tries to achieve the opposite goal.So there is first algorithm we can use is minimax algorithm.

A type of algorithm in adversarial search, Minimax represents winning conditions as (-1) for one side and (+1) for the other side. Further actions will be driven by these conditions, with the minimizing side trying to get the lowest score, and the maximizer trying to get the highest score.

Representing a Tic-Tac-Toe AI:

  • S₀: Initial state (in our case, an empty 3X3 board)
    Players(s): a function that, given a state s, returns which player’s turn it is (X or O).

  • Actions(s): a function that, given a state s, return all the legal moves in this state (what spots are free on the board).

  • Result(s, a): a function that, given a state s and action a, returns a new state. This is the board that resulted from performing the action a on state s (making a move in the game).

  • Terminal(s): a function that, given a state s, checks whether this is the last step in the game, i.e. if someone won or there is a tie. Returns True if the game has ended, False otherwise.

  • Utility(s): a function that, given a terminal state s, returns the utility value of the state: -1, 0, or 1.
    How the algorithm works:

Recursively, the algorithm simulates all possible games that can take place beginning at the current state and until a terminal state is reached. Each terminal state is valued as either (-1), 0, or (+1).

Alt Text

Now minimax algorithm always tries to suppress the human player by calculating min/max value of the actions from the game tree.

Knowing based on the state whose turn it is, the algorithm can know whether the current player, when playing optimally, will pick the action that leads to a state with a lower or a higher value. This way, alternating between minimizing and maximizing, the algorithm creates values for the state that would result from each possible action. To give a more concrete example, we can imagine that the maximizing player asks at every turn: “if I take this action, a new state will result. If the minimizing player plays optimally, what action can that player take to bring to the lowest value?” However, to answer this question, the maximizing player has to ask: “To know what the minimizing player will do, I need to simulate the same process in the minimizer’s mind: the minimizing player will try to ask: ‘if I take this action, what action can the maximizing player take to bring to the highest value?’” This is a recursive process, and it could be hard to wrap your head around it; looking at the pseudo code below can help. Eventually, through this recursive reasoning process, the maximizing player generates values for each state that could result from all the possible actions at the current state. After having these values, the maximizing player chooses the highest one.

Now tic tac toe is a simple game which has 64 possible states so for large games like chess it has 2^64 possible states from which building a hierchical tree quiet impossible for solve this we have two main algorithm called alpha beta pruning and depth limited Minimax.

Alpha-Beta Pruning

A way to optimize Minimax, Alpha-Beta Pruning skips some of the recursive computations that are decidedly unfavorable. After establishing the value of one action, if there is initial evidence that the following action can bring the opponent to get to a better score than the already established action, there is no need to further investigate this action because it will decidedly be less favorable than the previously established one.

Depth-Limited Minimax

There is a total of 255,168 possible Tic Tac Toe games, and 10²⁹⁰⁰⁰ possible games in Chess. The minimax algorithm, as presented so far, requires generating all hypothetical games from a certain point to the terminal condition. While computing all the Tic-Tac-Toe games doesn’t pose a challenge for a modern computer, doing so with chess is currently impossible.

Depth-limited Minimax considers only a pre-defined number of moves before it stops, without ever getting to a terminal state. However, this doesn’t allow for getting a precise value for each action, since the end of the hypothetical games has not been reached. To deal with this problem, Depth-limited Minimax relies on an evaluation function that estimates the expected utility of the game from a given state, or, in other words, assigns values to states. For example, in a chess game, a utility function would take as input a current configuration of the board, try to assess its expected utility (based on what pieces each player has and their locations on the board), and then return a positive or a negative value that represents how favorable the board is for one player versus the other. These values can be used to decide on the right action, and the better the evaluation function, the better the Minimax algorithm that relies on it.

reference taken of the definition of the algorithms from the cs50 notes:

Discussion (0)