HowTo: Add Skill Level 3 to the tic-tac-toe game in Python

(Sorry it took so long to get back to the tic-tac-toe series. I got distracted.😉)

Recall that the post Make a tic-tac-toe game in Python made a basic tic-tac-toe (noughts and crosses) game. Because it moved randomly, it was fairly easy to beat.

This example improves those by adding Skill Level 3. At this level, the program uses the minimax algorithm to search the game tree for the best solution possible.

Minimax

In tic-tac-toe, a score is one of WIN, LOSE, or TIE. More generally, it's useful to have a few other scores like UNKOWN (we don't have enough information to know what will happen) and NONE (no move has been made yet).

This example's TicTacToeApp defines constants to represent possible scores like this:

class TicTacToeApp: # Define board values. SCORE_NONE = 0 # No move possible. SCORE_LOSE = 1 # Not good. SCORE_TIE = 2 # Better than losing. SCORE_UNKNOWN = 3 # We still *might* win. SCORE_WIN = 4 # Best outcome.

Philosophical Digression

There's actually an interesting philosophical idea here. Suppose you assume that a tie goes player X. Then there are two possibilities. First, player X can make a series of perfect moves to always force a win. Second, there is no way for player 1 to force a win, so there must be a way for player 2 to force a win.

For tic-tac-toe, that's pretty obvious. If you've played much (or even not all that much), then you probably know that player 1 cannot lose without making some mistakes.

The interesting thing is that this also applies to other deterministic games like chess. (It does not apply to non-deterministic games like poker or backgammon where randomness is a factor.) That means that for chess there is either (1) a way for player 1 to always win or (2) a way for player 2 to always win.

The catch is that there are so many possible moves and board configurations that no human (or computer at this point) can find a perfect strategy. This is related to computability theory and complexity theory in computer science. We know that it is conceivable that you could find a winning strategy because, in theory, you could enumerate every possible move and board position. In practice, the number of possible positions is ridiculously huge so, while this isn't an instance of the halting problem, it is a problem that's much too large to actually solve with today's techniques.

All of this means the simple LOW/TIE/UNKNOWN/WIN scores aren't sufficient. For those games, you need to calculate some sort of numeric estimate of a board's strength so you can compare results.

Minimax Code

def computer_move_level_3(self): '''Use minimax.''' # Get the best move. best_value, best_move = self.get_board_value( self.computer_char, self.player_char, 1, 9) if best_move is None: print(f'Best move: {best_move}, value: {best_value}') # Make a Lveel 2 move. self.computer_move_level_2() else: print(f'Best move: [{best_move[0]}, {best_move[1]}], value: {best_value}') # Take the move. self.squares[best_move[0]][best_move[1]]['text'] = self.computer_char

def get_board_value(self, player1, player2, depth, max_depth): '''Return the best value the board can have for player char.''' # If we are too deep, then we don't know. if depth > max_depth: return TicTacToeApp.SCORE_UNKNOWN, None # Get possible moves. moves = self.get_possible_moves() # If there are no moves, it's a tie. if len(moves) == 0: return TicTacToeApp.SCORE_TIE, None # Find the move that gives the other player the worst score. random.shuffle(moves) best_move = None player2_value = TicTacToeApp.SCORE_WIN for r, c in moves: # Have player1 take this square. self.squares[r][c]['text'] = player1 # See if this gives player1 a win. if self.is_winner(r, c): # This gives player1 a win and therefore player2 a loss. # Take this move. best_move = (r, c) player2_value = TicTacToeApp.SCORE_LOSE else: # Recursively try moves for player2. test_value, test_move = \ self.get_board_value(player2, player1, depth + 1, max_depth) # See if this is an improvement for player 2. if test_value <= player2_value: best_move = (r, c) player2_value = test_value # Unmake this move. self.squares[r][c]['text'] = '' # If player2 will lose, stop searching and take this move. if player2_value == TicTacToeApp.SCORE_LOSE: break # We now know the worst we can force player2 to do. # Convert that into a board value for player1. if player2_value == TicTacToeApp.SCORE_LOSE: best_value = TicTacToeApp.SCORE_WIN elif player2_value == TicTacToeApp.SCORE_WIN: best_value = TicTacToeApp.SCORE_LOSE else: best_value = player2_value return best_value, best_move

It first checks the current depth of recursion to see if we have exceeded the maximum allowed depth. This doesn't matter for program because it allows up to 9 levels of recursion, which is enough to examine every possible game. You could set the maximum depth to something smaller like 3 to create Skill Level 2.5. You might also want to adjust this for more complicated games like chess.

If the depth of recursion is greater than the maximum allowed, the method returns UNKNOWN. (If you look back at the computer_move_level_3 method, you'll see that it calls computer_move_level_2 in this case.)

If we have not exceeded the allowed depth, the program calls get_possible_moves to get a list of all possible moves.

If there are no more possible moves, then the board is filled and the game is a tie. (If it wasn't a tie, then the previous move would have won the game.)

If there are possible moves, the program randomizes them so it doesn't always make the same move in response to a given board position.

The method initializes player2_value to the best possible value that player 2 could get from the board: WIN. It then loops through the possible moves.

For each move, the code takes that move for player 1. If that move lets player 1 win, the program saves that move and sets player2_value to LOSE.

If the move doesn't give player 1 a win, the get_board_value method calls itself recursively to see what the best possible result could be for player 2. If that's a better result than the best result found so far, the code updates best_move and player2_value.

If the best value that player 2 can get from this move is a loss, that's the best possible outcome for player 1, so the code breaks out of the loop to take that move.

After it has check all of the moves, the method knows what the worst possible move is for player 2. It converts that into the best possible move for player 1 (for example, if player 2 wins, then player 2 loses) and returns the result.

Improvements

This isn't too hard for a game like tic-tac-toe, but the details can sometimes be tricky for more complex games where you need to use heuristics to estimate the value of a board position.

Minimax is pretty useful but it can cause a few weird effects. For example, in one that I call "giving up," the program notices that its opponent has a winning strategy. In that case, it doesn't matter what move it makes so it essentially moves randomly. For example, it might not move to block an immediate win because it assumes its opponent will move perfectly and it doesn't distinguish between losing in one move or losing in five moves. You could make it give a higher value to a later loss.

You may also notice that the program takes a few seconds to make its first move or two and then moves much more quickly after that. This happens because the game tree is bigger during the first few moves. he following table shows the number of times get_board_value is called when you make various first moves.

First Move	# Function Calls
Computer moves first	68,576
Upper middle	11,736
Center	5,598
Upper left	4,811

This only takes a few seconds, but could be far worse with a more complex game. You can avoid some of the startup time if you precompute the computer's responses to the possible first moves. Then the program can just look up its first move (or first several moves for a more complicated game) and greatly reduce the time needed in the beginning of a game.

Conclusion

Download the example to see all of the details and to experiment with it. Don't forget to switch to Skill level 3 or you'll wonder why the program plays so badly.

Title: Add Skill Level 3 to the tic-tac-toe game in Python

Minimax

Philosophical Digression

Minimax Code

Improvements

Conclusion