Jan-18-2024, 06:35 PM
Your negmax function doesn't make any sense. It appears to be similar to an algorithm I'm used to called minimax with alpha-beta pruning, but I don't see how it could possibly work.
Often you will see this algorithm implemented as two functions: one to pick the best move for the computer player, and one to predict the move the opponent will take. Sometimes these are called minimize() and maximize(), but I am going to call them computer() and player(). computer() picks the best move for the computer to make. player() predicts the move the human player would most likely make in response.
from math import inf  # needed for the alpha/beta bounds

def computer(board, depth=11, alpha=-inf, beta=inf):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        # The game just ended on the human's move: bad for the computer.
        return (-evaluated, None)
    if depth == 0 or board.is_game_over():
        return (0, None)
    best = (-inf, None)
    for move in board.legal_moves:
        board.push(move)
        score, _ = player(board, depth - 1, alpha, beta)
        board.pop()
        if score > best[0]:
            best = (score, move)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best

def player(board, depth, alpha, beta):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        # The game just ended on the computer's move: good for the computer.
        return (evaluated, None)
    if depth == 0 or board.is_game_over():
        return (0, None)
    best = (inf, None)
    for move in board.legal_moves:
        board.push(move)
        score, _ = computer(board, depth - 1, alpha, beta)
        board.pop()
        if score < best[0]:
            best = (score, move)
        beta = min(beta, score)
        if alpha >= beta:
            break
    return best

Notice the subtle differences. In computer(), a game-over result is a bad outcome, because it means the last move, made by player(), resulted in a draw or a win for the human. In player(), a game-over result is a good outcome, because it means the computer just won. Your negmax() function does this same thing using the color argument, toggling the value between positive (a good outcome) and negative (a bad outcome).
If the game is not over, computer() tries all available moves and picks the move with the highest score/evaluation. player() does the opposite, searching for the move with the lowest score/evaluation. You try to do both jobs with negmax(), but your negmax code always searches for the highest score/evaluation. As written, your algorithm picks the best move for the computer on every ply. The computer will pick bad moves because it assumes its opponent is trying to throw the game.
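To see why always maximizing goes wrong, here is a small self-contained sketch (not your code, and not chess): it searches a tiny hand-built game tree once with proper minimax and once with the flawed always-maximize rule, and the two pick different moves.

```python
# A tiny hand-built game tree: internal nodes are lists of children,
# leaves are evaluations from the computer's point of view.
TREE = [
    [10, -50],   # after computer's move 0, the opponent can force -50
    [3, 5],      # after computer's move 1, the worst case is 3
]

def minimax(node, maximizing):
    if not isinstance(node, list):          # leaf
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

def always_max(node):
    # The flawed search: maximizes on every ply, assuming the
    # opponent also plays whatever is best for the computer.
    if not isinstance(node, list):
        return node
    return max(always_max(child) for child in node)

# Correct minimax picks move 1 (a guaranteed 3); always-max picks
# move 0, counting on the 10 that a sane opponent will never allow.
best_minimax = max(range(2), key=lambda m: minimax(TREE[m], False))
best_flawed = max(range(2), key=lambda m: always_max(TREE[m]))
print(best_minimax, best_flawed)  # 1 0
```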
You can combine the computer and opponent into one function.
def negmax(board, color=-1, depth=11, alpha=-inf, beta=inf):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        # color flips the sign: the same game-over score is good
        # for one side and bad for the other.
        return (color * evaluated, None)
    if depth == 0 or board.is_game_over():
        return (0, None)
    color = -color
    best = (-color * inf, None)  # -inf when maximizing, inf when minimizing
    for move in board.legal_moves:
        board.push(move)
        score, _ = negmax(board, color, depth - 1, alpha, beta)
        board.pop()
        if color > 0:
            if score > best[0]:
                best = (score, move)
            alpha = max(alpha, score)
        else:
            if score < best[0]:
                best = (score, move)
            beta = min(beta, score)
        if alpha >= beta:
            break
    return best

Notice that the comparison direction and the alpha/beta updates are both conditional on color. Your version of negmax does not implement all the logic that is conditional on color.
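Here is that combined search running end to end on a stand-in game, so you can see the shape of what it returns. The NimBoard class and evaluate_board below are made-up stand-ins for your chess board and your evaluation function (same push/pop/legal_moves/is_game_over interface, but the game is "take 1-3 stones, taking the last stone wins"), not python-chess.

```python
from math import inf

class NimBoard:
    # Stand-in for a chess board with the same minimal interface.
    def __init__(self, stones):
        self.stones = stones
        self.history = []

    @property
    def legal_moves(self):
        return range(1, min(3, self.stones) + 1)

    def push(self, move):
        self.history.append(move)
        self.stones -= move

    def pop(self):
        self.stones += self.history.pop()

    def is_game_over(self):
        return self.stones == 0

def evaluate_board(board, depth):
    # Stand-in evaluation: only scores finished games. The side
    # that just moved took the last stone and won.
    return 100 if board.is_game_over() else None

def negmax(board, color=-1, depth=11, alpha=-inf, beta=inf):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        return (color * evaluated, None)
    if depth == 0 or board.is_game_over():
        return (0, None)
    color = -color
    best = (-color * inf, None)
    for move in board.legal_moves:
        board.push(move)
        score, _ = negmax(board, color, depth - 1, alpha, beta)
        board.pop()
        if color > 0:
            if score > best[0]:
                best = (score, move)
            alpha = max(alpha, score)
        else:
            if score < best[0]:
                best = (score, move)
            beta = min(beta, score)
        if alpha >= beta:
            break
    return best

# With 5 stones the winning strategy is to take 1, leaving the
# opponent a lost position with 4 stones.
score, move = negmax(NimBoard(5))
print(score, move)  # 100 1
```

The search correctly foresees that leaving 4 stones loses for whoever moves next, which is exactly the alternating maximize/minimize behavior the always-maximize version lacks.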
One more thing. Your logic, and the logic in my functions, do nothing to reward winning in fewer moves. A win in 10 moves gets the same evaluation as a win in 1 move. You can use the "depth" to give more value to a faster win by adding the depth to the eval. Like this:
        return (color * (evaluated + depth), None)
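A quick check with made-up numbers shows why that works: depth counts down as the search recurses, so a win found sooner is evaluated while depth is still large and gets a bigger bonus.

```python
WIN = 100  # made-up evaluation for a won game

def terminal_score(evaluated, depth, color):
    # depth counts down from its starting value (11 above), so a faster
    # win is found at a higher remaining depth and scores higher
    return color * (evaluated + depth)

fast_win = terminal_score(WIN, depth=9, color=1)   # mate found 2 plies in
slow_win = terminal_score(WIN, depth=1, color=1)   # mate found 10 plies in
print(fast_win, slow_win)  # 109 101
```

Given the choice, the search now prefers the 109 over the 101, i.e. the quicker mate.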