strange behavior of chess library in Python

strange behavior of chess library in Python - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: strange behavior of chess library in Python (/thread-41454.html)

strange behavior of chess library in Python - max22 - Jan-18-2024

I have a rather simple chess endgame Python code implementinng negamax strategy given below but it behaves strange.
Negamax means that when both players play fully rationally the best move leading to the quickest mate or the longest defense by both sides.
It prints correctly this position, but the best move Kb2c2 is not even considered when I have printed all legal moves.
I'm even unable to force the code consider at any stage of searching the best move by white king K.
I would like if some can dig into the code and tell me what's wrong Huh

The output:

. . . . . . . .
. . . k . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. K . p . . . .
. . . . . . . .

Na tahu je: bílý (white to move in Czech language)
b2c3
<LegalMoveGenerator at 0x1d473d26b50 (Ke8, Kd8, Kc8, Ke7, Kc7, Ke6, Kd6, Kc6, d1=Q, d1=R, d1=B, d1=N+)>
<LegalMoveGenerator at 0x1d473d26250 (Kd4, Kc4, Kb4, Kd3, Kb3, Kxd2, Kc2, Kb2)>
<LegalMoveGenerator at 0x1d473d24050 (Kf8, Kd8, Kf7, Ke7, Kd7, d1=Q+, d1=R+, d1=B, d1=N)>
<LegalMoveGenerator at 0x1d473d25e10 (Ke5, Kd5, Kc5, Ke4, Kc4, Ke3, Kd3, Kc3)>
<LegalMoveGenerator at 0x1d473d25110 (Kg8, Ke8, Kg7, Kf7, Ke7, d1=Q, d1=R, d1=B, d1=N)>
<LegalMoveGenerator at 0x1d473d27110 (Kf6, Ke6, Kd6, Kf5, Kd5, Kf4, Ke4, Kd4)>

The code:

import chess
import time
import threading

# Globální proměnná pro sledování počtu prohledaných pozic
positions_count = 0

def update_positions_count(last_time_printed):
    global positions_count
    while not board.is_game_over():
        if time.time() - last_time_printed > 1:
            print(f"\rProhledaných pozic: {positions_count}", end='')
            last_time_printed = time.time()

def evaluate_board(board, depth):
    global positions_count
    positions_count += 1

    if board.is_checkmate():
        return 10000 - depth
    if board.is_stalemate() or board.can_claim_draw():
        return 0
    return None

# ... negamax, find_best_move ...
# Negamax algoritmus
def negamax(board, depth, alpha, beta, color):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        return color * evaluated

    if depth == 0 or board.is_game_over():
        return 0

    max_eval = float('-inf')
    print(board.legal_moves)
    for move in board.legal_moves:
        board.push(move)
        eval = -negamax(board, depth - 1, -beta, -alpha, -color)
        board.pop()
        max_eval = max(max_eval, eval)
        alpha = max(alpha, eval)
        if alpha >= beta:
            break

    return max_eval

 

def find_best_move(board, depth):
    best_move = None
    best_value = float('-inf')
    alpha = float('-inf')
    beta = float('inf')
    color = 1 if board.turn else -1

    print(f"Na tahu je: {'bílý' if board.turn else 'černý'}")

    for move in board.legal_moves:
        print(move.uci())  # Vypíše tahy ve formátu UCI
        if move.uci() == "c1c2":  # Příklad pro tah bílého krále
            print("HERE")
        board.push(move)
        board_value = -negamax(board, depth - 1, -beta, -alpha, -color)
        board.pop()

        if board_value > best_value:
            best_value = board_value
            best_move = move

    return best_move



# Hlavní část kódu
#start_position = "8/8/8/8/8/7Q/k7/2K5 w - - 0 1"
#start_position = "8/3k4/8/8/8/8/1K6/8 w - - 0 1"
start_position = "8/3k4/8/8/8/8/1K1p4/8 w - - 0 1"
board = chess.Board(start_position)
depth = 11  # Můžete zvážit snížení hloubky pro rychlejší výsledky
last_time_printed = time.time()

positions_count_thread = threading.Thread(target=update_positions_count, args=(last_time_printed,), daemon=True)
positions_count_thread.start()

print(board)
print()

while not board.is_game_over():
    best_move = find_best_move(board, depth)
    if best_move is not None:
        board.push(best_move)
        print("\n", board)  # Vytiskne šachovnici po provedení nejlepšího tahu
    else:
        print("Žádný další legální tah není možný.")
        break

print("\nKonec hry")

RE: strange behavior of chess library in Python - deanhystad - Jan-18-2024

Your negmax function doesn't make any sense. It appears to be similar to an algorithm I'm used to called minimax with alpha-beta pruning, but I don't see how it could possibly work.

Often you will see this algorithm implemented as two functions. One to pick the best move for the computer player, and the other to predict the move the opponent will take. Somtimes these are called minimize and maximize, but I am going to call the computer() and player(). Computer picks the best move for the computer to make. Player predicts the move the human player would most likely make in response to the computer.

def computer(board, depth=11, alpha=-inf, beta=inf):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        return -evaluated
 
    if depth == 0 or board.is_game_over():
        return 0

    best = (-inf, None)
    for move in board.legal_moves:
        board.push(move)
        eval = player(board, depth - 1, alpha, beta)
        board.pop()
        if eval > best[0]:
            best = (eval, move)
            alpha = max(alpha, eval)
            if alpha >= beta:
                break
 
    return best


def player(board, depth, alpha, beta):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        return evaluated
 
    if depth == 0 or board.is_game_over():
        return 0

    best = (inf, None)
    for move in board.legal_moves:
        board.push(move)
        eval = computer(board, depth - 1, alpha, beta)
        board.pop()
        if eval < best[0]:
            best = (eval, move)
            beta = min(beta, eval)
            if alpha >= beta:
                break
 
    return best

Notice the subtle differences. In computer(), a game winning result is a bad outcome, because it means the last move made by human() resulted in a draw or win for the human. In human(), a game winning result is a good outcome because it means the computer just won. Your negmax() function does this same thing using the color argument, toggling the value between positive (a good outcome) and negative (a bad outcome).

If the game is not over, computer() tries all available moves, and pics the move with the highest score/evaluation. player() does the opposite, searching for the move with the lowest score/evaluation. You try to do both the computer() and the player() function with negmax(), but your negmax code always searches for the highest score/evaluation. As written, your algorithm is always picking the best move for the computer. The computer will pick bad moves because its opponent is trying to throw the game.

You can combine the computer and opponent into one function.

def negmax(board, color=-1, depth=11, alpha=-inf, beta=inf):
    evaluated = evaluate_board(board, depth)
    if evaluated is not None:
        return color * evaluated

    if depth == 0 or board.is_game_over():
        return 0

    color = -color
    best = (color * inf, None)
    for move in board.legal_moves:
        board.push(move)
        eval = negmax(board, color, depth - 1, alpha, beta)
        board.pop()
        if color > 0:
            if eval > best[0]:
                best = (eval, move)
                alpha = max(alpha, eval)
        else:
            if eval < best[0]:
                best = (eval, move)
                beta = min(beta, eval)
        if alpha >= beta:
            break

    return best

Your version of negmax does not implement all the logic that is conditional based on color.

One more thing. Your logic, and the logic in my functions, do nothing to reward winning in fewer moves. A win in 10 moves gets the same evaluation as a win in 1 move. You can use the "depth" to give more value to a faster win by adding the depth to the eval. Like this:

        return color * (evaluated + depth)