7
\$\begingroup\$

I am building a chess engine in C++ and currently working on the move generation. To measure performance I use perft, but my main goal is to make the move generator itself faster.

I already use several common optimizations, including:

  • Sliding attacks via magics
  • Full bitboard pipeline
  • Fast bit tricks in hot loops (x & -x, std::countr_zero, std::popcount)
  • Efficient data layout & reuse (flat arrays, precomputed masks)
  • No dynamic allocations

With these, I reach about 42M nodes/second in perft using this implementation:

size_t perft(int depth, bool printCountAfterMoves) {
    size_t nodes = 0;

    MoveList legalMoves = moveGenerator.generateLegalMoves();

    if (depth == 1) {
        return static_cast<size_t>(legalMoves.getMovesCount());
    }

    for (Move move : legalMoves) {
        position.makeMove(move);
        size_t count = perft(depth - 1, false);

        if (printCountAfterMoves) [[unlikely]] {
            std::cout << move.toLan() << ": " << count << '\n';
        }

        nodes += count;
        position.undoMove();
    }

    return nodes;
}

What I’m looking for

I’d like to know if there are further move generation optimizations possible without rewriting the entire engine. In particular:

  • Micro-optimizations that speed up hot loops in generateLegalMoves
  • Ways to reduce branching and redundant checks
  • Code patterns that improve speed while keeping things maintainable

My move generator

Here is my generateLegalMoves function:

MoveList MoveGenerator::generateLegalMoves(void) const {
    MoveList moveList;

    int kingSq = std::countr_zero(position->pieces[position->usColor * 6 + PT_KING]);
    assert(kingSq < 64);

    Bitboard occ = position->occForColor[WHITE] | position->occForColor[BLACK];

    Bitboard attackMask = computeAttackMask(occ);
    Bitboard checkerMask = computeCheckerMask(kingSq, occ);
    bool isInCheck = std::popcount(checkerMask) >= 1;
    bool isInDoubleCheck = std::popcount(checkerMask) == 2;
    // if in normal check and the checking piece is a slider piece, generate the block mask
    Bitboard checkBlockMask =
        isInCheck && !isInDoubleCheck &&
                (checkerMask & ~position->pieces[position->oppColor * 6 + PT_PAWN] &
                 ~position->pieces[position->oppColor * 6 + PT_KNIGHT])
            ? BETWEEN_MASK[kingSq][std::countr_zero(checkerMask)]
            : 0ULL;
    Bitboard checkEvasionMask = checkerMask | checkBlockMask;
    Bitboard pinMask = computePinMask(kingSq, occ);

    Bitboard capturableSquares = ~position->occForColor[position->usColor];
    if (!isInDoubleCheck) [[likely]] {
        // pawns
        Bitboard freeSquares = ~occ;
        Bitboard pawns = position->pieces[position->usColor * 6 + PT_PAWN];
        while (pawns) {
            Bitboard currPawn = pawns & -pawns;
            int currPawnSq = std::countr_zero(currPawn);

            if (position->usColor == WHITE) {
                // we are white
                // pushes
                Bitboard singlePush = WHITE_PAWN_SINGLE_PUSH_MASK[currPawnSq] & freeSquares;
                Bitboard doublePush = ((singlePush & RANK_3) << 8) & freeSquares;

                // captures
                Bitboard leftCapture = WHITE_PAWN_CAPTURE_LEFT_MASK[currPawnSq] &
                                       position->occForColor[position->oppColor];
                Bitboard rightCapture = WHITE_PAWN_CAPTURE_RIGHT_MASK[currPawnSq] &
                                        position->occForColor[position->oppColor];

                Bitboard normalMoves = singlePush | doublePush | leftCapture | rightCapture;

                // en-passant
                Bitboard ep =
                    isEpLegal(kingSq, occ, currPawn)
                        ? (WHITE_PAWN_CAPTURE_LEFT_MASK[currPawnSq] & position->epSquare) |
                              (WHITE_PAWN_CAPTURE_RIGHT_MASK[currPawnSq] & position->epSquare)
                        : 0ULL;

                if (isInCheck) {
                    normalMoves &= checkEvasionMask;

                    // if en-passant does not resolve the check, disallow it
                    if ((ep >> 8) != checkEvasionMask) {
                        ep = 0ULL;
                    }
                }

                if (currPawn & pinMask) {
                    normalMoves &= LINE_MASK[currPawnSq][kingSq];
                    ep &= LINE_MASK[currPawnSq][kingSq];
                }

                // add en-passant move
                if (ep) {
                    moveList.add(Move(currPawn, ep, PT_PAWN, PT_NULL, false, true));
                }

                // add each normal move
                while (normalMoves) {
                    Bitboard to = normalMoves & -normalMoves;

                    // promotion
                    if (to & RANK_8) {
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_KNIGHT, false, false));
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_BISHOP, false, false));
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_ROOK, false, false));
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_QUEEN, false, false));
                    }
                    else {
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_NULL, false, false));
                    }

                    normalMoves &= normalMoves - 1;
                }
            }
            else {
                // we are black
                // pushes
                Bitboard singlePush = BLACK_PAWN_SINGLE_PUSH_MASK[currPawnSq] & freeSquares;
                Bitboard doublePush = ((singlePush & RANK_6) >> 8) & freeSquares;

                // captures
                Bitboard leftCapture = BLACK_PAWN_CAPTURE_LEFT_MASK[currPawnSq] &
                                       position->occForColor[position->oppColor];
                Bitboard rightCapture = BLACK_PAWN_CAPTURE_RIGHT_MASK[currPawnSq] &
                                        position->occForColor[position->oppColor];

                Bitboard normalMoves = singlePush | doublePush | leftCapture | rightCapture;

                // en-passant
                Bitboard ep =
                    isEpLegal(kingSq, occ, currPawn)
                        ? (BLACK_PAWN_CAPTURE_LEFT_MASK[currPawnSq] & position->epSquare) |
                              (BLACK_PAWN_CAPTURE_RIGHT_MASK[currPawnSq] & position->epSquare)
                        : 0ULL;

                if (isInCheck) {
                    normalMoves &= checkEvasionMask;

                    // if en-passant does not resolve the check, disallow it
                    if ((ep << 8) != checkEvasionMask) {
                        ep = 0ULL;
                    }
                }

                if (currPawn & pinMask) {
                    normalMoves &= LINE_MASK[currPawnSq][kingSq];
                    ep &= LINE_MASK[currPawnSq][kingSq];
                }

                // add en-passant move
                if (ep) {
                    moveList.add(Move(currPawn, ep, PT_PAWN, PT_NULL, false, true));
                }

                // add each normal move
                while (normalMoves) {
                    Bitboard to = normalMoves & -normalMoves;

                    // promotion
                    if (to & RANK_1) {
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_KNIGHT, false, false));
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_BISHOP, false, false));
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_ROOK, false, false));
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_QUEEN, false, false));
                    }
                    else {
                        moveList.add(Move(currPawn, to, PT_PAWN, PT_NULL, false, false));
                    }

                    normalMoves &= normalMoves - 1;
                }
            }

            pawns &= pawns - 1;
        }

        // knights
        Bitboard knights = position->pieces[position->usColor * 6 + PT_KNIGHT];
        while (knights) {
            Bitboard currKnight = knights & -knights;
            Bitboard moves = KNIGHT_MOVE_MASK[std::countr_zero(currKnight)] & capturableSquares;

            // checks & pins
            if (isInCheck) {
                moves &= checkEvasionMask;
            }

            if (currKnight & pinMask) {
                moves &= LINE_MASK[std::countr_zero(currKnight)][kingSq];
            }

            addMovesToList(moveList, currKnight, moves, PT_KNIGHT);
            knights &= knights - 1;
        }

        // bishops
        Bitboard bishops = position->pieces[position->usColor * 6 + PT_BISHOP];
        while (bishops) {
            Bitboard currBishop = bishops & -bishops;
            Bitboard moves =
                getBishopAttacks(std::countr_zero(currBishop), occ) & capturableSquares;

            // checks & pins
            if (isInCheck) {
                moves &= checkEvasionMask;
            }

            if (currBishop & pinMask) {
                moves &= LINE_MASK[std::countr_zero(currBishop)][kingSq];
            }

            addMovesToList(moveList, currBishop, moves, PT_BISHOP);
            bishops &= bishops - 1;
        }

        // rooks
        Bitboard rooks = position->pieces[position->usColor * 6 + PT_ROOK];
        while (rooks) {
            Bitboard currRook = rooks & -rooks;
            Bitboard moves = getRookAttacks(std::countr_zero(currRook), occ) & capturableSquares;

            // checks & pins
            if (isInCheck) {
                moves &= checkEvasionMask;
            }

            if (currRook & pinMask) {
                moves &= LINE_MASK[std::countr_zero(currRook)][kingSq];
            }

            addMovesToList(moveList, currRook, moves, PT_ROOK);
            rooks &= rooks - 1;
        }

        // queens
        Bitboard queens = position->pieces[position->usColor * 6 + PT_QUEEN];
        while (queens) {
            Bitboard currQueen = queens & -queens;
            Bitboard moves = (getRookAttacks(std::countr_zero(currQueen), occ) |
                              getBishopAttacks(std::countr_zero(currQueen), occ)) &
                             capturableSquares;

            // checks & pins
            if (isInCheck) {
                moves &= checkEvasionMask;
            }

            if (currQueen & pinMask) {
                moves &= LINE_MASK[std::countr_zero(currQueen)][kingSq];
            }

            addMovesToList(moveList, currQueen, moves, PT_QUEEN);
            queens &= queens - 1;
        }
    }

    // king
    Bitboard kingMoves = KING_MOVE_MASK[kingSq] & capturableSquares & ~attackMask;
    addMovesToList(moveList, position->pieces[position->usColor * 6 + PT_KING], kingMoves, PT_KING);

    if (!isInCheck) {
        if (position->usColor == WHITE) {
            constexpr Bitboard E1 = 1ULL << 4, F1 = 1ULL << 5, G1 = 1ULL << 6, D1 = 1ULL << 3,
                               C1 = 1ULL << 2, B1 = 1ULL << 1;
            // we are white
            // king-side
            if (position->castlingRights & WHITE_KING_SIDE_CASTLE) {
                Bitboard between = F1 | G1;
                // squares empty && not attacked
                if (!(occ & between) && !(attackMask & between)) {
                    moveList.add(Move(E1, G1, PT_KING, PT_NULL, true, false));
                }
            }
            // queen-side
            if (position->castlingRights & WHITE_QUEEN_SIDE_CASTLE) {
                Bitboard between = B1 | C1 | D1;
                Bitboard passSquares = D1 | C1;
                if (!(occ & between) && !(attackMask & passSquares)) {
                    moveList.add(Move(E1, C1, PT_KING, PT_NULL, true, false));
                }
            }
        }
        else {
            constexpr Bitboard E8 = 1ULL << 60, F8 = 1ULL << 61, G8 = 1ULL << 62, D8 = 1ULL << 59,
                               C8 = 1ULL << 58, B8 = 1ULL << 57;
            // we are black
            // king-side
            if (position->castlingRights & BLACK_KING_SIDE_CASTLE) {
                Bitboard between = F8 | G8;
                if (!(occ & between) && !(attackMask & between)) {
                    moveList.add(Move(E8, G8, PT_KING, PT_NULL, true, false));
                }
            }
            // queen-side
            if (position->castlingRights & BLACK_QUEEN_SIDE_CASTLE) {
                Bitboard between = B8 | C8 | D8;
                Bitboard passSquares = D8 | C8;
                if (!(occ & between) && !(attackMask & passSquares)) {
                    moveList.add(Move(E8, C8, PT_KING, PT_NULL, true, false));
                }
            }
        }
    }

    return moveList;
}

For a complete MCVE, please see this repository.

Additional info on types

  • Move: Encapsulates a single move. Stores source square, destination square, moving piece type, optional promotion piece, and flags (castling, en passant, etc.). Provides helpers such as toLan().
  • MoveList: A lightweight container for Move objects. Uses flat storage (no dynamic allocations) and supports iteration and size queries.
  • Position: Represents the full board state. Holds piece bitboards, per-color occupancy, side to move, en passant square, and castling rights. Provides makeMove / undoMove for state transitions.
  • Bitboard: Alias for a 64-bit integer (uint64_t) used as a board mask. Each bit corresponds to a square.
  • PieceType: Enum for piece kinds: PT_NULL, PT_PAWN, PT_KNIGHT, PT_BISHOP, PT_ROOK, PT_QUEEN, PT_KING.
  • CastlingRights: Bitmask enum with individual flags for each castling possibility (white/black, king/queen side).

Precomputed masks and tables

  • BETWEEN_MASK[from][to]: Squares strictly between two given squares (empty if not aligned).
  • LINE_MASK[from][to]: Full ray line through two squares (including both).
  • XRay masks: For rooks and bishops, represent rays extending from a square while ignoring all blockers (used in pin/check calculations).
  • Rank masks: Constants like RANK_1, RANK_3, RANK_6, RANK_8 to quickly identify board rows.

Functions

  • getRookAttacks(sq, occ): Returns rook attack bitboard from square sq given occupancy occ using magic bitboards.
  • getBishopAttacks(sq, occ): Returns bishop attack bitboard from square sq given occupancy occ using magic bitboards.
  • addMovesToList(moveList, from, allMoves, pt): Expands a bitboard of destinations into Move objects and adds them to moveList.
  • Move::Move(...): Packs a move into a compact 32-bit layout using from, to, piece type, promotion, and flags.
  • MoveGenerator::generateLegalMoves(): Generates all legal moves for the side to move, accounting for checks, pins, en passant, and castling.
  • MoveGenerator::computeAttackMask(occ): Computes the opponent’s attack map (all squares attacked by the opponent).
  • MoveGenerator::computeCheckerMask(kingSq, occ): Returns a bitboard of opponent pieces currently giving check to the king.
  • MoveGenerator::computePinMask(kingSq, occ): Computes pinned friendly pieces by scanning for enemy sliders aligned with the king.
  • MoveGenerator::isEpLegal(kingSq, occ, capturingPawn): Checks whether an en passant capture is legal by verifying it does not expose the king to a horizontal check.
\$\endgroup\$
5
  • \$\begingroup\$ It's hard to review this code without any information on the types used (MoveList, BitBoard, etc.) - could you add more context to assist reviewers? \$\endgroup\$ Commented Sep 10 at 8:34
  • \$\begingroup\$ @TobySpeight I have updated the question \$\endgroup\$ Commented Sep 10 at 8:49
  • 2
    \$\begingroup\$ When you incorporate advice from answers and want an additional review, please post a new question with a link back to the original question. Please do not edit the question, especially the code, after an answer has been posted. Changing the question may cause answer invalidation. Everyone needs to be able to see what the reviewer was referring to. What to do after the question has been answered. \$\endgroup\$ Commented Sep 10 at 12:03
  • \$\begingroup\$ printCountAfterMoves should probably be a NTTP to minimize the impact of the if branch on the measured performance. \$\endgroup\$ Commented Sep 11 at 18:10
  • \$\begingroup\$ I vote for including class MoveList above - and not accepting further answer 'til then. \$\endgroup\$ Commented Sep 17 at 12:52

1 Answer 1

8
\$\begingroup\$

This check (no pun intended) is done repeatedly:

// checks & pins
if (isInCheck) {
    moves &= checkEvasionMask;
}

If you set up the evasion mask so that it is -1 when not in check, the bitwise AND can be unconditional. That may or may not help, depending on what the code got compiled to.

For pawns, the check if (position->usColor == WHITE) could be lifted out of the loop, the compiler may already be doing that but you can confirm that by reading the assembly.

I noticed that various computations are heavily based on array lookups, that's not necessarily bad, but it's also not necessarily cheap to pull something out of an array, depending on the size and access pattern. ROOK_ATTACK_MASK and BISHOP_ATTACK_MASK are fairly large and there are various alternatives for those sliding piece attacks, that may help you fit the relevant tables in L1 cache or in some cases eliminate them, of course this should all be informed by benchmarks not merely by my armchair speculation.

// find which piece type
for (uint8_t pt = 0; pt < 6; ++pt) {
    if (pieces[oppColor * 6 + pt] & hit) {
        capturedType = pt;
        pieces[oppColor * 6 + pt] ^= hit;
        occForColor[oppColor] ^= hit;
        break;
    }
}

It's rarely useful for a loop counter to be a narrow type, probably harmless in this case, but not always (especially if the type is narrow and signed, which in this case it isn't). You may as well use size_t by default though. (this rule of thumb may not apply to embedded programming on 8-bit MCUs)

SIMD could be used to rewrite this loop, for example with AVX512 it would be possible to do the bitwise AND across all piece types at once, get the result of that comparison as a packed bitmask, then you can find the index with std::countr_zero. This costs fewer operations in what used to be the worst case, maybe more in the best case (but different instructions with different costs so a direct comparison doesn't make sense), and gets rid of a possibly hard to predict data-dependent loop exit, anyway I cannot guarantee that it's worth doing. With 256-bit SIMD like AVX2 or 128-bit SIMD like SSE4 or ARM NEON it's also possible but it would take more instructions to do the same thing so it's even less clear if that could help at all.

\$\endgroup\$
1
  • \$\begingroup\$ I added the make/undo move function code. There most likely somethign to optimize there \$\endgroup\$ Commented Sep 10 at 11:43

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.