# Problems with Minimax + Alpha-Beta Pruning : AI Gone Wild!

I’m making a game based on a chess-like game called Hnefatafl (Nef-Uh-Taf-Ul), an old ancestor of chess played in Viking Age Scandinavia. My game’s functional skeleton is almost done (as far as a minimum viable product goes, at any rate).
However, the AI for an opposing player seems very broken and I’m pretty stumped on how to fix it. The algorithm I’m using is Minimax with Alpha-Beta pruning. The relevant code is below.

## Relevant AI Code

``````
/// <summary>
/// Generates a MiniMax tree data structure that a game playing agent can use.
/// </summary>
/// <param name="depth"> Decremented at every level. If this hits zero, generation ends. </param>
/// <param name="isMaximizing"> Toggles between each level, basically simulating the plays of the "other" team (because the other team will seek to minimize your score). THIS IS ASSUMED TO BE THE DEFENDERS! </param>
public void GenerateTree(int depth, bool isMaximizing)
{
    // Whenever this is called, we want to reset MinMaxNode's record of how many nodes there are.
    MinMaxNode.nextID = 0;

    int alpha = 0; //int.MinValue;
    int beta = 0;  //int.MaxValue;
    int finalRootValue = AlphaBeta(this.root, ref alpha, ref beta, depth, isMaximizing);
}

/// <summary>
/// The actual work of generating the tree is done here.
/// </summary>
/// <param name="node"> The node the function is considering at this recursion level.</param>
/// <param name="alpha"> The value of the best move for Max found so far.</param>
/// <param name="beta"> The value of the best move for Min (scored from Max's perspective) found so far.</param>
/// <param name="depth"> A record of how many recursion levels we've gone down. Generation is cut off when this reaches zero.</param>
/// <param name="isMaximizing"> Is it Max's turn or Min's turn?</param>
private int AlphaBeta(MinMaxNode node, ref int alpha, ref int beta, int depth, bool isMaximizing)
{
    Debug.Log($"GenerateTree - Node {node.id}, Depth {depth}, Is Max: {isMaximizing}");

    // For the terminal states, the winning team determines which heck-load of a score we assign to the node.
    //  If the defenders have won, then we give this node a positive infinity score.
    //  If the attackers have won, then we give this node a negative infinity score.
    GameTeamSO winningTeam = node.state.IsTerminal();
    if (winningTeam == TaflGameController.Instance.GetDefenderTeam())
    {
        int nodeScore = int.MaxValue;
        node.setScore(nodeScore);
        return nodeScore;
    }
    else if (winningTeam == TaflGameController.Instance.GetAttackerTeam())
    {
        int nodeScore = int.MinValue;
        node.setScore(nodeScore);
        return nodeScore;
    }

    // Non-terminal state, but we're at the end of our recursion rope.
    if (depth == 0)
    {
        Debug.Log("Reached depth 0, using heuristics!");
        int nodeScore = GetHeuristicScore(node.state);
        node.setScore(nodeScore);
        return nodeScore;
    }

    // Handle Max's turn.
    if (isMaximizing)
    {
        int nodeScore = int.MinValue;

        List<AIAction> possibleActions = GetAllPossibleActions(node.state);
        Debug.Log($"Max: There are {possibleActions.Count} possible actions from node {node.id}");
        foreach (AIAction action in possibleActions)
        {
            MinMaxNode child = new MinMaxNode(action.resultant_state, action);
            this.NumNodes++;

            // The score of this node equals the maximal score of each child.
            nodeScore = Mathf.Max(nodeScore, AlphaBeta(child, ref alpha, ref beta, depth - 1, !isMaximizing));

            //Debug.Log($"NodeScore for node {node.id} : {nodeScore}, alpha {alpha}, beta {beta}, depth {depth}, {isMaximizing}");

            alpha = Mathf.Max(alpha, nodeScore);

            // Prune the tree by refusing to consider anything further if nodeScore is more than the value of the best move for Min we've found so far.
            //  Removing this part makes this function a regular MiniMax generate function.
            if (nodeScore >= beta)
            {
                //Debug.Log($"Pruning at depth {depth}, node score {nodeScore} >= beta {beta}");
                break;
            }
        }

        // Cache the recursively calculated score.
        node.setScore(nodeScore);

        return nodeScore;
    }
    // Min's turn.
    else
    {
        int nodeScore = int.MaxValue;

        List<AIAction> possibleActions = GetAllPossibleActions(node.state);
        Debug.Log($"Min: There are {possibleActions.Count} possible actions from node {node.id}");
        foreach (AIAction action in possibleActions)
        {
            MinMaxNode child = new MinMaxNode(action.resultant_state, action);
            this.NumNodes++;

            // The score of this node equals the minimal score of each child.
            nodeScore = Mathf.Min(nodeScore, AlphaBeta(child, ref alpha, ref beta, depth - 1, !isMaximizing));

            //Debug.Log($"NodeScore for node {node.id} : {nodeScore}, alpha {alpha}, beta {beta}, depth {depth}, {isMaximizing}");

            beta = Mathf.Min(beta, nodeScore);

            // Prune the tree by refusing to consider anything further if nodeScore is less than the value of the best move for Max we've found so far.
            //  Removing this part makes this function a regular MiniMax generate function.
            if (nodeScore <= alpha)
            {
                //Debug.Log($"Pruning at depth {depth}, node score {nodeScore} <= alpha {alpha}");
                break;
            }
        }

        node.setScore(nodeScore);

        return nodeScore;
    }
}
``````

## Nature of the Problem

With alpha-beta pruning enabled, the AI appears to explore only a single branch of the Minimax tree it generates. I previously believed this was a natural consequence of there being no real score gains in the early stages of the game (plus, many initial moves have identical utility because the starting board is symmetric across each quadrant).
However, increasing the tree exploration depth with alpha-beta pruning enabled (the break clauses uncommented) didn't seem to help. That might still be fair enough for the initial game state, but the AI doesn't try for captures (which the heuristic eval function does reward) even when a capture is available in a single move. Usually it oscillates a single piece back and forth between two empty grid spaces when it can, and makes other similarly useless moves when it can't. It doesn't seem to care about the King's position either, but that's almost certainly due to my heuristic function being too basic, so it's beyond this question's scope.

SKIP HERE FOR THE ACTUAL QUESTION:
I would like to know whether the code you see here has any fundamental issues as far as the "canonical" Alpha-Beta algorithm is concerned. If the answer is that there are no issues, that will help me narrow the possible causes down to some of the utility functions I've always suspected of malicious intent. (Though it could also be both that are in error!)
Let me know if you want to see the heuristic function – I didn’t include it because I don’t want to clutter the question too much.
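For reference, here's my understanding of what the textbook shape looks like, demonstrated on a toy tree (plain C#, no Unity types; the `Search` function and the flat-array tree are purely illustrative, not my game code). Note that in this version alpha and beta start at minus and plus infinity and are passed by value, so each recursive call works on its own copies:

```csharp
using System;

// Toy illustration of textbook alpha-beta on a complete binary tree.
// Leaves live in a flat array; the node reached by `index` at remaining
// depth d has children at index * 2 and index * 2 + 1.
static class CanonicalAlphaBeta
{
    public static int Search(int[] leaves, int index, int depth,
                             int alpha, int beta, bool isMaximizing)
    {
        if (depth == 0)
            return leaves[index]; // stand-in for the heuristic evaluation

        int best = isMaximizing ? int.MinValue : int.MaxValue;
        for (int child = 0; child < 2; child++)
        {
            int value = Search(leaves, index * 2 + child, depth - 1,
                               alpha, beta, !isMaximizing);
            if (isMaximizing)
            {
                best = Math.Max(best, value);
                alpha = Math.Max(alpha, best);
            }
            else
            {
                best = Math.Min(best, value);
                beta = Math.Min(beta, best);
            }
            if (alpha >= beta)
                break; // prune the remaining siblings
        }
        return best;
    }

    static void Main()
    {
        int[] leaves = { 3, 5, 6, 9, 1, 2, 0, -1 };
        int result = Search(leaves, 0, 3, int.MinValue, int.MaxValue, true);
        // 5 = max(min(max(3,5), max(6,9)), min(max(1,2), max(0,-1)))
        Console.WriteLine(result);
    }
}
```

Pruning never changes the value computed at the root; it only skips subtrees that provably cannot affect it.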

## Other Notes
I’m working on testing this with pure MiniMax; results should be forthcoming soonish.

Staring at code in isolation is unlikely to be useful here.

What the code looks like isn’t a useful metric of its functionality.

Instead, seed your board with a minimum number of pieces to trigger the problem (ideally only one piece on each side) and then start debugging, either by attaching the debugger or simply logging what the code is doing.


A minimal set of pieces sounds like a good test plan, thanks. It’d have to be more than one, because capturing an opposing piece in this particular game requires two allied pieces “flanking” an enemy on either side.

Honestly, I think a refactor might be brewing in the future. As it is now, the game awkwardly splits the rules between those that operate on the “regular” game board represented by a collection of GameObjects and those that operate on an abstract “BoardState” made solely for the AI to use. Often there are identical (or nearly identical) functions trying to compute the same answer, differing only in which data structures they return for which system. That seems like a significant source of potential errors, since the sibling functions’ answers might differ in subtle ways that cause problems with the AI later on. They should be harmonized so that everything operates on a BoardState, with the GameObjects relegated to visuals.

I’ll keep this unresolved in case anyone else wants to chime in.

This ^ ^ ^ ^ absolutely… yes.

Having two sources of truth is never gonna work.

If you are making a chess or checkers or go or hnefatafl engine, make it first so that you can run it from the command line. That way you can test it in utter isolation:

``````
$ initgame
Game initialized.
\$ move e2 e4
Ok.
AI move: f7 f5
\$ print
R N B Q
P P P P
. . . .
. . . .
etc...
``````
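A console shell like that doesn’t need much code. Something along these lines would do (all names hypothetical; the actual engine calls are stubbed out as comments):

```csharp
using System;

// Sketch of a console front end for a board-game engine. The engine
// itself would know nothing about Unity, GameObjects, or rendering --
// it only answers questions about a board state, which makes it easy
// to script and test from the command line.
class ConsoleShell
{
    // Dispatch one command line to the (stubbed) engine and return the reply.
    public static string Handle(string line)
    {
        string[] parts = line.Trim().Split(' ');
        switch (parts[0])
        {
            case "initgame":
                // engine.Reset();
                return "Game initialized.";
            case "move":
                // engine.ApplyMove(parts[1], parts[2]);
                // ...then ask the AI for its reply move.
                return "Ok.";
            case "print":
                // return engine.RenderBoard();
                return "(board goes here)";
            default:
                return $"Unknown command: {parts[0]}";
        }
    }

    static void Main()
    {
        string line;
        while ((line = Console.ReadLine()) != null)
            Console.WriteLine(Handle(line));
    }
}
```

Because every command goes through one function, you can drive the whole engine from a script file and diff the output against expected games, with no UI in the loop.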

Do you think that’d be feasible six months into development of this game, though? To be sure, a console program wouldn’t be too complicated if I take some shortcuts and I could copy some of the AI code with some tweaks here and there. Maybe it’d be useful to get rid of all the extraneous crap, as you say, but I don’t know.

Sounds like a Sunk Cost Fallacy to me. https://en.wikipedia.org/wiki/Sunk_cost

A better way to frame the question is to ask yourself,

“Given that the source of truth is split and I have bugs in my AI, can I possibly avoid refactoring the core problems and yet still solve all these things as well as any future things that reveal themselves in the time until I have to ship this game?”

If you can, great, but you sure have a lot MORE work to do if you choose this path.

If you’re ready to refactor, start speccing out what the API would be for things like the main game state engine (e.g., board and pieces), the user input module (accept input, validate, send to engine, move piece), and the AI (read board, create move, which you then feed back into the user input module as if it were actual user input and not AI).
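Sketched as interfaces, that separation might look something like this (every name here is invented for illustration, not taken from any real project):

```csharp
// Hypothetical module boundaries for a tafl-style game refactor.

// The single source of truth: pure rules, no Unity types.
public interface IGameEngine
{
    BoardState Current { get; }
    bool IsLegal(Move move);
    BoardState Apply(Move move);          // returns the new state
    GameResult CheckResult(BoardState s); // ongoing / defenders win / attackers win
}

// Validates raw input and forwards legal moves to the engine.
public interface IInputModule
{
    bool TrySubmit(Move move); // used identically by the human UI and the AI
}

// Reads a BoardState, produces a Move; the result is fed back through
// IInputModule exactly as if a human had entered it.
public interface IMoveProvider
{
    Move ChooseMove(BoardState state);
}

// Minimal supporting types so the sketch compiles on its own.
public readonly struct Move
{
    public readonly int From, To;
    public Move(int from, int to) { From = from; To = to; }
}

public sealed class BoardState { /* piece positions, side to move, ... */ }

public enum GameResult { Ongoing, DefendersWin, AttackersWin }
```

The payoff is that the AI and the human player become interchangeable `IMoveProvider`s, and the GameObjects only ever mirror a `BoardState` they never own.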

I see the truth in what you say – better to do the work now and avoid future misery. I think I have a path forward, so I’ll mark this question as resolved. Thank you for your input!