Masking specific combinations of values in discrete action branches for a board game

Hello everyone,

I’m working on a project using ML-Agents to train an agent in a board game. In my game, each move consists of several properties, such as type, origin cell index, direction, and target cell index.

I have defined my discrete actions using multiple branches like this: discrete_actions_branches: [5, 37, 6, 37]. Each branch represents a different property of the move. However, not all combinations of values across branches are valid moves, depending on the current state of the board.

I need to mask specific combinations of values in discrete action branches to prevent the agent from performing invalid moves in the game. Is there a way to achieve this? If not, do you have any suggestions on how I should define my actions or modify my approach to handle the complexity of valid and invalid moves?

Any help or guidance would be greatly appreciated. Thank you in advance!

Yes, there is a way to mask specific actions depending on the game situation. You can refer to the page below for how to use it:

I’ve used it for my game, and it helped reduce training time as well.
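
One caveat, as far as I understand it: the mask in ML-Agents is set per branch, one action index at a time, so it cannot directly say "this combination of type, origin, direction and target is invalid". Two workarounds I can think of are (a) flattening all four properties into a single branch and masking every flat index that is not a legal move, or (b) keeping the four branches and only enabling values that appear in at least one legal move, which still lets some invalid combinations through. Below is a rough Python sketch of the index arithmetic only. It is not ML-Agents code, the function names are made up for illustration, and the list of valid moves is assumed to come from your own game logic.

```python
# Sketch only: illustrates the two masking strategies with plain Python.
# BRANCH_SIZES mirrors the [5, 37, 6, 37] layout from the question:
# (type, origin cell, direction, target cell).
BRANCH_SIZES = (5, 37, 6, 37)


def total_actions(sizes=BRANCH_SIZES):
    """Number of combinations if all branches are flattened into one."""
    total = 1
    for size in sizes:
        total *= size
    return total


def encode(move, sizes=BRANCH_SIZES):
    """Map a (type, origin, direction, target) tuple to one flat index."""
    index = 0
    for value, size in zip(move, sizes):
        if not 0 <= value < size:
            raise ValueError(f"value {value} out of range for branch size {size}")
        index = index * size + value
    return index


def decode(index, sizes=BRANCH_SIZES):
    """Inverse of encode: recover the per-property values from a flat index."""
    values = []
    for size in reversed(sizes):
        index, value = divmod(index, size)
        values.append(value)
    return tuple(reversed(values))


def flat_mask(valid_moves, sizes=BRANCH_SIZES):
    """Option (a): exact mask over the flattened action space.

    True means the flat action is allowed.
    """
    mask = [False] * total_actions(sizes)
    for move in valid_moves:
        mask[encode(move, sizes)] = True
    return mask


def per_branch_masks(valid_moves, sizes=BRANCH_SIZES):
    """Option (b): one mask per branch.

    A value is enabled if it occurs in at least one valid move, so this
    is only an approximation of the true set of legal combinations.
    """
    masks = [[False] * size for size in sizes]
    for move in valid_moves:
        for branch, value in enumerate(move):
            masks[branch][value] = True
    return masks


if __name__ == "__main__":
    # Hypothetical legal moves for some board state.
    valid_moves = [(0, 3, 1, 4), (2, 10, 5, 11)]

    exact = flat_mask(valid_moves)
    print("flat action space size:", len(exact))
    print("allowed flat actions:", sum(exact))

    approx = per_branch_masks(valid_moves)
    mixed = (0, 10, 1, 11)  # not a legal move, built from legal parts
    passes = all(approx[b][v] for b, v in enumerate(mixed))
    print("per-branch masks let the mixed move through:", passes)
```

With the branch sizes from the question, the flattened space has 5 × 37 × 6 × 37 = 41070 actions, which is large but gives an exact mask. The per-branch version keeps the action space small, so you would still need to handle the leftover invalid combinations yourself, for example by ignoring or penalizing them. Either way, at least one action has to stay enabled in every branch.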