Hi all,
Pretty new to ML training and I’m struggling to figure out how to approach my problem.
I currently have an environment with an NN brain that has two discrete action branches (Move and Attack type), each of size 4. The game style is grid-based and we-go (orders are planned upfront and then resolved simultaneously).
At the moment my flow is pretty typical: Request Decision → Get decision from both branches → Execute Branch 1 Move → Execute Branch 2 Attack → Set Reward
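For context, the relevant part of my agent currently looks roughly like this (simplified sketch; the class name and the MoveUnit/ExecuteAttack/EvaluateOutcome helpers are just stand-ins for my game code):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

public class UnitAgent : Agent
{
    // Called once when it's this unit's turn to act.
    public void TakeTurn()
    {
        RequestDecision(); // ask the policy for one Move + one Attack choice
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        int move = actions.DiscreteActions[0];   // Move branch (size 4)
        int attack = actions.DiscreteActions[1]; // Attack type branch (size 4)

        MoveUnit(move);        // stand-in for my grid movement logic
        ExecuteAttack(attack); // stand-in for my attack resolution
        SetReward(EvaluateOutcome()); // reward that single step straight away
    }

    // Stand-ins for the actual game logic.
    void MoveUnit(int direction) { }
    void ExecuteAttack(int attackType) { }
    float EvaluateOutcome() { return 0f; }
}
```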
The way the game flows for the player is different from how I'm currently training: they have to make all 4 of their decisions for the round upfront, inputting them during a preparation phase rather than choosing moves one at a time.
What I'm trying to figure out is how to replicate this behaviour for the NN (if possible): request its 4 decisions upfront in the preparation phase, then execute them sequentially and reward based on the outcome of those 4 decisions as a group.
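The closest thing I can picture in code is something like the sketch below (untested, same stand-in helpers as above): keep requesting one decision per step during the preparation phase, buffer the actions instead of executing them, and once 4 are queued, play them back and give a single reward for the whole round.

```csharp
using System.Collections.Generic;
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

public class PlannedUnitAgent : Agent
{
    readonly List<(int move, int attack)> plan = new List<(int, int)>();

    // Called each step while the round is in its preparation phase.
    public void PreparationStep()
    {
        if (plan.Count < 4)
            RequestDecision(); // one planned (Move, Attack) pair per decision
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // Buffer the decision instead of executing it immediately.
        plan.Add((actions.DiscreteActions[0], actions.DiscreteActions[1]));

        if (plan.Count == 4)
            ExecuteRound();
    }

    void ExecuteRound()
    {
        foreach (var (move, attack) in plan)
        {
            MoveUnit(move);
            ExecuteAttack(attack);
        }
        SetReward(EvaluateRoundOutcome()); // one reward for the 4 decisions as a group
        plan.Clear();
    }

    // Stand-ins for the actual game logic.
    void MoveUnit(int direction) { }
    void ExecuteAttack(int attackType) { }
    float EvaluateRoundOutcome() { return 0f; }
}
```

One thing I'm already unsure about with this approach: unless I feed the already-buffered actions back in as observations, the 4 decisions would all be made from pretty much the same state.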
My only thought so far has been expanding the branches so that one decision converts into multiple actions, but that could create very large branches and I'm not sure how that would affect training performance.
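To make that concrete, I think the "more branches" version of the idea would mean going from 2 branches to 8 (a Move and an Attack branch for each of the 4 planned steps), all read in a single OnActionReceived, roughly like this (again just a sketch):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

// Assumes Behavior Parameters is set to 8 discrete branches of size 4:
// Move and Attack type for each of the 4 planned steps.
public class WholeRoundAgent : Agent
{
    public override void OnActionReceived(ActionBuffers actions)
    {
        var discrete = actions.DiscreteActions;
        for (int step = 0; step < 4; step++)
        {
            int move = discrete[step * 2];       // branches 0, 2, 4, 6: Move
            int attack = discrete[step * 2 + 1]; // branches 1, 3, 5, 7: Attack type
            MoveUnit(move);
            ExecuteAttack(attack);
        }
        SetReward(EvaluateRoundOutcome()); // one reward for the whole round
    }

    // Stand-ins for the actual game logic.
    void MoveUnit(int direction) { }
    void ExecuteAttack(int attackType) { }
    float EvaluateRoundOutcome() { return 0f; }
}
```

The other version I can imagine, packing a whole 4-step plan into a single branch value, would need a branch of size 4^4 = 256 per action type, which is the branch-size blow-up I'm worried about.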
Any thoughts/suggestions appreciated!