Use case: Othello board game / AlphaZero

Thought I’d just post another use case I made a while ago.

This one uses a trained Othello model from this simplified implementation of AlphaZero made by some guys from Stanford. Apparently you can change the rules to train it on other two-player board games. I think their implementation may be a bit buggy, but it seems to work OK.

Background: AlphaZero is a board game AI from DeepMind (now owned by Google). It superseded AlphaGo and was later superseded by MuZero, which doesn't need the rules programmed in: it learns them itself, and apparently can even play Atari games. They haven't released their code to the public, but they have published papers which people have used to reimplement the algorithms.

My implementation is bare-bones: it just uses the raw output of the model, with no Monte Carlo tree search or anything like that. Adding search would improve play a lot.

The model is 62 MB and there was no problem running it in Unity. It has two outputs: (1) the value of the board position and (2) a probability for each possible next move.
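
For anyone wondering what playing straight from the model output looks like, here is a rough Python sketch (not my actual Unity code). It assumes the move probabilities come back as a flat list with one entry per square in row-major order plus a final "pass" entry, and that you already have a mask of legal moves from your own rules code. The names `pick_move`, `index_to_square` and `PASS_INDEX` are made up for illustration, and the real model's layout may differ.

```python
# Hypothetical sketch: choose a move directly from the policy output,
# with no tree search. The output layout below is an assumption.

BOARD_SIZE = 8
PASS_INDEX = BOARD_SIZE * BOARD_SIZE  # assumed index of the "pass" action


def pick_move(policy, legal_mask):
    """Return the index of the highest-probability legal action."""
    if len(policy) != len(legal_mask):
        raise ValueError("policy and legal_mask must have the same length")
    best_index = None
    best_prob = -1.0
    for index, (prob, legal) in enumerate(zip(policy, legal_mask)):
        # Skip illegal moves: the network can put probability on them too.
        if legal and prob > best_prob:
            best_index, best_prob = index, prob
    if best_index is None:
        raise ValueError("no legal action available")
    return best_index


def index_to_square(index):
    """Convert a flat action index to (row, col), or None for a pass."""
    if index == PASS_INDEX:
        return None
    return divmod(index, BOARD_SIZE)
```

Masking out illegal moves matters because nothing stops the network from rating an illegal square highly. The value output isn't used here at all; it only becomes useful once you add some kind of search.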

The model was trained (by them) for about 3 days, although it would probably need to be trained a lot longer to be any good!

Ah man that is really awesome!

This is very cool! Thanks for sharing