AI Summit Supplementary Material: UCB1 on Actions Demo

This content is associated with a 2014 GDC AI Summit lecture on algorithms for dynamic behavior, including Regret Matching, UCB1 and MCTS. For more information see the overview page.

The program on this page demos UCB1 over actions using the game Rock Paper Scissors. The AI can choose between rock, paper, and scissors.

Note that UCB1 is not a randomized algorithm. Thus, by looking at the internal AI debug information, you can always make the play that beats the AI. Even if you don't look at the information, the AI tends to play predictably. UCB1 is therefore not a good strategy when actions must be randomized to avoid exploitation.
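To make the determinism concrete, here is a minimal Python sketch of UCB1 over the three actions. It is not the demo's actual source. It assumes the textbook UCB1 rule (average reward plus sqrt(2 ln n / n_i), after trying each action once), and the reward values (1 for a win, 0.5 for a tie, 0 for a loss), the class name UCB1, and the helpers reward and exploit are all illustrative choices, not taken from the demo.

```python
import math

ACTIONS = ["rock", "paper", "scissors"]  # index i is beaten by index (i + 1) % 3


class UCB1:
    """Textbook UCB1 over a fixed set of actions (illustrative, not the demo's code)."""

    def __init__(self, n_actions):
        self.counts = [0] * n_actions    # times each action was played
        self.totals = [0.0] * n_actions  # summed reward per action

    def select(self):
        # Play every action once before applying the confidence bound.
        for action, count in enumerate(self.counts):
            if count == 0:
                return action
        n = sum(self.counts)

        def score(action):
            mean = self.totals[action] / self.counts[action]
            bonus = math.sqrt(2.0 * math.log(n) / self.counts[action])
            return mean + bonus

        # max() breaks ties by taking the first index, so there is no randomness.
        return max(range(len(self.counts)), key=score)

    def update(self, action, reward_value):
        self.counts[action] += 1
        self.totals[action] += reward_value


def reward(ai, human):
    # Assumed reward scale: 1 for an AI win, 0.5 for a tie, 0 for a loss.
    if ai == human:
        return 0.5
    return 1.0 if (ai - human) % 3 == 1 else 0.0


def exploit(rounds):
    """Opponent that reads the AI's next move (as the debug view allows) and counters it."""
    ai = UCB1(len(ACTIONS))
    human_wins = 0
    for _ in range(rounds):
        ai_move = ai.select()            # same statistics always give the same move
        human_move = (ai_move + 1) % 3   # the action that beats ai_move
        r = reward(ai_move, human_move)
        ai.update(ai_move, r)
        if r == 0.0:
            human_wins += 1
    return human_wins


if __name__ == "__main__":
    print(exploit(30), "human wins out of 30 rounds")
```

Because select() depends only on the stored counts and totals, anyone who can see those statistics can compute the AI's next move in advance, which is why the simulated opponent in this sketch wins every round.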

Suggested tests: (1) Try to play to win; you should notice patterns in the AI's play. (2) Look at the debug information; you should be able to win every round. (3) Play a fixed pattern (R-P-S); the AI shouldn't be able to take advantage of it.
