Online Algorithms for Video Game AI
This content is associated with a 2014 GDC AI Summit lecture on algorithms for dynamic behavior, including Regret Matching, UCB1 and MCTS. For more information see the overview page.

The program on this page demonstrates Regret Matching over strategies using the game Rock Paper Scissors. Instead of choosing directly among actions, the AI chooses among strategies: it can play randomly, or it can imitate the last action of the human player. Imitation variants include playing the action that loses to the human's last action and playing the action that beats it.
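The idea can be sketched in Python as follows. This is a minimal illustration, not the code behind the demo: the names (`StrategyRegretMatcher`, `play_fixed_pattern`, the four strategy functions) are hypothetical, the strategy set is one reading of the description above, and scoring the random strategy by the action it happened to sample is an assumption. Each round, a strategy is picked with probability proportional to its positive cumulative regret; once the human's action is known, every strategy is credited with the payoff it would have earned minus the payoff actually earned.

```python
import random

ACTIONS = ("R", "P", "S")
BEATS = {"R": "S", "P": "R", "S": "P"}        # key beats value
BEATEN_BY = {v: k for k, v in BEATS.items()}  # key is beaten by value


def payoff(ai, human):
    """+1 if the AI wins, -1 if it loses, 0 on a tie."""
    if ai == human:
        return 0
    return 1 if BEATS[ai] == human else -1


# Each strategy maps the human's last action to a proposed AI action.
def play_random(last_human, rng):
    return rng.choice(ACTIONS)


def copy_last(last_human, rng):
    return last_human


def beat_last(last_human, rng):
    return BEATEN_BY[last_human]


def lose_to_last(last_human, rng):
    return BEATS[last_human]


STRATEGIES = (play_random, copy_last, beat_last, lose_to_last)


class StrategyRegretMatcher:
    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.regret = [0.0] * len(STRATEGIES)
        # Assumption: before any human move is seen, pretend one at random.
        self.last_human = self.rng.choice(ACTIONS)
        self.proposals = []
        self.chosen = 0

    def choose(self):
        """Pick a strategy by regret matching and return its action."""
        self.proposals = [s(self.last_human, self.rng) for s in STRATEGIES]
        positive = [max(r, 0.0) for r in self.regret]
        if sum(positive) > 0:
            indices = range(len(STRATEGIES))
            self.chosen = self.rng.choices(indices, weights=positive)[0]
        else:
            # No strategy has positive regret: choose uniformly.
            self.chosen = self.rng.randrange(len(STRATEGIES))
        return self.proposals[self.chosen]

    def observe(self, human):
        """Update regrets after the human's action; return the AI's payoff."""
        earned = payoff(self.proposals[self.chosen], human)
        for i, action in enumerate(self.proposals):
            self.regret[i] += payoff(action, human) - earned
        self.last_human = human
        return earned


def play_fixed_pattern(rounds=300, seed=0):
    """AI's total payoff against a human who cycles R, P, S forever."""
    ai = StrategyRegretMatcher(seed)
    total = 0
    for t in range(rounds):
        ai.choose()
        total += ai.observe(ACTIONS[t % 3])
    return total
```

Against the fixed R-P-S cycle, the action that loses to the human's previous move always beats the current one, so in this sketch `lose_to_last` accumulates regret and ends up being chosen almost exclusively.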

Note that if your play is predictable, but in a way that none of the imitating strategies can beat, then the AI won't be able to exploit your weakness.

You can also try to "train" the AI to play poorly. In the long term you won't come out ahead this way, because the AI will unlearn whatever it was taught before you can take advantage of it (that is, before your score in the game becomes positive).
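The reason the AI unlearns can be seen from the standard Regret Matching rule, which the demo is assumed to follow. The symbols here are introduced for illustration: $u_t(s)$ is the payoff strategy $s$ would have earned in round $t$, and $s_t$ is the strategy actually used.

```latex
R_T(s) = \sum_{t=1}^{T} \bigl( u_t(s) - u_t(s_t) \bigr),
\qquad
R_T^{+}(s) = \max\bigl( R_T(s),\, 0 \bigr),
\qquad
p_{T+1}(s) = \frac{R_T^{+}(s)}{\sum_{s'} R_T^{+}(s')}
```

If no strategy has positive regret, the choice is uniform. Once a "trained" strategy starts losing, its cumulative regret shrinks while the regret of the alternatives grows, so the probability of choosing it falls. In the standard analysis the average regret against every fixed strategy goes to zero over time, and because the random strategy breaks even on average, this is consistent with the observation that the human cannot stay ahead in the long run.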

Suggested tests: (1) Try to play to win; you shouldn't win much in the long term. (2) Play a fixed pattern (R-P-S); the AI should take advantage of your play. (3) Carefully play a strategy such as imitating the AI; the AI won't be able to take advantage of your play, although you have to be careful to stay unsynchronized from the AI's play. If the AI had an "imitate opponent" strategy it would be able to defeat such strategies.

