Diversity and Novelty as Objectives in Poker
Abstract
Evolutionary algorithms can produce efficient solutions without a predefined design and with little human bias. However, they are prone to premature convergence and may be misled by an uninformative or deceptive fitness function, so the evolved agents may end up as suboptimal solutions or fail to solve the task at all. Diversity maintenance and novelty search are methods developed to address these drawbacks. Diversity maintenance modifies the evolutionary process so that agents are selected based on both their fitness and their diversity. Novelty search builds on this idea and evolves individuals that are not only diverse but also exhibit novel behaviors.
Currently there are no studies on diversity and novelty for tasks that combine deceptive properties with large amounts of ambiguity. In this work, Heads-up Texas Hold'em Poker is used as a domain that exhibits both properties simultaneously. Specifically, poker is ambiguous because of its imperfect information, stochasticity, and intransitivity. It is also deceptive, because performing well in the game requires complex strategies such as bluffing. Finally, this poker variant has an extremely large behavior space, owing to its many game states and decision points. This thesis investigates whether diversity maintenance and novelty search are still beneficial in a task that possesses these features. The techniques are compared with each other and with the classic evolutionary method. The goal is to analyze whether these methods improve the diversity of the population, and whether that diversity leads to improved performance. This work does not aim to develop a world-class poker player, but to assess the significance of diversity and novelty search.
The results show that diversity maintenance methods not only produced a diverse range of poker strategies, but also produced statistically better strategies than evolution without diversity maintenance.