The fifth paper from TAMING UNCERTAINTY (R. Hertwig et al., MIT Press, 2019): "Strategic Uncertainty and Incomplete Information: The Homo Heuristicus Does Not Fold" (L. Spiliopoulos and R. Hertwig).
This is another paper on heuristics, but this one examines how far heuristic decision making remains effective and where the boundary lies across different environmental settings.
Imagine two players: neither knows the other's preferences or characteristics, and neither knows much about the environment they are in. Moreover, they have never played against each other before, and the game is played only once. In such a case, will heuristic judgment really work? Wouldn't a classical method do better? The paper explores the boundary between the two.
The paper evaluates the performance of 10 different strategies (each using a different computation rule) by having them play against one another 10,000 times in randomly generated games across different environments.
No prior learning is assumed, since "the game is played only once."
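As a rough sketch of this setup (my own Python illustration, not the authors' simulation code; every name below is made up), the core loop just generates a random game, lets two strategy functions each pick one action, plays exactly once, and averages the payoffs:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_game(n_actions=3):
    """One random one-shot two-player game: a separate payoff matrix for each player."""
    return (rng.uniform(0, 1, (n_actions, n_actions)),
            rng.uniform(0, 1, (n_actions, n_actions)))

def play_once(strat_row, strat_col, row_pay, col_pay):
    """Each strategy picks a single action; the game is played exactly once, so there is no learning."""
    a = strat_row(row_pay, col_pay)        # row player's choice
    b = strat_col(col_pay.T, row_pay.T)    # column player sees the game from its own perspective
    return row_pay[a, b], col_pay[a, b]

def tournament(strategies, n_games=10_000):
    """Average payoff of each named strategy over random pairings in randomly generated games."""
    totals = {name: 0.0 for name in strategies}
    counts = {name: 0 for name in strategies}
    names = list(strategies)
    for _ in range(n_games):
        row_pay, col_pay = random_game()
        name_a, name_b = rng.choice(names, size=2)
        pay_a, pay_b = play_once(strategies[name_a], strategies[name_b], row_pay, col_pay)
        totals[name_a] += pay_a; counts[name_a] += 1
        totals[name_b] += pay_b; counts[name_b] += 1
    return {n: totals[n] / max(counts[n], 1) for n in names}
```

A "strategy" here is simply a function that maps the two payoff matrices to an action index; a few such functions are sketched after the strategy list below.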
<Heuristic Strategy>
Level-1 Predicts the opponent's behavior at depth 1: the opponent is assumed to choose every available action with equal probability, and the player picks the best response to that assumption.
Level-2 Predicts at depth 2: the opponent is assumed to behave like a Level-1 player, and the player best-responds to that.
Level-3 Predicts at depth 3: the opponent is assumed to behave like a Level-2 player.
Level-k Predicts at depth k: the opponent is assumed to behave like a Level-(k-1) player.
Social maximum Select the action that maximizes the joint payoff of both players.
Equality Select the action that minimizes the difference between one's own payoff and the opponent's.
Dominance-1 Ignore the opponent's dominated actions (those the opponent would never choose) and then select the own action with the most advantageous payoff.
Maximax For each action, find the maximum possible payoff and select the action whose maximum is largest (best-case reasoning).
Maximin For each action, find the minimum possible payoff and select the action whose minimum is largest (worst-case reasoning).
<Random baseline strategy>
Random baseline Choose an action at random; no strategy at all.
<Classical strategy>
Nash equilibrium (NE) Each player plays a best response to the other's strategy, so neither can gain by changing their action alone. (A short code sketch of several of the rules above follows.)
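To make a few of the rules above concrete, here is a minimal sketch (again my own illustration, not the chapter's code). `own` is the player's payoff matrix with rows as own actions and columns as the opponent's actions, and `opp` is the opponent's matrix:

```python
import numpy as np

def level1(own, opp):
    """Level-1: assume the opponent mixes uniformly over its actions and best-respond."""
    return int(np.argmax(own.mean(axis=1)))

def maximin(own, opp):
    """Maximin: worst-case payoff per own action, then pick the best worst case."""
    return int(np.argmax(own.min(axis=1)))

def maximax(own, opp):
    """Maximax: best-case payoff per own action, then pick the best best case."""
    return int(np.argmax(own.max(axis=1)))

def social_maximum(own, opp):
    """Social maximum: pick the own action whose row contains the largest joint payoff."""
    joint = own + opp
    return int(np.unravel_index(np.argmax(joint), joint.shape)[0])

# A small 3x3 example game (payoffs for the row player; the opponent's are just its transpose here).
own = np.array([[3., 0., 5.],
                [2., 2., 2.],
                [0., 4., 1.]])
opp = own.T.copy()
print(level1(own, opp), maximin(own, opp), maximax(own, opp), social_maximum(own, opp))
# -> 0 1 0 0 : Level-1 and Maximax go for the row containing the 5, Maximin plays it safe with the all-2 row.
```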
Specifically, the environmental conditions differ in three ways: (1) game size (the number of actions), (2) payoff uncertainty (the fraction of missing payoff information), and (3) the rationality of the opponent.
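As a sketch of how those environmental knobs might be turned (my own illustration; masking random payoff entries is an assumption, not necessarily the chapter's exact procedure): game size is the number of actions, payoff uncertainty is the fraction of the player's own payoffs hidden from it, and the opponent's rationality would be varied by choosing which strategy the opponent plays:

```python
import numpy as np

rng = np.random.default_rng(42)

def game_with_missing_payoffs(n_actions, missing_fraction):
    """Random one-shot game in which a given fraction of the player's own payoffs is unknown (NaN)."""
    own_true = rng.uniform(0, 1, (n_actions, n_actions))
    opp = rng.uniform(0, 1, (n_actions, n_actions))
    mask = rng.random((n_actions, n_actions)) < missing_fraction
    own_observed = own_true.copy()
    own_observed[mask] = np.nan        # hidden payoff information
    return own_observed, opp, own_true

# Sweep the two dimensions plotted in Figure 1: game size (2-20) and missing payoffs (0%-80%).
for n_actions in (2, 10, 20):
    for missing in (0.0, 0.4, 0.8):
        observed, _, _ = game_with_missing_payoffs(n_actions, missing)
        n_hidden = int(np.isnan(observed).sum())
        print(f"size={n_actions:2d}  missing={missing:.0%}  hidden entries={n_hidden}")
```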
In the simulations, the simple heuristic strategies (Level-1 and Dominance-1) performed best across these environments.
Both keep prediction of the opponent's behavior to a minimum.
In a one-shot game against an opponent one has never faced before, it seems possible to make better decisions by not holding strong assumptions about the opponent.
So where, then, does this paper place the boundary between heuristic and classical methods?
Figure 1: Results of the competition among policies, with performance operationalized in terms of the Indifference criterion. The average percentage of missing payoffs ranges from 0% to 80%; the size of the action space ranges from 2 to 20. The darker the shading, the better the performance.
Cited from: Strategic Uncertainty and Incomplete Information: The Homo Heuristicus Does Not Fold (L. Spiliopoulos and R. Hertwig)
In Figure 1, the y-axis shows the size of the game (from 2 to 20 actions) and the x-axis shows the percentage of missing payoff information (how much of the reward a player would receive for each action is unknown). In other words, the further to the right on the graph, the higher the uncertainty.
The lines that divide the graph into regions like contour lines, and the numbers written on those lines, show the average values of the missing payoffs.
Darker shading indicates better policy performance. In other words, the larger the dark-colored area, the wider the range of situations in which that strategy performs well.
Indeed, L1 (Level-1) and D1 (Dominance-1) have the widest dark-colored regions.
So, in these results, what exactly counts as a simple situation and what counts as a complex one, and at roughly what game size is the line drawn?
Looking at the graph, roughly speaking, simple games are those with fewer than 10 actions, while complex games are those with more than 10.
In other words, the heuristics L1 and D1 are shown to be effective when the number of actions is below about 10 and more than 50% of the payoff information needed to compute expected values is missing.
At least within the experimental results of this paper, the classical method NE (Nash equilibrium) shows no particularly dark areas, so there is no visible region where "the classical method is stronger."
What about running these simulations AI against AI? I am interested, so to speak, in the singularity of AI-vs-AI competition, or the saturation point of the contest over who has the higher performance and which data to use.
Perhaps this paper already gives the answer, since it says it is better to make simple predictions without thinking that much about the other party in the first place.
This is off the subject of this paper, but one bias that comes to mind here is group polarization.