Tag: game theory

  • Generalists Outperform Specialists in Certain Game Theory Scenarios, MIT Study Finds

    In a twist that challenges long-held assumptions in game theory, a new MIT study shows that general-purpose algorithms called policy gradient methods can outperform specialized game-theoretic algorithms in certain imperfect-information games. The findings, presented at the International Conference on Learning Representations in Rio de Janeiro, could reshape how artificial intelligence agents are trained to make decisions in competitive, real-world scenarios.

    Imperfect-information games, in which players don’t know everything about their opponents, are common in life, from poker and bidding wars to military operations and financial negotiations. For decades, the prevailing belief was that algorithms specifically designed for these games, grounded in game theory, would always outshine general-purpose alternatives. However, the MIT-led team discovered that policy gradient methods, originally developed in the 1990s for single-agent decision-making, often perform better and more efficiently.

    The researchers created a benchmark to fairly evaluate different algorithms, measuring performance through a concept called exploitability: how much a player stands to lose against a worst-case adversary, so lower scores are better. In experiments involving five games, including Phantom Tic-Tac-Toe, imperfect-information Hex, and Liar’s Dice, neural networks trained with policy gradient algorithms consistently achieved lower exploitability scores than those trained with game-theoretic algorithms.

    “Our study showed that policy gradient methods can work better than these specialized algorithms, and that the specialized algorithms may not work as well as people thought,” said Samuel Sokota, a co-author from Carnegie Mellon University. The team’s benchmarking software, which they have made freely available, allows others to test and compare algorithms with just a single line of code added to the OpenSpiel library.

    The implications extend far beyond board games. “Hidden information is a very important property of the world,” said Eugene Vinitsky of New York University, another co-author. “It pervades military operations, trading scenarios, and negotiations—all of which are carried out under conditions of hidden information. The idea that we can improve on these games suggests that we can also do better in these other settings as well.”

    Ian Gemp, a computer scientist and game theory expert at Google DeepMind who was not involved in the study, called the results encouraging: “This work serves as a compelling reminder that modernizing classical tools remains a highly productive path for solving complex strategic problems.”

  • Policy Gradient Methods Outperform Specialized Game Theory Algorithms in Imperfect-Information Games

    A new MIT-led study challenges long-held assumptions in game theory, demonstrating that general-purpose policy gradient methods can outperform specialized game-theoretic algorithms in imperfect-information, zero-sum games. The research, presented at the International Conference on Learning Representations, provides a benchmark for evaluating algorithms that train neural networks to compete in strategic interactions where players have hidden information.

    The team, including MIT PhD student Sobhan Mohammadpour and Assistant Professor Gabriele Farina, found that policy gradient methods, originally developed in the 1990s for single-agent decision-making, achieved lower exploitability scores than game-theory-based approaches in games like Phantom Tic-Tac-Toe, imperfect-information Hex, and Liar’s Dice. Exploitability measures how much a player stands to lose against a worst-case adversary; a score of zero indicates perfect play.

    “It had been pretty much taken for granted that specialized game-theoretic algorithms were the right approach,” said co-author Samuel Sokota. “Our study showed that policy gradient methods can work better than these specialized algorithms.” The researchers attribute the oversight to a lack of rigorous benchmarking, which they have now addressed by releasing a freely available benchmark tool that runs on ordinary laptops.

    The benchmark, built on OpenSpiel, allows researchers to train and compare algorithms on games with up to 30 billion states. Farina emphasized that the term “game” applies broadly to multi-agent strategic interactions, including military operations, trading, and negotiations—all of which involve hidden information. “The idea that we can improve on these games suggests that we can also do better in these other settings,” said co-author Eugene Vinitsky.

    Ian Gemp of Google DeepMind praised the work: “This work serves as a compelling reminder that modernizing classical tools like policy gradient methods remains a highly productive path for solving complex strategic problems.”
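
The exploitability measure described in the summaries above can be illustrated on a toy game. The sketch below is not the study's benchmark and does not use OpenSpiel; it assumes plain rock-paper-scissors (a zero-sum game whose value is 0), and the names `PAYOFF` and `exploitability` are hypothetical, chosen only for this illustration. It computes how much a mixed strategy loses, relative to the game value, against an opponent who picks the best possible reply.

```python
# Row player's payoff in rock-paper-scissors.
# The game is zero-sum, so the column player receives the negative.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def exploitability(strategy, payoff=PAYOFF, game_value=0.0):
    """Gap between the game value and what `strategy` earns in the worst case.

    `strategy` is a list of probabilities over the row player's actions.
    The worst-case adversary plays whichever column minimizes the row
    player's expected payoff.
    """
    n_rows = len(payoff)
    n_cols = len(payoff[0])
    # Expected row-player payoff against each pure column action.
    expected = [
        sum(strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]
    worst_case = min(expected)
    return game_value - worst_case


uniform = [1 / 3, 1 / 3, 1 / 3]     # equilibrium play
rock_heavy = [0.5, 0.25, 0.25]      # leans on rock, so paper punishes it
always_rock = [1.0, 0.0, 0.0]       # fully predictable

for name, s in [("uniform", uniform), ("rock_heavy", rock_heavy), ("always_rock", always_rock)]:
    print(name, exploitability(s))
```

Under these assumptions the uniform strategy scores zero, matching the "zero indicates perfect play" description, while the more predictable strategies score higher. The games in the study are far larger, so computing a worst-case adversary there is correspondingly harder than this three-column minimum.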