Tag: algorithms

  • MIT Study Reveals Why Trio Comparisons Outperform Pairwise for Predicting Preferences

    A new paper from MIT researchers provides a major upgrade to the nearly century-old idea of random utility models (RUMs), showing that asking people to rank three options instead of just two can reveal hidden correlations that dramatically improve preference predictions.

    In 1927, psychologist L. L. Thurstone laid the foundation for random utility models, which assume that when people choose among alternatives, they select the one with the highest subjective value, even if they cannot assign a specific number to that choice. These models are inherently random because preferences vary across individuals and even within the same person over time.
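
    As a rough illustration of the idea, the sketch below simulates a Thurstone-style random utility model. Each option gets a fixed average appeal plus fresh Gaussian noise on every trial, and the simulated person picks whichever option has the highest noisy utility. The option names, mean utilities, and noise level are hypothetical values chosen for the example, not figures from the paper.

```python
import random


def simulate_choices(mean_utility, noise_sd=1.0, trials=10_000, seed=0):
    """Return the share of trials in which each option had the highest utility."""
    rng = random.Random(seed)
    counts = {item: 0 for item in mean_utility}
    for _ in range(trials):
        # Utility = fixed average appeal + random noise that changes every trial.
        utilities = {
            item: mu + rng.gauss(0.0, noise_sd)
            for item, mu in mean_utility.items()
        }
        chosen = max(utilities, key=utilities.get)
        counts[chosen] += 1
    return {item: n / trials for item, n in counts.items()}


# Hypothetical commuting options with made-up average utilities.
shares = simulate_choices({"bus": 0.0, "car": 0.5, "bike": -0.5})
print(shares)
```

    The option with the highest average utility wins most often but not always, which is the sense in which the model is random.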

    RUMs are widely used by governments and companies to predict behavior in counterfactual scenarios, such as how commuters would react to a road closure, and to guide decisions such as how to allocate a budget to maximize the public good. Despite nearly a century of refinement, the standard approach still relies heavily on pairwise comparisons (e.g., “Do you prefer A or B?”) because people find it easier to compare two items than to assign a numerical rating.

    However, the MIT team (Yeshwanth Cherapanamjeri, Gabriele Farina, Constantinos Daskalakis, and Sobhan Mohammadpour) proved that pairwise comparisons alone cannot capture correlations between preferences. For example, someone who favors gun control is also likely to support government-funded child care, and a fan of independent movies may also enjoy foreign films but dislike blockbusters. Models that ignore these correlations make inaccurate predictions.

    The key breakthrough, presented at the International Conference on Learning Representations in Rio de Janeiro, is that correlations become detectable when large numbers of people rank three alternatives in order of preference. The same information can also be obtained from a combination of best-of-three and best-of-two choices. The researchers developed an efficient algorithm to merge individual triplet rankings into a single model that captures the full picture.
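
    A toy example, which is not the paper's construction, shows why pairwise answers can hide correlations. The sketch below defines two hypothetical populations over three items (A = independent film, B = foreign film, C = blockbuster). In the first, tastes for A and B go together; in the second, they pull apart. Every pairwise question comes out 50/50 in both populations, yet the best-of-three answers differ.

```python
from itertools import combinations

# Each population is a distribution over full rankings, best item first.
# Both populations and their probabilities are made up for illustration.
POP_1 = {("A", "B", "C"): 0.5, ("C", "B", "A"): 0.5}  # A and B liked together
POP_2 = {("A", "C", "B"): 0.5, ("B", "C", "A"): 0.5}  # A and B at opposite ends


def pairwise_probs(population):
    """Probability that x is preferred to y, for every pair (x, y)."""
    items = sorted(next(iter(population)))
    return {
        (x, y): sum(
            p for ranking, p in population.items()
            if ranking.index(x) < ranking.index(y)
        )
        for x, y in combinations(items, 2)
    }


def best_of_three_probs(population):
    """Probability that each item is picked as the favorite of all three."""
    items = sorted(next(iter(population)))
    return {
        x: sum(p for ranking, p in population.items() if ranking[0] == x)
        for x in items
    }


print(pairwise_probs(POP_1) == pairwise_probs(POP_2))            # True
print(best_of_three_probs(POP_1) == best_of_three_probs(POP_2))  # False
```

    A survey that only asks “A or B?” style questions cannot tell these two populations apart, while a single best-of-three question can.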

    “This paper provides a crucial breakthrough,” says Emma Frejinger, a computer scientist at the University of Montreal. “It mathematically proves why traditional data collection fails and demonstrates that simply asking users for their best-of-three choices unlocks the ability to accurately train these powerful models.”

    The work has direct implications for AI alignment. Large language models (LLMs) are often trained by having humans rank candidate outputs — a process that can be made far more effective by using triplet comparisons. As Daskalakis notes, “RUMs play a central role in the commercial viability and usefulness of LLMs.”

    The team’s findings also show that the number of experiments needed does not grow exponentially with the number of items in a catalog, making the approach practical for real-world applications like streaming services, e-commerce, and political polling.

    “This finding provides a highly practical roadmap for collecting better data to drive more accurate optimizations,” adds Frejinger.

    Looking ahead, the MIT researchers believe that building and refining utility models will remain a vibrant area of research, critical to aligning AI systems with human preferences and to sustaining the internet economy.

  • Generalists Outperform Specialists in Certain Game Theory Scenarios, MIT Study Finds

    In a surprising twist that challenges long-held assumptions in game theory, a new MIT study shows that general-purpose algorithms called policy gradient methods can outperform specialized game-theoretic algorithms in certain imperfect-information games. The findings, presented at the International Conference on Learning Representations in Rio de Janeiro, could reshape how artificial intelligence agents are trained to make decisions in competitive, real-world scenarios.

    Imperfect-information games, in which players don’t know everything about their opponents, are common in life, from poker and bidding wars to military operations and financial negotiations. For decades, the prevailing belief was that algorithms specifically designed for these games, grounded in game theory, would always outshine general-purpose alternatives. However, the MIT-led team discovered that policy gradient methods, originally developed in the 1990s for single-agent decision-making, often perform better and more efficiently.
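
    For readers unfamiliar with the term, a policy gradient method nudges the parameters of a decision policy in the direction that increases expected reward. The sketch below is a deliberately simplified single-agent case, not the study's setup: one decision, three actions with made-up rewards, a softmax policy, and the exact gradient instead of the sampled estimates that practical methods rely on.

```python
import math


def softmax(logits):
    """Turn arbitrary scores into action probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_ascent(rewards, steps=500, learning_rate=1.0):
    """Gradient ascent on expected reward with respect to the policy logits."""
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        # For a softmax policy:
        # d(expected reward) / d(logit_k) = probs[k] * (rewards[k] - expected)
        logits = [
            z + learning_rate * p * (r - expected)
            for z, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)


# Hypothetical rewards for three actions; the last one is the best.
final_policy = policy_gradient_ascent([0.2, 0.5, 1.0])
print(final_policy)
```

    The policy starts uniform and gradually concentrates on the highest-reward action. Nothing in the update rule is specific to games, which is what makes the method general-purpose.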

    The researchers created a benchmark to fairly evaluate different algorithms, measuring performance through a concept called exploitability: how much a player stands to lose against a worst-case adversary, with lower scores being better. In experiments involving five games, including Phantom Tic-Tac-Toe, imperfect-information Hex, and Liar’s Dice, neural networks trained with policy gradient algorithms consistently achieved lower exploitability scores than those trained with game-theoretic algorithms.
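
    To make exploitability concrete, the sketch below computes it for rock-paper-scissors, a far smaller game than those in the benchmark. It assumes a simplified definition that works for this symmetric zero-sum game, whose fair value is zero: the exploitability of a mixed strategy is the best payoff an opponent can earn by responding to it perfectly. This is an illustration only and is not the benchmark's code.

```python
# Row player's payoff in rock-paper-scissors; the column player gets the negative.
# Action order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def exploitability(strategy):
    """Best payoff a worst-case opponent can earn against a mixed strategy."""
    return max(
        # Opponent's expected payoff for answering with pure action j.
        -sum(p * PAYOFF[i][j] for i, p in enumerate(strategy))
        for j in range(3)
    )


print(exploitability([1 / 3, 1 / 3, 1 / 3]))  # uniform play cannot be exploited
print(exploitability([1.0, 0.0, 0.0]))        # always playing rock loses to paper
```

    Uniform play scores zero, while always playing rock scores 1, the worst possible value in this game.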

    “Our study showed that policy gradient methods can work better than these specialized algorithms, and that the specialized algorithms may not work as well as people thought,” said Samuel Sokota, a co-author from Carnegie Mellon University. The team’s benchmarking software, which they have made freely available, allows others to test and compare algorithms with just a single line of code added to the OpenSpiel library.

    The implications extend far beyond board games. “Hidden information is a very important property of the world,” said Eugene Vinitsky of New York University, another co-author. “It pervades military operations, trading scenarios, and negotiations—all of which are carried out under conditions of hidden information. The idea that we can improve on these games suggests that we can also do better in these other settings as well.”

    Ian Gemp, a computer scientist and game theory expert at Google DeepMind not involved in the study, called the results encouraging: “This work serves as a compelling reminder that modernizing classical tools remains a highly productive path for solving complex strategic problems.”