  • Patronus AI Secures $50M to Create Simulated Environments for AI Agent Testing

    Patronus AI has raised $50 million in a Series B funding round to expand its Digital World Models, which simulate websites, software tools, and internal platforms for testing autonomous AI agents. The company plans to use the capital to grow its research and engineering teams and strengthen the computing infrastructure behind its evaluation systems.

    Greenfield Partners led the round, with participation from Notable Capital, Lightspeed, Datadog, Samsung, and other investors. This brings the startup’s total funding to $70 million. Patronus AI, founded in 2023 by former Meta AI researchers Anand Kannappan and Rebecca Qian, focuses on evaluating how AI agents perform in realistic, dynamic environments rather than relying on static benchmarks.

    The company’s Digital World Models use reinforcement learning to reward agents for correct task completion and penalize errors. This approach helps developers study repeated behavior, identify failures, and ensure agents follow instructions without taking shortcuts. According to Glenn Solomon, managing director at Notable Capital, “Patronus is really good at spotting the hacks and making sure they are holding the models accountable.”
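
    The reward-and-penalty idea above can be illustrated with a toy scoring function. This is a minimal sketch under stated assumptions, not Patronus AI's implementation: the `Episode` record, the `FORBIDDEN_ACTIONS` set, and the `score_episode` function are all hypothetical names invented for illustration. It rewards an agent by its test pass rate and subtracts a fixed penalty for each action that counts as a shortcut.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    """Hypothetical record of one agent run in a simulated environment."""
    tests_passed: int
    tests_total: int
    actions: list[str] = field(default_factory=list)


# Hypothetical shortcut actions: they can turn the tests green without solving the task.
FORBIDDEN_ACTIONS = {"edit_test_file", "skip_tests", "hardcode_expected_output"}


def score_episode(episode: Episode, shortcut_penalty: float = 1.0) -> float:
    """Return the test pass rate minus a fixed penalty per forbidden action."""
    if episode.tests_total <= 0:
        raise ValueError("tests_total must be positive")
    pass_rate = episode.tests_passed / episode.tests_total
    shortcuts = sum(1 for action in episode.actions if action in FORBIDDEN_ACTIONS)
    return pass_rate - shortcut_penalty * shortcuts


honest = Episode(tests_passed=8, tests_total=10,
                 actions=["read_file", "edit_source", "run_tests"])
hacked = Episode(tests_passed=10, tests_total=10,
                 actions=["edit_test_file", "run_tests"])

print(score_episode(honest))  # 0.8
print(score_episode(hacked))  # 0.0
```

    In this sketch the run that passes every test by editing the test file scores lower than the honest run that passes eight of ten, which is the kind of accountability the quote describes.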

    Patronus AI reported that its revenue grew 15-fold over the past year, with frontier AI labs and newer AI companies using its evaluation systems. The startup currently builds simulations for software engineering and finance, where results can be verified through code tests and account records. It plans to expand into longer, more complex tasks that span hours, days, or even weeks, with the aim of tracking agent behavior without human review at every step.

    Co-founder Anand Kannappan said the company is concentrating on verifiable problems for now, but noted that many fields include tasks where correct results are difficult to confirm. The company’s method is comparable to the synthetic testing used in self-driving car development, where virtual settings expose systems to rare or risky events before real-world deployment.