Anthropic Seeks to Fund Advanced AI Benchmark Development

Mihailo Obradovic
Published: July 2, 2024
Updated: July 2, 2024

Anthropic is launching a program to fund the development of new benchmarks to evaluate AI models’ performance and impact, including generative models like its own Claude.

Unveiled on Monday, Anthropic’s program will provide payments to third-party organizations that can “effectively measure advanced capabilities in AI models,” according to a company blog post. Applications will be accepted on a rolling basis.

“Our investment in these evaluations aims to elevate the entire field of AI safety, providing valuable tools for the whole ecosystem,” Anthropic stated. “Developing high-quality, safety-relevant evaluations is challenging, and demand is outpacing supply.”

AI currently has a benchmarking problem. The most commonly cited benchmarks fail to capture how the average person uses the systems being tested. Some benchmarks, especially those predating modern generative AI, may not measure what they claim to.

Anthropic proposes creating challenging benchmarks focusing on AI security and societal implications using new tools, infrastructure, and methods.

The company calls for tests that assess a model’s ability to carry out tasks such as conducting cyberattacks, enhancing weapons of mass destruction, and manipulating or deceiving people. For AI risks related to national security, Anthropic says it is committed to developing an “early warning system” for identifying and assessing risks, though the blog post does not provide details.

Anthropic also aims to support research into benchmarks and “end-to-end” tasks probing AI’s potential to aid scientific study, converse in multiple languages, mitigate bias, and self-censor toxicity.

To achieve this, Anthropic envisions new platforms for subject-matter experts to develop evaluations and large-scale model trials involving “thousands” of users. A full-time coordinator has been hired for the program, and the company may purchase or expand promising projects.

“We offer a range of funding options tailored to each project’s needs and stage,” Anthropic writes, without providing further details. “Teams will interact directly with Anthropic’s domain experts from various relevant teams.”

Anthropic’s effort to support new AI benchmarks is commendable, assuming sufficient resources are allocated. However, given the company’s commercial ambitions in the AI race, the program may be difficult to trust completely.

Anthropic is transparent about wanting certain evaluations to align with its AI safety classifications, developed with input from third parties like the nonprofit AI research organization METR. This is within the company’s prerogative but may require applicants to accept definitions of “safe” or “risky” AI they might not agree with.

Some in the AI community may also take issue with Anthropic’s references to “catastrophic” and “deceptive” AI risks, such as risks involving nuclear weapons. Many experts argue there’s little evidence that AI will gain world-ending, human-outsmarting capabilities soon, if ever. Claims of imminent “superintelligence” may distract from pressing AI regulatory issues such as AI’s hallucinatory tendencies.

Anthropic hopes its program will be “a catalyst for progress towards a future where comprehensive AI evaluation is an industry standard.” While many open, corporate-unaffiliated efforts to create better AI benchmarks may identify with this mission, it remains to be seen whether they will join forces with an AI vendor ultimately loyal to shareholders.

