Bo Li, an associate professor at the University of Chicago who specializes in stress-testing AI models to uncover misbehavior, has become a go-to source for some consulting firms. These firms are often less concerned with how intelligent AI models are than with how problematic they can be from a legal, ethical, and regulatory-compliance perspective.
Li and colleagues from several other universities, along with Virtue AI, a company Li cofounded, and Lapis Laboratories, recently developed a taxonomy of AI risks along with a benchmark that reveals the extent to which different large language models violate regulations. “We need some principles for AI safety, in terms of regulatory compliance and ordinary use,” Li tells WIRED.
The researchers analyzed government regulations and guidelines on artificial intelligence, including those in the US, China, and the EU, and studied the usage policies of 16 major AI companies around the world.
The researchers also built AIR-Bench 2024, a benchmark that uses thousands of prompts to determine how popular AI models fare on specific risks. It shows, for example, that Anthropic’s Claude 3 Opus ranks highly when it comes to refusing to generate cybersecurity threats, while Google’s Gemini 1.5 Pro ranks highly when it comes to avoiding generating nonconsensual sexual nudity.
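As a rough illustration of how a benchmark like this can score models by risk category, here is a minimal Python sketch: it runs risk-tagged prompts through a model and tallies how often the model refuses. Everything here is an assumption for illustration, including the `query_model` stub, the sample prompts, and the keyword-based refusal check; it is not the researchers’ actual methodology.

```python
from collections import defaultdict

# Crude heuristic for spotting refusals in a model's reply (assumption, not AIR-Bench's scoring).
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm not able to")

def query_model(prompt: str) -> str:
    # Stand-in for a real API call to the model under test; it always refuses
    # here so the sketch runs end to end.
    return "I can't help with that request."

def refusal_rates(prompts):
    # prompts: iterable of (risk_category, prompt_text) pairs
    refused = defaultdict(int)
    total = defaultdict(int)
    for category, text in prompts:
        reply = query_model(text).lower()
        total[category] += 1
        if any(marker in reply for marker in REFUSAL_MARKERS):
            refused[category] += 1
    # Fraction of prompts refused per risk category: higher means safer on that risk.
    return {category: refused[category] / total[category] for category in total}

sample_prompts = [
    ("cybersecurity threats", "<red-team prompt asking for malware>"),
    ("nonconsensual imagery", "<red-team prompt asking for explicit content>"),
]
print(refusal_rates(sample_prompts))
# -> {'cybersecurity threats': 1.0, 'nonconsensual imagery': 1.0}
```

A model that refuses most prompts in a given category would rank high on that risk, the way Claude 3 Opus does for cybersecurity threats in the researchers’ results.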
DBRX Instruct, a model developed by Databricks, scored worst across the board. When Databricks released the model in March, it said it would continue to improve DBRX Instruct’s safety features.
Anthropic, Google and Databricks did not immediately respond to a request for comment.
Understanding the risk landscape, as well as the pros and cons of specific models, may become increasingly important for companies looking to deploy AI in certain markets or for certain use cases. A company looking to use an LLM for customer service, for example, might be more concerned about a model’s propensity to produce offensive language when provoked than its ability to design a nuclear device.
Li says the analysis also reveals some interesting issues with how AI is being developed and regulated. For example, the researchers found that government rules are generally less comprehensive than companies’ own policies, suggesting there is room for regulations to be tightened.
The analysis also suggests that some companies could do more to ensure the safety of their models. “If some models are tested based on a company’s policies, they don’t necessarily meet the standards,” Li says. “This means they have a lot of room for improvement.”
Other researchers are trying to bring order to a chaotic and confusing AI risk landscape. This week, two MIT researchers revealed their own database of AI dangers, compiled from 43 different AI risk frameworks. “Many organizations are still in the early stages of the AI adoption process,” meaning they need guidance on potential dangers, says Neil Thompson, a research scientist at MIT who is involved in the project.
Peter Slattery, lead on the project and a researcher with MIT’s FutureTech group, which studies progress in computing, says the database highlights the fact that some AI risks get more attention than others. More than 70 percent of the frameworks mention privacy and security issues, for example, but only about 40 percent refer to misinformation.
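Those coverage figures amount to a simple tally across frameworks. The short sketch below shows the arithmetic, with invented framework names and risk categories standing in for the repository’s actual 43 frameworks.

```python
# Invented data for illustration only: which risk categories each framework mentions.
frameworks = {
    "Framework A": {"privacy & security", "misinformation", "bias"},
    "Framework B": {"privacy & security", "bias"},
    "Framework C": {"privacy & security"},
}

# Share of frameworks that mention each category.
all_categories = set().union(*frameworks.values())
coverage = {
    category: sum(category in risks for risks in frameworks.values()) / len(frameworks)
    for category in all_categories
}

for category, share in sorted(coverage.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {share:.0%} of frameworks mention it")
```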
Efforts to catalog and measure AI risks will need to evolve as AI does. Li says it will be important to explore emerging issues such as the emotional rigidity of AI models. Li’s company recently analyzed the largest, most powerful version of Meta’s Llama 3.1 model and found that while the model is more capable, it is not much safer, something that reflects a broader disconnect. “Safety is really not improving significantly,” Li says.