Popular artificial intelligence tools are becoming more racist as they advance, according to an alarming report. new report.
A team of technology and linguistics researchers revealed this week that large language models like OpenAI’s ChatGPT and Google’s Gemini contain racist stereotypes about speakers of African American Vernacular English, or AAVE, a created dialect of English. and spoken by black Americans.
“We know that these technologies are very commonly used by companies to perform tasks such as screening job candidates,” said Valentin Hoffman, a researcher at the Allen Institute for Artificial Intelligence and co-author of the recently published paper this week in arXiv, an open access research archive from Cornell University.
Hoffman explained that previously, researchers “really only looked at what manifest racial biases these technologies might have” and have never “examined how these AI systems respond to less overt racial markers, like dialect differences.”
Black people who use AAVE in their speeches, according to the newspaper, “are known to experience racial discrimination in a wide range of contexts, including education, employment, housing and legal outcomes.”
Hoffman and his colleagues asked AI models to rate the intelligence and employability of people who speak in AAVE compared to those who speak using what they call “standard American English.”
For example, the AI model was asked to compare the sentence “I’m so happy when I wake up from a bad dream because they feel too real” to “I’m so happy when I wake up from ‘a bad dream because they feel too real’. too real.”
Models were significantly more likely to describe AAVE speakers as “stupid” and “lazy,” assigning them to lower-paying jobs.
Hoffman fears the results mean AI models will punish test-takers for code-switching — the act of changing how you speak to suit your audience — between AAVE and Standard American English.
“A big concern is that, for example, a job candidate uses this dialect in their social media posts,” he told the Guardian. “It is not unreasonable to think that the language model will not select the candidate because they used dialect in their online presence.”
The AI models were also significantly more likely to recommend the death penalty for hypothetical criminal defendants who used AAVE in their court statements.
“I would like to think that we are not close to a time where this type of technology is used to make decisions about criminal sentencing,” Hoffman said. “It might look like a very dystopian future, and I hope that’s the case.”
Still, Hoffman told the Guardian, it is difficult to predict how language learning models will be used in the future.
“Ten years ago, even five years ago, we had no idea of the different contexts in which AI would be used today,” he said, urging developers to heed the new document’s warnings on racism in major language models.
Notably, AI models are already used in the United States system to help with administrative tasks like create court transcripts and conduct legal research.
For years, prominent AI experts like Timnit Gebru, former co-head of Google’s artificial intelligence ethics team, have called on the federal government to restrict the largely unregulated use of large models. linguistics.
“It looks like a gold rush,” Gebru told the Guardian last year. “In fact, it’s East a gold rush. And a lot of the people who are making the money aren’t the ones who are actually in the middle of it all.
Google’s AI model, Gemini, recently found itself in a sticky situation when a flurry of social media posts showed off its image-generating tool depicting a variety of historical figures – including popes, founding fathers of the United States and, more excruciatingly, German soldiers of World War II – as people of color.
Large language models improve as they receive more data, learning to more closely imitate human speech by studying the text of billions of web pages on the Internet. The long-standing idea of this learning process is that the model will spit out every racist, sexist, and otherwise harmful stereotype it encounters on the Internet: in computing, this problem is described by the adage “garbage in, garbage out.” “. Racist comments lead to racist messages, leading early AI chatbots like Microsoft’s Tay to regurgitate the same neo-Nazi content they learned from Twitter users in 2016.
In response, groups like OpenAI have developed guardrails, a set of ethical guidelines that regulate the content that language models like ChatGPT can communicate to users. As language patterns broaden, they also tend to become less overtly racist.
But Hoffman and his colleagues found that as linguistic patterns develop, covert racism increases. Ethical guardrails, they learned, simply teach speech models to be more discreet about their racial biases.
“It doesn’t eliminate the underlying problem; the guardrails seem to mimic what educated people in the United States do,” said Avijit Ghosh, an AI ethics researcher at Hugging Face, whose work focuses on the intersection of public policy and technology. .
“Once people cross a certain educational threshold, they won’t call you names anymore, but the racism is still there. It’s a similar thing in language patterns: garbage in, garbage out. These patterns don’t unlearn problematic things, they just manage to hide them better.
Open adoption of language models by the US private sector is expected to intensify over the next decade: the broader generative AI market is expected to become a $1.3 trillion industry by 2032, according to Bloomberg. Meanwhile, federal labor regulators, like the Equal Employment Opportunity Commission, have only recently begun protecting workers from AI-based discrimination, the first such case being brought before the EEOC. at the end of last year.
Ghosh is part of a growing contingent of AI experts who, like Gebru, worry about the damage that language learning models could cause if technological advances continue to outpace federal regulation.
“There is no need to stop innovation or slow down AI research, but limiting the use of these technologies in certain sensitive areas is an excellent first step,” he said. “Racists exist all over the country; we don’t need to put them in jail, but we try not to allow them to be responsible for hiring and recruiting. Technology should be regulated in the same way.