YouTube moderation bots punish videos tagged as "gay" or "lesbian," study finds

A new investigation by a coalition of YouTube creators and researchers accuses YouTube of relying on a system of "intolerant bots" to determine whether certain content should be demonetized, especially LGBTQ videos.


The study was conducted by three people: Sealow, the CEO of research agency Ocelot AI; YouTube creator Andrew, who operates the YouTube Analyzed channel; and Een of the popular YouTube commentary and research channel Nerd City.

The research was fueled by an interest in seeing which words were automatically demonetized by YouTube's machine learning bots, as concerns about transparency between executives and YouTubers grew within the creator community. Andrew manually tested 15,300 words between June 2 and July 5, 2019, using the most common terms from Webster's Dictionary, UrbanDictionary, and Google search results. The second round of experiments ran from July 6 to July 21 and covered 14,000 words, tested in an automated process by Sealow using the YouTube Data API. Een worked with his own sources and helped produce the main video.

Andrew, Sealow, and Een each released their own videos about the findings, along with an Excel sheet of all the words they used and a white paper analyzing their findings. These words were used to test what YouTube's bots automatically consider inappropriate for monetization. The team discovered that when words like "gay" and "lesbian" were changed to random words such as "happy," the "status of the video changed to advertiser-friendly" every time, Een says in his video.
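The word-swap comparison the team describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `check_status` is a hypothetical stand-in for uploading a test video and waiting for a monetization decision, and `toy_check` is a rule invented for the example that does not model YouTube's system.

```python
from typing import Callable

# Hypothetical stand-in for "upload the test video and wait for a decision".
StatusCheck = Callable[[str], str]


def swap_flips_status(title: str, word: str, replacement: str,
                      check_status: StatusCheck) -> bool:
    """Return True if replacing `word` alone turns a demonetized title monetized."""
    before = check_status(title)
    after = check_status(title.replace(word, replacement))
    return before == "demonetized" and after == "monetized"


def toy_check(title: str) -> str:
    # Toy rule invented for this example; it does not model YouTube's system.
    return "demonetized" if "flaggedword" in title.lower() else "monetized"


print(swap_flips_status("my flaggedword vlog", "flaggedword", "happy", toy_check))
```

Because everything except the single word is held constant, a change in status can be attributed to that word rather than to the rest of the video.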

Reached by The Verge, a YouTube spokesperson denied that there is a list of LGBTQ words that trigger demonetization, despite the findings of the investigation. The spokesperson added that the company "constantly evaluates our systems to ensure that they reflect our policies without unfair bias."

"We are proud of the incredible LGBTQ+ voices on our platform and take such concerns very seriously," said the spokesperson. "We use machine learning to evaluate content against our guidelines for advertisers. Sometimes our systems are wrong, which is why we have encouraged creators to appeal. Successful appeals ensure that our systems are updated to keep getting better."

YouTube's automated demonetization systems are based on many signals, but the company says there is no specific list built into its machine learning system. The company confirmed that it tests videos from LGBTQ creators when new monetization classifiers are introduced to ensure that LGBTQ videos are not more likely to be demonetized. But the company claims that the current evaluation system, which uses professional human moderators, properly reflects the company's policy on LGBTQ terms.


But the researchers' findings suggest that a considerable bias is at work before the human moderators get involved. Their research led them to conclude that YouTube's machine learning bots, which are used specifically to assess whether a video is eligible to earn money, use a "hidden confidence level of 0 to 1." Videos scoring closer to zero are approved for monetization, while those closer to one are demonetized. If a video scores above YouTube's threshold, it is immediately demonetized and must be reviewed manually.

"YouTube's classifiers have been trained to try to predict how likely it is that a video will be demonetized based on training data (based on previous manual review results)," Sealow told The Verge. "So a score of 1 is 100 percent confident that it must be demonetized, while 0.5 is 50 percent, and so on. YouTube had to set a certain acceptable threshold, say, '35 percent confidence,' where any video with a score higher than 0.35 is demonetized and requires a manual review before it is approved for monetization."

In analyzing their findings, Sealow states that "the list can best be interpreted as a list of negatively charged keywords because certain words are considered more serious than others."
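The thresholding Sealow describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 0.35 cutoff is the hypothetical figure from his quote, not a confirmed YouTube value, and the function name and sample scores are invented for the example.

```python
# Hypothetical cutoff taken from Sealow's "35 percent confidence" example;
# YouTube's real threshold is not public.
DEMONETIZATION_THRESHOLD = 0.35


def monetization_status(score: float,
                        threshold: float = DEMONETIZATION_THRESHOLD) -> str:
    """Map a classifier confidence score in [0, 1] to a monetization decision."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # Scores strictly above the threshold are demonetized pending manual review.
    if score > threshold:
        return "demonetized (manual review required)"
    return "monetized"


# Invented scores, for illustration only.
for score in (0.10, 0.35, 0.50, 1.00):
    print(f"{score:.2f} -> {monetization_status(score)}")
```

Under this reading, a keyword does not need to appear on any explicit list to cause demonetization; it only needs to push the predicted score past the cutoff.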

Each video uploaded for testing purposes lasted between one and two seconds and "contained no visual or audio content that could cause demonetization," the report reads. The wait for a monetization approval or denial was around two hours. Words related to the LGBTQ community or terms used in commentary, such as "democrat" or "liberal," are "likely to be negatively charged because of their use in political commentary that is often considered non-advertiser friendly," the report says.

"Exactly the same videos are monetized without the LGBTQ terminology," says Sealow in his video. "This is not a matter of LGBTQ personalities being demonetized for something that everyone would be demonetized for, such as sex or tragedy. This is LGBTQ terminology such as 'gay' and 'lesbian' being the only reason why a video is demonetized despite the context."

Accusations in the video are not new, but the study is the most extensive. YouTube executives, including CEO Susan Wojcicki and chief product officer Neal Mohan, have spoken about concerns that certain keywords in metadata and titles lead to automatic demonetization. It is a particularly common problem within the LGBTQ community. YouTube has categorically denied that there is a policy "that says, 'If you put certain words in a title that will be demonetized,'" as Wojcicki told YouTuber Alfie Deyes in a long interview in August.

"We work incredibly hard to ensure that when our machines learn something, because many of our decisions are made algorithmically, our machines are fair," added Wojcicki. "There shouldn't be [any automatic demonetization]."


This has not prevented video creators from using coded language in their videos and including Google Docs in their comment sections to communicate with viewers. YouTuber Petty Paige flashes the infamous yellow dollar sign, a symbol that both creators and audiences know means a video has been demonetized, to signal that her fans should read the document below to understand why she uses specific words. She theorized that, as with many other LGBTQ personalities, the use of words such as "lesbian" or "transgender" can lead to demonetization. Swapping those terms for other random words did not seem to.

"It's just as discriminating if you never say this, and even more exploitative if you do," said Een.

Earlier this summer, a number of LGBTQ creators filed a lawsuit against YouTube for alleged discriminatory practices, including unfairly demonetizing content that contained LGBTQ-friendly terms. The lawsuit also claims that YouTube actively hurts their channels' view counts by placing videos in restricted mode, something for which the company has previously apologized, and thereby limits their ability to make money. The lawsuit alleges that "YouTube engages in discriminatory, anti-competitive, and unlawful conduct that damages a protected group of individuals under California law."

"We are tired of being reassured with clear lies and hollow promises that they have solved it or are going to fix it," Chris Knight, who co-hosts an LGBTQ YouTube news show, GNews!, told The Verge at the time. "It is clearly broken. There is clearly a bias in their AI, their policy. What we really want is that they change."

Sealow and Een state that they do not believe that YouTube or Wojcicki are homophobic or deliberately apply homophobic practices. They add that the problem is specifically not the result of YouTube policies or of "a lack of programs to reduce algorithmic discrimination."

"It is simply the result of the probabilistic nature of the machine learning classifiers used by the demonetization bot," Sealow's report adds.