Algorithms similar to those used by Netflix, Amazon and Facebook have shown that they can decipher the ‘biological language’ of cancer, Alzheimer’s and other neurodegenerative diseases.
Researchers trained a large-scale language model, built on the kind of AI used for recommendations, to examine what happens when something goes wrong with proteins in a way that leads to disease.
In the work, carried out at St John’s College, University of Cambridge, the researchers programmed the algorithm to learn the language of shape-shifting droplets of proteins found in cells, in order to understand how they function and malfunction.
By learning the language of these protein droplets, the team hopes to “correct the grammatical errors in cells that cause disease”.
Professor Tuomas Knowles, a Fellow at St John’s College, said: “Any defects related to these protein droplets can lead to diseases such as cancer.
“That is why bringing natural language processing technology into research into the molecular origins of protein disturbances is vital if we are to correct the grammatical errors in cells that cause disease.”
Machine learning technology has taken the tech industry by storm: Netflix uses it to recommend shows, Facebook uses it to suggest friends, and Amazon’s Alexa uses it to recognise people by their voice.
The medical world, however, is applying the technology in ways that could save lives.
“Using machine learning technology in neurodegenerative disease and cancer research is an absolute game-changer,” said Knowles, the lead author of the study.
“Ultimately, the goal will be to use artificial intelligence to develop targeted drugs to dramatically relieve symptoms or prevent dementia in the first place.”
Dr. Kadi Liis Saar, first author of the paper and a Research Fellow at St John’s College, trained the large-scale language model to uncover the secrets of proteins.
She said: “The human body is home to thousands and thousands of proteins, and scientists don’t yet know the function of many of them. We asked a neural network-based language model to learn the language of proteins.
“We specifically asked the program to learn the language of shape-shifting biomolecular condensates, droplets of proteins found in cells, which scientists really need to understand in order to crack the language of biological function and of the malfunctions that cause cancer and neurodegenerative diseases such as Alzheimer’s.
“We found that it could learn, without being explicitly told, what scientists have discovered over decades of research about the language of proteins.”
Proteins play a number of key roles in the body, and most of their work is done in cells: they are needed for the structure, function and regulation of the body’s tissues and organs.
Alzheimer’s, Parkinson’s and Huntington’s are three of the most common neurodegenerative diseases, but scientists believe there are hundreds of them.
In Alzheimer’s disease, which affects 50 million people worldwide, proteins go rogue, form clumps and kill healthy nerve cells.
A healthy brain has a quality control system that effectively disposes of these potentially dangerous masses of proteins, also known as aggregates.
Scientists now think that some disordered proteins also form liquid-like droplets of proteins called condensates that have no membrane and fuse freely with each other.
Unlike protein aggregates, which are irreversible, protein condensates can form and reform and are often compared to blobs of shape-shifting wax in lava lamps.
“Protein condensates have recently attracted a lot of attention in the scientific world because they regulate important events in the cell, such as gene expression – how our DNA is converted into proteins – and protein synthesis – how the cells make proteins,” Knowles said.
“Any defects associated with these protein droplets can lead to diseases such as cancer. Therefore, it is essential to bring natural language processing technology into research on the molecular origin of protein disturbances if we are to correct the grammatical errors in cells that cause disease.”
Machine learning technology is evolving at a rapid pace due to growing data availability, increased computing power and technical advancements that have led to more powerful algorithms.
“We fed the algorithm all the data on known proteins so that it could learn and predict the language of proteins, in the same way that these models learn about human language and that WhatsApp knows which words to suggest,” said Dr. Saar.
“Then we could ask it about the specific grammar that leads only some proteins to form condensates in cells. It is a very challenging problem, and unlocking it will help us learn the rules of the language of disease.”
Further use of machine learning could transform future research on cancer and neurodegenerative diseases.
Discoveries could be made that go beyond what scientists currently know and speculate about diseases, and perhaps even beyond what the human brain can grasp without the aid of machine learning.
“Machine learning can be free of the limitations of what researchers think are the targets of scientific exploration, and it will mean finding new connections that we haven’t even conceived of yet,” explained Dr. Saar.
“It’s really exciting.”