The technology to decode our thoughts is getting closer and closer. Neuroscientists at the University of Texas have for the first time decoded data from non-invasive brain scans and used it to reconstruct language and meaning from stories people hear, see, or even imagine.
In a new study published in Nature Neuroscience, Alexander Huth and colleagues have successfully recovered the gist of language, and sometimes exact sentences, from functional magnetic resonance imaging (fMRI) brain recordings of three participants.
Technology that can create language from brain signals could be immensely helpful for people who can’t talk due to conditions like motor neuron disease. At the same time, it raises concerns about the future privacy of our thoughts.
Language decoding models, also called “speech decoders”, aim to use recordings of a person’s brain activity to discover the words they hear, imagine or say.
So far, speech decoders have only been used with data from devices surgically implanted in the brain, which limits their usefulness. Other decoders using non-invasive recordings of brain activity were able to decode single words or short sentences, but not continuous language.
Read more: We’ve been connecting brains to computers for longer than you might expect. These 3 companies are leading the way
The new study used the blood-oxygen-level-dependent (BOLD) signal from fMRI scans, which shows changes in blood flow and oxygenation levels in different parts of the brain. By focusing on activity patterns in brain regions and networks that process language, the researchers found that their decoder could be trained to continuously reconstruct language (including some specific words and the general meaning of sentences).
Specifically, the decoder took the brain responses of three participants as they listened to stories, generating word sequences that likely triggered those brain responses. These word sequences did a good job of capturing the general gist of the stories, and in some cases contained exact words and phrases.
The researchers also had the participants watch silent movies and make up stories while being scanned. In both cases, the decoder often managed to predict the core of the stories.
For example, one user thought “I don’t have my driver’s license yet”, and the decoder predicted “she hasn’t even started learning to drive yet”.
Further, when participants actively listened to one story while ignoring another story that was playing at the same time, the decoder was able to identify the meaning of the story that was being actively listened to.
How does it work?
The researchers started by having each participant lie in an fMRI scanner and listen to 16 hours of narrated stories while their brain responses were recorded.
These brain responses were then used to make an encoder – a computational model that attempts to predict how the brain will respond to words a person hears. After training, the encoder was able to predict quite accurately how each participant’s brain signals would respond to hearing a particular sequence of words.
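The study itself does not publish its code in this article, but the idea of an encoder – a model trained to map semantic features of heard words onto recorded brain responses – can be sketched as a simple regression. The toy example below is purely illustrative: the data is simulated, the feature extraction from a real language model is replaced by random stand-in features, and ridge regression is one plausible choice of fitting method, not necessarily the authors' exact approach.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated training data (stand-ins for real recordings):
# X: semantic features of the words heard at each scan timepoint
# Y: BOLD responses recorded at each voxel
n_timepoints, n_features, n_voxels = 200, 50, 300
X = rng.normal(size=(n_timepoints, n_features))
true_weights = rng.normal(size=(n_features, n_voxels))
Y = X @ true_weights + 0.1 * rng.normal(size=(n_timepoints, n_voxels))

# Fit the encoder with ridge regression:
# W = (X^T X + lambda I)^-1 X^T Y
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)

# Once trained, the encoder predicts brain responses for word features
Y_pred = X @ W
r = np.corrcoef(Y.ravel(), Y_pred.ravel())[0, 1]
print(f"prediction correlation: {r:.2f}")
```

On this noisy simulated data the fitted encoder predicts the responses almost perfectly; with real fMRI data the fit is much weaker, but the same "features in, predicted brain activity out" structure applies.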
Going in the opposite direction — from registered brain responses to words — is trickier, though.
The encoder model is designed to link brain responses to “semantic features,” or the broad meanings of words and sentences. The system uses the original GPT language model, the predecessor of the current GPT-4 model. The decoder then generates strings of words that may have triggered the observed brain responses.
The accuracy of each “guess” is then checked by using it to predict previously recorded brain activity, and comparing that prediction with the activity actually recorded.
During this resource-intensive process, multiple guesses are generated at once and ranked in order of accuracy. Bad guesses are discarded and good ones kept. The process continues by guessing the next word in the sequence, and so on until the most accurate sequence is determined.
Words and meanings
The study found that data from multiple, specific brain regions — including the speech network, the parietal-temporal-occipital association area and the prefrontal cortex — were needed for the most accurate predictions.
An important difference between this work and previous efforts is the data that is decoded. Most decoding systems link brain data to motor characteristics or activity recorded from brain regions involved in the last step of speech output, the movement of the mouth and tongue. This decoder instead works at the level of ideas and meanings.
A limitation of using fMRI data is the low “temporal resolution”. The signal that depends on blood oxygen levels rises and falls over a period of about 10 seconds, during which time a person may have heard as many as 20 or more words. As a result, this technique cannot detect individual words, but only the possible meanings of word strings.
No need for privacy panic (yet)
The idea of technology that can “read minds” raises concerns about mental privacy. The researchers conducted additional experiments to address some of these concerns.
These experiments showed that we need not yet worry about our thoughts being decoded as we walk down the street, or indeed being decoded at all without our extensive cooperation.
A decoder trained on one person’s thoughts performed poorly at predicting the semantic details of another participant’s data. In addition, participants could interfere with decoding by diverting their attention to another task, such as naming animals or telling a different story.
Read more: Our neurodata may reveal our most personal selves. How will it be protected now that brain implants are becoming commonplace?
Motion in the scanner can also interfere with the decoder, as fMRI is highly sensitive to motion, so participant cooperation is essential. Given these requirements and the need for powerful computational resources, it is highly unlikely that anyone’s thoughts can be decoded against their will at this stage.
Finally, the decoder currently does not work on data other than fMRI, which is an expensive and often impractical procedure. The group plans to test their approach on other non-invasive brain data in the future.