A surprising level of accuracy may be achieved by an AI when decoding speech from brain activity.

ISRDO Team 16 Apr, 2023 - in Computer Science and Engineering
An artificial intelligence can decipher words and sentences from brain activity with startling, though still severely limited, accuracy. Using only a few seconds' worth of brain activity data, the AI guesses what a person has heard. In a pilot study, the researchers found that it places the correct answer among its top 10 possibilities up to 73 percent of the time.

The AI's "performance was above what many people felt was conceivable at this level," says Giovanni Di Liberto, a computer scientist at Trinity College Dublin who was not involved in the study.

Developed at Meta, the parent company of Facebook, the AI could eventually be used to help thousands of people around the world who are unable to communicate through speech, typing or gestures, the researchers reported August 25 on arXiv.org. That includes many patients who are minimally conscious, locked-in or in what used to be called a "vegetative state," now generally referred to as unresponsive wakefulness syndrome (SN: 2/8/19).

Most of the currently available technologies to help such patients communicate require risky brain surgeries to implant electrodes. This new approach "could provide a viable path to help patients with communication deficits... without the use of invasive methods," says neuroscientist Jean-Rémi King, a Meta AI researcher currently based at the École Normale Supérieure in Paris.

King and his colleagues trained the computational tool to recognise words and phrases on 56,000 hours of audio recordings spanning 53 languages. The tool, known as a language model, learned to pick out specific features of language both at a fine-grained level, such as letters or syllables, and at a broader level, such as a word or a sentence.
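As a rough illustration of what such a pretrained speech model supplies downstream, the sketch below loads a multilingual wav2vec 2.0-style checkpoint and extracts frame-level speech features from a short audio clip. The checkpoint name, file name and choice of library are assumptions made for illustration, not details confirmed by the study.

```python
# Minimal sketch: frame-level speech representations from a pretrained
# multilingual speech model. Checkpoint and file names are illustrative
# assumptions, not details taken from the study.
import torch
import torchaudio
from transformers import Wav2Vec2Model

speech_model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-xlsr-53")
speech_model.eval()

waveform, sr = torchaudio.load("story_excerpt.wav")                # hypothetical audio file
waveform = torchaudio.functional.resample(waveform, sr, 16_000)    # model expects 16 kHz
waveform = waveform.mean(dim=0, keepdim=True)                      # mix down to mono

# Normalize to zero mean / unit variance, as wav2vec 2.0 expects.
input_values = (waveform - waveform.mean()) / (waveform.std() + 1e-7)

with torch.no_grad():
    outputs = speech_model(input_values)

# One latent vector per ~20 ms frame of audio; features like these,
# rather than raw letters or words, are what get matched to brain activity.
speech_features = outputs.last_hidden_state   # shape: (1, n_frames, hidden_dim)
print(speech_features.shape)
```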

The scientists then applied an AI equipped with this language model to databases from four different universities containing the brain activity of 169 people. Participants in those databases had listened to excerpts from works such as Ernest Hemingway's The Old Man and the Sea and Lewis Carroll's Alice's Adventures in Wonderland while their brains were scanned by magnetoencephalography or electroencephalography, techniques that measure the magnetic or electrical components of brain signals.
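The sketch below shows one simple way such continuous recordings might be cut into short windows aligned with the audio the participants heard. The sampling rate, sensor count and windowing scheme are assumptions chosen to mirror the description above, not the study's actual preprocessing.

```python
# Minimal sketch of segmenting continuous MEG/EEG data into short windows
# aligned with audiobook playback. Shapes, rates and window length are
# assumptions for illustration.
import numpy as np

SFREQ = 120          # assumed brain-signal sampling rate (Hz)
WINDOW_SEC = 3.0     # the article describes three-second snippets

def segment_brain_recording(brain_signal: np.ndarray,
                            onset_times: list) -> np.ndarray:
    """Cut (n_sensors, n_samples) brain data into WINDOW_SEC-long windows
    starting at the given audio onset times (in seconds)."""
    n_samples = int(WINDOW_SEC * SFREQ)
    windows = []
    for onset in onset_times:
        start = int(onset * SFREQ)
        segment = brain_signal[:, start:start + n_samples]
        if segment.shape[1] == n_samples:      # drop incomplete windows at the end
            windows.append(segment)
    return np.stack(windows)                   # (n_windows, n_sensors, n_samples)

# Example with fake data: 273 sensors, 10 minutes of recording.
recording = np.random.randn(273, 600 * SFREQ)
onsets = [0.0, 3.0, 6.0, 9.0]                  # where each audio snippet began
segments = segment_brain_recording(recording, onsets)
print(segments.shape)                          # (4, 273, 360)
```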

The researchers then tried to decode what the participants had heard using just three seconds of brain activity data from each person, with the help of a computational method that accounts for physical differences among individual brains. The team instructed the AI to align the speech sounds from the story recordings with the patterns of brain activity that the AI computed as corresponding to what people were hearing. It then predicted what the person might have been hearing during that short window of time, choosing from more than 1,000 possibilities.
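The sketch below illustrates the general idea described in this paragraph: a brain encoder with a per-participant layer to absorb physical differences between brains, a contrastive objective that pulls matching brain and speech embeddings together, and decoding by ranking candidate speech segments. The architecture and numbers are simplified assumptions, not the model reported in the paper.

```python
# Sketch of the alignment idea: embed a 3-second brain window and candidate
# speech segments into a shared space, train contrastively so matching pairs
# end up close, then decode by ranking candidates. Simplified assumption of
# the approach, not the study's exact model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BrainEncoder(nn.Module):
    def __init__(self, n_subjects, n_sensors, n_samples, speech_dim):
        super().__init__()
        # One learned linear map per participant, to absorb differences
        # between individual brains and sensor layouts.
        self.subject_layers = nn.ModuleList(
            [nn.Linear(n_sensors, n_sensors) for _ in range(n_subjects)]
        )
        self.conv = nn.Sequential(
            nn.Conv1d(n_sensors, 128, kernel_size=5, padding=2),
            nn.GELU(),
            nn.Conv1d(128, 128, kernel_size=5, padding=2),
            nn.GELU(),
        )
        self.head = nn.Linear(128, speech_dim)

    def forward(self, x, subject_id):
        # x: (batch, n_sensors, n_samples)
        x = self.subject_layers[subject_id](x.transpose(1, 2)).transpose(1, 2)
        x = self.conv(x).mean(dim=-1)            # pool over time
        return F.normalize(self.head(x), dim=-1)

def contrastive_loss(brain_emb, speech_emb, temperature=0.07):
    # Matching brain/speech pairs sit on the diagonal of the similarity matrix.
    logits = brain_emb @ speech_emb.T / temperature
    targets = torch.arange(len(brain_emb))
    return F.cross_entropy(logits, targets)

def rank_candidates(brain_emb, candidate_embs, k=10):
    # Decode by picking the k candidate speech segments most similar to the
    # brain embedding, out of (say) more than a thousand possibilities.
    sims = candidate_embs @ brain_emb
    return torch.topk(sims, k).indices

# Tiny smoke test with random data (shapes are illustrative).
enc = BrainEncoder(n_subjects=169, n_sensors=273, n_samples=360, speech_dim=64)
brain = torch.randn(8, 273, 360)
speech = F.normalize(torch.randn(8, 64), dim=-1)
emb = enc(brain, subject_id=0)
print(contrastive_loss(emb, speech).item())
print(rank_candidates(emb[0], F.normalize(torch.randn(1000, 64), dim=-1)))
```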

With magnetoencephalography, also known as MEG, the correct answer was among the artificial intelligence's top 10 guesses up to 73 percent of the time, the researchers found. With electroencephalography, that figure dropped to no more than 30 percent. "[That MEG] performance is very good," says Di Liberto, but he is less confident about its practical use. "What are some possible uses for it? Nothing. Not even a single thing."
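For reference, the "top 10" figure corresponds to a standard top-k retrieval metric, sketched below with made-up numbers: a trial counts as correct when the true segment appears among the model's ten highest-ranked candidates.

```python
# Small sketch of the top-10 metric mentioned above; the data here are
# random and purely illustrative.
import numpy as np

def top_k_accuracy(similarity: np.ndarray, true_idx: np.ndarray, k: int = 10) -> float:
    """similarity: (n_trials, n_candidates) scores; true_idx: index of the
    correct candidate for each trial. Returns the fraction of trials where
    the correct candidate is ranked within the top k."""
    top_k = np.argsort(-similarity, axis=1)[:, :k]
    hits = (top_k == true_idx[:, None]).any(axis=1)
    return float(hits.mean())

# Fake example: 200 trials, each scored against 1,000 candidate segments.
rng = np.random.default_rng(0)
scores = rng.standard_normal((200, 1000))
truth = rng.integers(0, 1000, size=200)
print(top_k_accuracy(scores, truth, k=10))
```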

That's because MEG requires a cumbersome and expensive machine, he says. Bringing this technology to medical clinics will require technological advances that make the devices cheaper and easier to operate.

It is also important to understand what "decoding" means in this research to appreciate its significance, says Jonathan Brennan, a linguist at the University of Michigan in Ann Arbor. The word is often used to describe the process of extracting information directly from a source, in this case speech from brain activity. But the AI could do this only because it was given a limited list of possible answers to choose from when making its guesses.

That approach will not be enough if the goal is to scale the technology to practical use, Brennan says, because language is effectively infinite.

What's more, Di Liberto says, the AI decoded information from participants who were passively listening to audio, which is not directly relevant to nonverbal patients. For it to become a meaningful communication tool, scientists will need to learn how to decode from brain activity what these patients intend to say, including expressions of hunger or discomfort, or a simple "yes" or "no."

The current research focuses on "decoding of speech perception, not creation," King acknowledges. Although speech production is the ultimate goal, "we're quite a long way away" from achieving it.

CITATIONS

A. Défossez et al. Decoding speech from non-invasive brain recordings. arXiv:2208.12266. Posted August 25, 2022.
