
Breakthrough Brain-to-Voice Device Streams Speech in Near Real Time

A woman with severe paralysis sits in a neuroscience lab wearing a non-invasive brain-computer interface cap. Electrodes are visible on her head, connected to a computer that displays real-time neural activity and a speech waveform labeled "Hello." A researcher stands beside her, monitoring the decoded signals on screen. The lab is clean and clinical, with an atmosphere of optimism and innovation. The scene highlights the use of AI and brain decoding to restore naturalistic speech in near real time.

Image Source: ChatGPT-4o


In a major leap for brain-computer interfaces, researchers from UC Berkeley and UC San Francisco have developed an AI-powered neuroprosthesis that translates brain signals into speech in near real time—offering new hope for people with severe paralysis.

The study, published in Nature Neuroscience, marks a critical advance in restoring communication for individuals who have lost the ability to speak. Unlike past methods, which suffered from long delays and limited fluency, this new system can stream intelligible speech from the brain with less than a one-second delay—matching the responsiveness of voice assistants like Alexa or Siri.

“Our streaming approach brings the same rapid speech decoding capacity of devices like Alexa and Siri to neuroprostheses,” said lead researcher Gopala Anumanchipalli of UC Berkeley. “Using a similar type of algorithm, we found that we could decode neural data and, for the first time, enable near-synchronous voice streaming. The result is more naturalistic, fluent speech synthesis.”

“This new technology has tremendous potential for improving quality of life for people living with severe paralysis affecting speech,” said UCSF neurosurgeon Edward Chang, senior co-principal investigator of the study. “It is exciting that the latest AI advances are greatly accelerating BCIs for practical real-world use in the near future,” he said.

How It Works

The neuroprosthesis collects neural data from the motor cortex, the region responsible for speech articulation, and decodes it into sound using AI. The team’s novel approach uses pretrained text-to-speech models and the participant’s pre-injury voice recordings to synthesize speech that’s both accurate and personal.

“We are essentially intercepting signals where the thought is translated into articulation and in the middle of that motor control,” said Cheol Jun Cho, co-lead author of the study. “So what we’re decoding is after a thought has happened, after we’ve decided what to say, after we’ve decided what words to use and how to move our vocal-tract muscles.”

The process involves three steps (a simplified code sketch follows the list):

  • Sampling brain activity as the user silently attempts to speak

  • Mapping neural signals to spoken phrases using AI

  • Streaming the synthesized voice in near real time, improving responsiveness and naturalism
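For readers who want a concrete picture of that pipeline, the minimal Python sketch below mirrors the three steps above. Every function name, channel count, and window size in it is an illustrative placeholder, not a value or model reported in the study.

```python
import numpy as np

# All names, dimensions, and window sizes here are assumptions for illustration,
# not values reported in the Nature Neuroscience study.
SAMPLE_RATE = 16_000    # audio samples per second
WINDOW_MS = 80          # hypothetical decoding hop
N_CHANNELS = 256        # hypothetical electrode-channel count

def read_neural_window(step):
    """Stand-in for one window of motor-cortex activity (step 1: sample brain signals)."""
    rng = np.random.default_rng(step)
    return rng.standard_normal(N_CHANNELS)

def decode_to_speech_units(features):
    """Placeholder for the trained AI decoder (step 2: map neural signals to speech)."""
    return float(features.mean())

def synthesize_personal_voice(units):
    """Placeholder for a pretrained text-to-speech model conditioned on the
    participant's pre-injury voice recordings (step 3: stream her own voice)."""
    n_samples = int(SAMPLE_RATE * WINDOW_MS / 1000)
    return np.full(n_samples, units, dtype=np.float32)

def stream_speech(num_windows=10):
    """Emit audio window by window instead of waiting for a whole sentence."""
    for step in range(num_windows):
        features = read_neural_window(step)
        units = decode_to_speech_units(features)
        yield synthesize_personal_voice(units)

if __name__ == "__main__":
    for chunk in stream_speech():
        print(f"streamed {chunk.size} audio samples")
```

The key design point is that audio leaves the system one short window at a time, which is what makes the near real-time streaming described above possible.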

Speed Without Sacrificing Accuracy

In the team’s earlier 2023 study, the system required about 8 seconds to generate a single sentence, a delay that made natural communication difficult. The new streaming model overcomes this with near real-time output, producing speech within one second of the brain signaling an intent to speak.

Researchers achieved this by using speech detection methods to pinpoint when the brain initiates a speech attempt, enabling the model to decode speech continuously as the subject “speaks” internally—without needing to stop between phrases.

“We can see relative to that intent signal, within 1 second, we are getting the first sound out,” said Anumanchipalli. “And the device can continuously decode speech, so Ann can keep speaking without interruption.”
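A rough sketch of how such a detect-then-stream loop can be organized is shown below; the onset detector, window length, and decoder are hypothetical stand-ins rather than the study’s actual models.

```python
import numpy as np

WINDOW_S = 0.08  # hypothetical 80 ms decoding hop

def detect_speech_attempt(window):
    """Hypothetical speech-detection step: flags the moment a speech attempt begins."""
    return float(np.abs(window).mean()) > 0.5  # placeholder threshold

def decode_window(window):
    """Placeholder for the streaming decoder that turns one neural window into audio."""
    return np.zeros(int(16_000 * WINDOW_S), dtype=np.float32)

def run_streaming_decoder(neural_windows):
    """Start decoding as soon as intent is detected and keep going, window by window."""
    intent_time = None
    for i, window in enumerate(neural_windows):
        now = i * WINDOW_S
        if intent_time is None:
            if detect_speech_attempt(window):
                intent_time = now  # decoding begins here, not at the end of the sentence
            continue
        audio = decode_window(window)
        latency = now - intent_time
        print(f"t={now:.2f}s: emitted {audio.size} samples ({latency:.2f}s after detected intent)")

# Toy input: a few quiet windows followed by windows simulating a speech attempt.
windows = [np.zeros(16)] * 5 + [np.ones(16)] * 10
run_streaming_decoder(windows)
```

Because decoding is keyed to the detected intent rather than to the end of a sentence, the first audio can be emitted well under a second after the attempt begins, in the spirit of the result described above.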

Importantly, this boost in speed did not reduce accuracy. The real-time streaming approach matched the precision of earlier, slower systems.

“Previously, it was not known if intelligible speech could be streamed from the brain in real time,” said co-lead author Kaylo Littlejohn. “That’s promising to see.”

To test whether the model had truly learned general speech patterns—not just memorized data—the team introduced unseen words from the NATO phonetic alphabet like “Alpha,” “Bravo,” and “Charlie.” The model successfully synthesized these, indicating it had learned the building blocks of speech.

“We wanted to see if we could generalize to the unseen words,” said Anumanchipalli. “We found that our model does this well.”

Ann, the participant in both the 2023 and current studies, also shared feedback: the streaming approach felt more natural and under her control, and hearing her own voice again in near real time strengthened her sense of identity. “Hearing her own voice in near-real time increased her sense of embodiment,” said Anumanchipalli.

Broad Device Compatibility

While the primary study used high-density electrode arrays placed directly on the brain, the system was also tested successfully with:

  • Microelectrode arrays (MEAs) that penetrate brain tissue

  • Surface electromyography (sEMG) sensors on the face to measure muscle activity non-invasively

“The same algorithm can be used across different modalities provided a good signal is there,” said Littlejohn.
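One way to picture that modality-agnostic design is the short sketch below, in which the same decoding step consumes features from either an implanted array or non-invasive sEMG sensors. The stream names and feature sizes are assumptions for illustration only.

```python
from typing import Iterator
import numpy as np

def ecog_stream() -> Iterator[np.ndarray]:
    """Illustrative features from a high-density electrode array on the brain's surface."""
    while True:
        yield np.random.standard_normal(256)

def semg_stream() -> Iterator[np.ndarray]:
    """Illustrative features from non-invasive facial muscle (sEMG) sensors."""
    while True:
        yield np.random.standard_normal(32)

def decode(features: np.ndarray) -> float:
    """One shared decoding step, agnostic to which sensor produced the features."""
    return float(features.mean())

def run(stream: Iterator[np.ndarray], steps: int = 3) -> None:
    for _ in range(steps):
        print(decode(next(stream)))

run(ecog_stream())   # same decoder ...
run(semg_stream())   # ... different recording modality
```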

Looking Ahead

Researchers see this breakthrough as a foundational step toward expressive, naturalistic speech prostheses. Future goals include adding paralinguistic features such as pitch, tone, and emotional inflection to better mirror human expression.

“This is a longstanding problem even in classical audio synthesis fields,” Littlejohn noted, “and would bridge the gap to full and complete naturalism.”

The study was supported by the NIH’s National Institute on Deafness and Other Communication Disorders, as well as international and philanthropic partners. The full study is available in Nature Neuroscience.

What This Means

This work moves brain-computer interfaces closer to practical, real-world use—transforming how people with severe paralysis might communicate in the near future. By shrinking latency, personalizing voice output, and expanding compatibility with multiple input devices, the research team has laid the foundation for neuroprosthetic speech that feels fluent, expressive, and embodied.

Rather than decoding what a person wants to say from thoughts alone, this system taps into the motor commands behind speech, offering a more reliable and trainable pathway to voice restoration. It's a significant step toward restoring human connection through AI and neural engineering.

Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.