You order a cappuccino, but you get a macchiato. You get to your stop, but the bus driver doesn’t register your frantic call to open the back door above the din of other commuters. You ask Siri for the time and she replies “OK, calling grandma.” These blips in communication are at best an inconvenience. At worst, and particularly for people with speech issues, they can seriously hamper everyday life.
For people who have limited or no verbal communication abilities, being misunderstood or just outright ignored can be a daily experience that affects both their quality of life and their ability to move through the world.
Millions of people have conditions that affect speech: ALS, cerebral palsy, and Parkinson’s disease all change how a person speaks. So can stroke, autism, and other neurological conditions. But just as these conditions can make it harder for other people to understand what a speaker wants to communicate, the near-ubiquitous voice-controlled technologies we have welcomed into our lives (smartphones, Google Homes, Alexas, and so on) also tend not to catch their drift.
In fact, Google’s speech recognition software serves these users so poorly that the company is trying to fundamentally change how computers hear humans with varying verbal abilities, through an effort called “Project Relate.”
Aubrie Lee is a brand manager at Google whose speech is affected by muscular dystrophy. In a post published by Google, she says Project Relate could mean the difference between feeling disconnected and feeling welcomed.
“I’m used to the look on people’s faces when they can’t understand what I’ve said,” Lee says. “Project Relate can make the difference between a look of confusion and a friendly laugh of recognition.”
What is Project Relate?
Project Relate is an Android app built on top of Google’s existing speech-recognition software. Its goal is to learn how to understand and translate atypical human speech patterns.
Julie Cattiau is the project manager. She tells Inverse it all got started when programmers recognized a flaw in Google’s current speech recognition technology.
“We realized that Google’s speech recognition could be improved for people whose speech was impacted by a medical condition,” Cattiau says.
Over more than two years of development, the team has collaborated with speech and language pathologists and amassed a training dataset of more than a million speech samples from more than 1,000 people.
There are other apps trying to do much the same thing, like Voiceitt and APP2Speak. But the universality of Google products could mean Project Relate has an outsize influence on how computers hear, understand, and act on what we tell them verbally.
Where speech recognition fails
Speaking into a pinhole microphone in our phones and having a disembodied voice respond intelligently feels extremely removed from reality. But Cattiau explains speech recognition software like Google’s is built by feeding an algorithm millions of real-life speech samples.
Like a student cramming for a test, these algorithms learn the patterns in each speech clip, such as where the hard consonants hit and the sentence structure of repeated queries, and then apply that knowledge to new input. In A.I., the learning step is known as training, and checking the results against clips the algorithm has not yet heard is known as validation.
The problem with a lot of our technology isn’t necessarily the tech itself but the humans behind it. In the case of speech recognition, the data given to these algorithms skew drastically toward clips of accent-free, unimpaired speech.
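To make the effect of skewed training data concrete, here is a minimal Python sketch. It is an illustration only, not how Google's software works: it assumes each clip has been boiled down to a single made-up number, and it uses a simple nearest-average classifier. The data, the "yes"/"no" task, and every function name are invented for this example.

```python
from statistics import mean

# Each clip is (feature, label). The feature is a hypothetical single
# number per clip; real systems use far richer audio features.
typical_train = [(95, "yes"), (100, "yes"), (105, "yes"),
                 (195, "no"), (200, "no"), (205, "no")]
atypical_train = [(180, "yes")]  # only one atypical clip: the skew

# Validation clips are held back and never used for training.
typical_val = [(98, "yes"), (202, "no")]
atypical_val = [(175, "yes"), (185, "yes"), (295, "no"), (305, "no")]


def train(clips):
    """Training: average the feature values seen for each label."""
    by_label = {}
    for feature, label in clips:
        by_label.setdefault(label, []).append(feature)
    return {label: mean(values) for label, values in by_label.items()}


def predict(model, feature):
    """Pick the label whose average is closest to the new clip."""
    return min(model, key=lambda label: abs(model[label] - feature))


def accuracy(model, clips):
    """Validation: fraction of held-back clips labeled correctly."""
    correct = sum(predict(model, f) == label for f, label in clips)
    return correct / len(clips)


model = train(typical_train + atypical_train)
print("typical speakers: ", accuracy(model, typical_val))
print("atypical speakers:", accuracy(model, atypical_val))
```

In this toy setup the model scores 1.0 on the typical validation clips but only 0.5 on the atypical ones, because the averages it learned are dominated by the typical speakers.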
How computers hear us
Instead of generalizing from many different people’s speech samples, Cattiau says Project Relate, which is looking for new beta testers, works by personalizing its speech recognition to the individual.
“The app is designed so that each individual starts first by recording some phrases and then we basically use those recordings to fully personalize the speech recognition experience,” Cattiau says. “We currently require 500 examples to be read [which] can take between 30 and 60 minutes.”
These speech samples might include common queries people ask Google Assistant, or a user can personalize the app’s recognition software by training it on phrases that are particularly useful for their life. Once you’ve trained the app, however, it doesn’t continue to learn from your day-to-day interactions, Cattiau says.
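The personalization Cattiau describes, in which a user records phrases once and the app does not keep learning afterward, can be sketched in a few lines of Python. This is a toy under stated assumptions, not Project Relate's implementation: each recording is reduced to one invented number, the phrases are made up, and `enroll` and `recognize` are hypothetical names.

```python
from statistics import mean
from types import MappingProxyType


def enroll(recordings):
    """Build a per-user model from (feature, phrase) pairs recorded once."""
    by_phrase = {}
    for feature, phrase in recordings:
        by_phrase.setdefault(phrase, []).append(feature)
    averages = {phrase: mean(values) for phrase, values in by_phrase.items()}
    # Return a read-only view so the model stays fixed after enrollment,
    # mirroring the "no further learning" behavior described in the text.
    return MappingProxyType(averages)


def recognize(model, feature):
    """Return the enrolled phrase closest to the new utterance."""
    return min(model, key=lambda phrase: abs(model[phrase] - feature))


# Hypothetical enrollment session: this user's own recordings only.
recordings = [(178, "lights on"), (182, "lights on"),
              (298, "lock the door"), (302, "lock the door")]
personal_model = enroll(recordings)

print(recognize(personal_model, 185))
print(recognize(personal_model, 290))
```

Because the model is built only from one person's recordings, it is tuned to how that person actually speaks, and later calls to `recognize` read the model without changing it.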
Project Relate is designed for three major use cases:
- Listen, which will transcribe your speech to text in real time
- Repeat, which will repeat what you’ve said using a “clear, synthesized voice”
- Assistant, which allows you to speak directly to Google Assistant, for example, to control Google Nest devices
The latter two features may offer users an increased measure of independence, Cattiau says.
“We have a trusted tester called Andrea who’s had ALS for I think six or seven years now,” she says.
“We actually went to her house and we equipped [it] with smart lightbulbs, smart thermostats, a smart lock on her door, and then basically gave her the Relate app so she can do some tasks around the house without having to stand up. She uses a walker to walk around the house, so she would rather not have to move too much.”
“We’re getting positive feedback from her and from her husband [that] it’s been very helpful for her,” she says.
But Project Relate isn’t a silver bullet. How well the app works depends on how atypical a person’s speech patterns are. It is also currently being tested only with English speakers whose speech is otherwise unaccented, so the accent problem still looms large.
The universality of Google products is a plus for Project Relate’s goals, but equipping a smart home with Google products is expensive and difficult, which leads to a different accessibility problem. How Google might innovate its way out of the limits of its own product ecosystem remains to be seen.