With new research, your smart speaker might better understand African American English
NPR’s Ayesha Rascoe speaks to Howard University professor Gloria Washington about a new project that will make it easier for Black people to be understood by automatic speech recognition technology.
AYESHA RASCOE, HOST:
So something I’ve noticed is that when I try to talk to my smart speaker, Alexa doesn’t always get what I’m trying to say. And don’t get me started on those apps that transcribe interviews. So why am I having all these issues? Well, it might be because I talk the way I do. Research has shown that a lot of automatic speech recognition technology, or ASR, doesn’t work well for underrepresented accent groups, like the Black community. Howard University wants to change that. Along with Google, which we should note is a financial supporter of NPR, they are building an African American English speech dataset that will then be available to others looking to improve speech technology. Gloria Washington is an associate professor of computer science at Howard University and the principal investigator for the project, called Elevate Black Voices. Welcome to the program.
GLORIA WASHINGTON: Thank you so much for having me here today.
RASCOE: I know firsthand the challenges of using the automatic speech recognition technologies, but what are you hearing from others in the Black community about their experiences using this type of tech?
WASHINGTON: Really, in a nutshell, it just boils down to what you just said. Across the board, most of these tools have problems understanding Black people when we’re just being our normal selves, living our normal lives, trying to use the technology. It creates, to me, this imposter syndrome that can live and breathe in the Black community and HBCUs. Like, you can use the tech sometimes but not really all the time, and it’s not really there for you to be comfortable with.
RASCOE: So how will this new project address these issues? Is it dialect? Is that the issue? And how does this project address that?
WASHINGTON: It really has to do with the data that automatic speech recognition and these voice assistants are trained off of. So what we want to do is have and collect different data across the United States of different kinds and techniques of the way that Black people speak naturally. So we’re in the DMV. And, of course, we had an event here where we got people just to talk about the uniqueness of the D.C. slang. And then we’re also going to Atlanta. We’re going to the Deep South, Alabama and also Houston area so we can collect some audio segments so that we can use them later on down the line.
RASCOE: This is, like, biometric data, right? And, you know, this can be even more sensitive among marginalized communities. So how are you tackling that part, where people may feel like, I don’t know if I want them to have my voice? What are they going to do with it?
WASHINGTON: Definitely. We created a set of guidelines starting out with that Google agreed to adhere to. And these guidelines just say that basically, Google can’t do anything to go out and find these individuals who provided their data. And any other tech company who decides to utilize the dataset, they cannot go out and further marginalize the people who provided their data. For us, we’re thinking of this entire thing as a collaboration with Google and that eventually, the data will live and breathe, and there’ll be a consortium of HBCUs that protect and celebrate it.
RASCOE: What do you hope this project will achieve, beyond being able to dictate a text message or ask, you know, Siri to look something up for you? What do you hope it will achieve?
WASHINGTON: I hope that we at Howard and all these HBCUs that truly care about African American English and Black people in general will create technology that we can be comfortable just being ourselves. An example is, like, maybe a version of a Siri or Alexa that has the ability to code switch, where you can allow it to speak naturally African American English to you, and you can be comfortable interacting with it. That’s my longer-term goal, and I do hope that my students will feel empowered to create these many cool tools. And if they have that mentality, that they can go in and utilize our voice and our unique structures for something cool, I am all for it.
RASCOE: That’s Gloria Washington. She is an associate professor of computer science at Howard University. Thank you so much.
WASHINGTON: Thank you for having me. Thank you.
Copyright © 2023 NPR. All rights reserved. Visit our website terms of use and permissions pages at www.npr.org for further information.