Have you ever been misunderstood while speaking to a voice interface because of your accent? How did these subtle linguistic boundaries get encoded in speech recognition technology?
Accent bias is, unfortunately, one of the last socially acceptable forms of prejudice. Its influence is woven into the fabric of voice interfaces, and its potential for sociotechnical harm grows as use cases such as emotion and sentiment detection are pursued.
By asking ourselves “Whose English gets to be the default?” we challenge centuries-old ideas about “good” speech and confront problematic beliefs, such as the notion that character or nationality can be deduced from a person’s voice.
We close by asking whether we can make space for polyvocal futures, in which the vast array of accents, sociolects, and dialects present in spoken English around the world is treated more equitably by our voice interfaces.