From reinforcing entrenched gender roles to potentially even fuelling misogyny, choosing the right voice for a particular task can be a minefield.
James Bond flings open the door of his new BMW – which comes with hidden machine guns as standard – and immediately a feminine computerised voice announces, "Welcome! Please fasten seatbelt and obey all instructions for a safe trip."
Bond’s MI6 colleague and master of gadgets, Q, pipes up to explain: "Thought you’d pay more attention to a female voice." But, predictably, Bond later ignores repeated commands to wear his seatbelt and, using his mobile phone as a remote control, he subsequently drives the car off the top of a multi-storey car park in the 1997 blockbuster Tomorrow Never Dies.
Q's market research was wrong – and so was BMW’s. The firm famously recalled a feminine-voiced GPS system from its cars when German drivers complained that they didn't want to take instructions "from a woman".
But why do German men, British secret agents, or anyone else so often refuse to follow directions delivered in feminine tones? Today, navigation systems with feminine voices are actually quite common. But multiple studies suggest that digital voices continue to reinforce deeply problematic gender stereotypes. Think of the smart speaker with a feminine voice that politely does your bidding versus the masculine-sounding recorded message that takes charge and orders you to stay clear of a reversing truck.
Gender bias is rife in artificial intelligence (AI) systems, according to a widely discussed 2019 report from Unesco. The report’s title, "I'd blush if I could," refers to the response that Apple’s voice assistant Siri used to give when people remarked to it, "Hey Siri, you're a b****."
While improvements have been made to AI voice systems since then, many argue there is still a way to go. So how did gender bias get so deeply embedded in these systems in the first place – and how do we go about getting rid of it?
The history of digital voices, and how we have used and abused them, doesn’t make for easy reading. Take the computer systems in aircraft that talk to pilots and provide information or warnings. One such system, which used voice recordings made by the singer and actor Joan Elms, was dubbed "Sexy Sally". A more recent system, originally featuring the voice of actor Kim Crow, was informally named "B******* Betty". And in the UK, the term "Nagging Nora" is sometimes used.
Similarly, staff on the London Underground are reported to refer to an automated announcement system as "Sonya" because it "gets onya nerves".
A masculine equivalent of aircraft voice systems is called "Barking Bob" by some pilots – though, noticeably, that phrase doesn't connote the same gender-based prejudice as the other epithets.
It's not just whether or not people have accepted feminine voices in certain roles, it's also how developers have designed synthetic voices in the past to perform those roles that’s an issue, says Verena Rieser at Heriot-Watt University. Voice assistants have sometimes been incapable of recognising and challenging inappropriate behaviour.
"These systems are gendered and anthropomorphised," she explains. "There is basically a reinforcement cycle here."
The default voice of assistants such as Siri or Alexa was always feminine in the past, though in recent years Apple and Amazon have made other options available. Despite this, you might find feminine synthesised voices to be more common than masculine ones. But why? It's partly down to the fact that companies spent decades acquiring many more recordings of women’s voices than of men's. This mass of data has influenced subsequent technologies, including AI.
Women have operated telephone exchanges and loaned their voices to lots of pre-digital message systems, meaning that a feminine voice is what many have come to expect from helpful, compliant technologies.
Research indicates that this fits with our misogynistic expectations of what tasks are "suitable" for women versus men. And yet other work suggests that there is actually little or no practical reason to prioritise feminine voices over masculine ones for certain applications – the two are more or less equally intelligible and both are capable of delivering information effectively.
Despite that, instead of promoting gender equality, voice assistants have often done quite the opposite. Journalist Leah Fessler tested virtual voice assistants’ responses to sexual harassment back in 2017 and found multiple issues. When told "You're hot", Amazon’s Alexa replied obsequiously, "That's nice of you to say". To the remark, "You're a slut", Microsoft’s Cortana delivered a web search result with an article entitled: "30 signs you're a slut".
In 2020, researchers from the Brookings Institution re-tested these interactions and found that they had improved somewhat. The voice assistants were more likely to push back against abuse than before, if not always very clearly.
Big tech companies have also diversified the voices that users can select for their virtual assistants. There are more masculine options than before and Apple, for example, no longer pre-selects a feminine voice as the default for Siri.
But researchers who study gender argue that simply offering masculine voices as an alternative, and tweaking assistants' responses to inappropriate language, still leaves us far from resolving the wider problem. A lack of diversity and sophistication remains in these disembodied voices, they say, not least because of the identities that virtual assistant systems often leave out. Some people identify as non-binary or gender fluid and it is increasingly accepted that gender identities are the sum of many factors, including social and cultural influences.
Can a digital voice capture this? The answer is "maybe", partly because it’s difficult to synthesise an adult human voice that sounds anything other than either masculine or feminine. Plus, although we tend to think of masculine voices as deeper than feminine voices, this is not always true, says Selina Sutton at Northumbria University.
"There's this middle range of pitches, fundamental frequencies, that’s the same for men and women," she explains.
Various projects have attempted to synthesise "gender-neutral" voices with varying success. Consulting firm Accenture produced an experimental gender-neutral voice, though it may simply sound masculine or feminine depending on how the listener perceives it.
In 2019, a team of designers and researchers came up with a project called Q (no connection to James Bond), which was billed as a "genderless voice". Co-creator Ryan Sherman, who now works for design lab Space10, says he was inspired to develop Q after noticing the subservient feminine voices that often characterised virtual assistants – even when they were faced with aggression from human users.
"It's usually in a way that’s submissive and reinforcing this idea that women are available to help at the touch of a button," he says.
Although Q is yet to be developed into a fully working synthetic voice system, the demo created by Sherman and his collaborators illustrated what it might sound like. They recorded the voices of people who identify as non-binary and adjusted the pitch to between 145 Hz and 175 Hz, which straddles many feminine and masculine human voices.
The result was played to 4,500 different people, Sherman says, and while some thought it sounded feminine and others perceived it as masculine, many judged it to be neither. Given this mix of responses, Sutton says Q is best described as "gender ambiguous" rather than neutral.
While a greater diversity of voices in computer systems could help move technologies away from stereotypical feminine performances, just because a digital voice's gender isn't obvious doesn't mean it can’t also be sexist, notes Sutton. Much depends on what the voice says and the functions it performs.
Designers of synthetic voices could offer even more outlandish options – voices that sound like weird cartoon characters or non-human animals. Mark West, lead author of the Unesco report, suggested this approach last year.
"A way out of this conundrum is to project voice assistants and other AI applications as non-humans – a sort of 'let’s keep AI artificial' ethic," he said, noting that "non-human" sounding voices can be designed to be pleasant and intelligible.
But Sutton points out that people are now highly accustomed to anthropomorphised technologies.
"We are so conditioned to hear certain types of voices in these devices, moving even a little bit away from that can be difficult for users," she says.
Clearly, part of the problem here is that gender bias and prejudice exist across society and synthetic voices – like any cultural artefact – run the risk of reflecting that. Redesigning Siri’s voice is still a worthwhile exercise, but you can’t get rid of gender biases that way alone. It won’t reverse people’s misogynistic attitudes overnight or suddenly equalise the number of women and men working in the AI industry.
There's another point to be made here. And that’s that virtual assistants – by definition – will always be subservient entities. They are more or less digital servants, after all, so how could we ever speak to them on a level playing field? That’s really what we would need in order to get away from all the awkward power dynamics and problematic, domineering behaviour that we currently throw their way.
"If they're primarily designed to help people search and shop, how far can you go with meaningful representations and relationships?" asks Charlotte Webb, co-founder of Feminist Internet.
In the near future, we are likely to encounter even more technologies that speak to us. Webb says she is concerned about how voice assistants could continue to perpetuate gender stereotypes once they embody avatars in the "metaverse" – virtual reality spaces. People have already been accused of sexually harassing others in the metaverse. Will virtual assistants serve to, inadvertently or not, enable and encourage such behaviour?
The history of synthetic voices, and our attitudes towards them, may have perpetuated and even deepened gender biases – like a feedback loop amplifying some of our worst intentions. And yet awareness of such issues has rocketed in recent years, with investigations into the output of AI technologies and changes in social attitudes thanks to the #MeToo movement and similar campaigns.
You could argue that’s the crux of all this, in the end. A more enlightened approach to one another and the wonderful array of human identities that exists in the world begins with us, not a database. In order to vanquish gender-based prejudices, we can’t just update the software. Or return the car to the manufacturer.
"I certainly don’t see a technological solution to it," says Webb. "I think it's a human problem."