When I moved to New York City from Toronto, in 1989, at age thirty, I was often interrupted by strangers who would say, with a knowing smile, “What part of Canada are you from?” It was invariably my “out” and “about” that gave me away. Americans hear them as oot and aboot—a gesture of the tongue and lips I learned in earliest infancy and which was duly hardwired into the motor nerves that control my vocal organs when I talk.
Because such aspects of voice are laid down during a critical period of brain development, “unlearning” them is extremely difficult—impossible for some. Today, after more than thirty years living in the United States, I have never lost my oot and aboot. It never fails to make me feel self-conscious about how my voice is telling strangers something intimate about me—and to make me wonder what stereotypes of “Canadianness” my listeners assign to me: excessive niceness, overweening liberalism, love of butter tarts. More seriously, I saw a sinister side to all this when our son returned home from his first day of kindergarten at his Manhattan public school in tears because his classmates had mocked him for saying “Sorry” with the low-back o vowel he had imprinted from his immigrant Canadian parents. Within the Darwinian arena of the schoolyard, our son’s vowels had put a target on his back, made him juicy prey for jungle gym predators.
This is one reason that I only semi-facetiously say that, to the many forms of prejudice we’ve all been alerted to (racism, sexism, ageism, lookism), we should add “voiceism,” a coinage that refers to the destructive stereotypes and preconceptions triggered by the way people sound when they speak. Not just accent but all the acoustic variables contained within the human vocal signal, including pitch, pace, volume, timbre, and texture. These sonic signatures make us draw assumptions (often wildly inaccurate) about a speaker’s geographic upbringing, intelligence, political affiliations, sexual orientation, level of education, and more.
George Bernard Shaw wrote an entire play, Pygmalion, about the way voices fractured Edwardian England: “It is impossible,” he said in the play’s preface, “for an Englishman to open his mouth without making some other Englishman hate or despise him.” But, as my and my son’s experiences show, voiceism is not confined to Shaw’s place and time. As a Torontonian who married a woman from the farm country of southern Alberta, I was quickly disabused of certain lazy assumptions I had unconsciously imbibed about the politics, opinions, attitudes, and values of people who speak in the strains of Canada’s rural west. No, they’re not all conservative! No, they are by no means unsophisticated “hayseeds”! And, having now spent more than half my life in the United States, I can tell you on good authority that Americans in 2021 are as prone as Shaw’s Edwardian Brits (or an Upper Canada College–educated Eastern Canadian) to forming snap judgments according to how their countrymen and -women sound.
To many above the Mason-Dixon line, someone pronouncing the i vowel as ah (“Ah lahk pah”) can instantly stigmatize the speaker as backward, undereducated, slow-witted, prejudiced, or intolerant. To Southerners, a Northerner’s habit of pronouncing i as two vowels, the diphthong uhh-ee (so that a New Yorker actually says “Uhh-ee luhh-eek puhh-ee”) establishes the speaker as an elitist PC-liberal snob. Midwesterners, with their hard r’s, nasality, and singsong prosody, are, to East Coasters and Southerners alike, bumpkins (Marge Gunderson in Fargo), while Californians, with their surfer-dude drawls and Valley-girl upspeak, are, to many in the rest of the country, hopeless flakes.
Assumptions about a speaker’s economic status are just as ingrained. We all know what Gatsby means when he says, of Daisy’s tinkling, musical speech, “Her voice is full of money.” This is to say nothing of the misogyny in critiques of women’s voices as excessively shrill. In reality, women speak roughly an octave higher than men, which hardly puts them in shrill territory. Then there is the avalanche of misogynistic disapproval that has met the spread of Kim Kardashian’s vocal fry among young middle-class women (even though studies show that men use vocal fry as much as, or more than, women do). There is also the casual homophobia that informs comedic imitations of a “gay” voice (drawling fahh-byuu-luhsss with a sharply sizzled sibilant), imitations that remain a standard trope on shows like Saturday Night Live.
Meanwhile, it’s hard to forget the grotesque racism triggered by the announcement, by linguists, that African American Vernacular English (sometimes called Black English or Ebonics) is a legitimate, rule-based dialect as expressive as any other form of English. When the Oakland, California, school board tried to use Black vernacular as a way to help small kids learn “standard” English, white and Black commentators alike were outraged, misapprehending it as a proposal to teach kids a nonstandard dialect in place of standard English. Meanwhile, media outlets derided the Black vernacular as “mumbo jumbo,” “mutant English,” “broken English,” and “ghettoese.” Right-wing political TV personality Tucker Carlson, then of CNN, now of Fox News, dismissed it as “a language where nobody knows how to conjugate the verbs.”
Little wonder that Olusoji Akomolafe, a political science professor at Norfolk State University, has asked if voice and accent are “the last acceptable form of prejudice.” The answer is yes—and not just in terms of accent or dialect, as US president Joe Biden recently acknowledged when discussing his lifelong stutter. “If you think about it,” he told a town hall audience, “stuttering is the only handicap people still laugh about.”
The truth is, voiceism is perhaps more pervasive than other forms of reflexive bigotry, which tend to demonize, demoralize, and delegitimize the less powerful. Voiceism flows up and down the social scale, as we see among the unemployed, opioid-addicted Rust Belt families in J. D. Vance’s Hillbilly Elegy, who justified their hatred and suspicion of then president Barack Obama for the very thing that the educated “coastal elites” loved him for: a voice that brimmed with inspiration and hope. To the downtrodden of Middletown, that voice was repellent, an embodiment of everything they lost or never had. “His accent—clean, perfect, neutral—is foreign,” Vance wrote of Obama.
But voiceism doesn’t just work across classes, colours, genders, and sexual orientations. It is also depressingly intramural. According to several African American linguists I spoke with, briefly code-switching from their “professorial” General American accent (the neutral accent required of TV anchor folk) into their childhood Black English acts like a verbal fist bump that establishes easy fellowship; those who won’t, or can’t, adopt the vernacular can engender mistrust and suspicion even among people who share the same skin colour. We can see evidence of voiceism in the gay community in David Thorpe’s documentary Do I Sound Gay?, in which he calls his more flamboyant-sounding friends “braying ninnies” while using an army of voice coaches to try to rid his speech of its gay features. And even female commentators levelled the charge of “shrillness” against Hillary Clinton.
Voiceism in the United States has arguably grown more acute at a time when Americans have endured four years under the leadership of a demagogue who used his own rageful, pugnacious voice to demonize whole sectors of society, to play up divisions and differences between men and women, Black and white, liberal and conservative, “red” and “blue” states, north, south, east, and west. At no time since perhaps the Civil War have Americans been more distrustful of one another, more ready to take offence at one another’s tone, more ready to jump down one another’s throats. That voice is one of the most conspicuous ways we signal individuality—and difference—makes voiceism a particularly potent evil in a time of division and discord.
Voiceism is all the more invidious (and ubiquitous) for being, in part, hardwired in our species: an evolutionary holdover of our having domesticated the chirps, barks, ribbits, and roars from which our unique human capacity for language emerged. All stimuli—visual, tactile, auditory—are first processed in our ancient limbic system, which parses them for threats. Before we hear what someone is saying, we react instinctively to how they say it. In 2015, psychologist Patricia Bestelmeyer of Bangor University used fMRI to show that activity in the hot-button amygdala determines whether a speaker is, according to their speech, a member of our “in-group” or a dangerous “other”—a way for early humans to tell if someone belonged to the same tribe, friend or foe.
Still, we should not use our animal ancestry as an excuse. If America hopes to heal itself—if it hopes to endure—then its people must strive to move past the differences they hear in one another’s voices. We may, as with other forms of bigotry, learn to call out friends when they deride another person for how they talk—or even when, in casual conversation, they slip into their “Asian” or “rapper” or “Mafioso” voice (that’s-a right-a!). Is this completely harmless, or does it reinforce stereotypes that encourage seeing others as alien, suspicious, unknowable? What for me is a relatively harmless bit of ribbing over my Canadian vowels was, for my son, something decidedly more upsetting, more painful. And, while we’re at it: can’t we let Stephen Colbert speak in his natural South Carolina accent, or Don Lemon in the strains of his native Baton Rouge? As with bias against visible ethnic features (skin colour, the epicanthic fold typical of Asian people’s upper eyelids), exposure is all.
In the meantime, let’s educate young people (and ourselves) about our voices and how they come to display the marvellous diversity that they do. Linguist John McWhorter has written about the instinctive way that speakers of standard English often wince at Black English’s tendency to reverse the consonant sounds in ask to make aks. But, as McWhorter points out, such sound-switching (linguists call it “metathesis”) is not rare in any tongue and is simply a happenstance of how a particular pronunciation gets passed down, from parents to children, during the critical period in infancy when speech sounds get hardwired into the motor circuitry. Through accidents now lost to history, aks came down through many African American communities—and, while it may not be the “King’s English” today, it was (McWhorter points out) Chaucer’s. “Men aksed him, what should bifalle,” says the Miller in The Canterbury Tales, and “Yow lovers axe I now this questioun.”
So, next time you catch yourself, in a spasm of voiceism, wondering why someone insists on saying aks, tell yourself that, first of all, the speaker has as much control over aks as you do over the pronunciations you learned in the cradle, and that, in any case, the speaker is merely emulating one of the greatest writers of English next to Shakespeare. And, next time you find yourself pitying, or laughing at, a stutterer, tell yourself, “She may be the next president of the United States.” Through such small acts of patience and tolerance, we may, as a society, begin to transcend the natural voiceism that is an unfortunate inheritance of our species.
Adapted from This Is the Voice by John Colapinto ©2021. Reprinted by permission of Simon & Schuster.