As we all know, it’s almost impossible to talk to a tiny little baby, or an adorable puppy, and remain in a normal vocal register, or speak in complete sentences. In the presence of chubby little cheeks, big doe eyes, and fat little legs, all but the most cynical succumb to a rapid-fire, swooping cadence, bending down to ask rhetorical questions like “Who’s the cutest little baby in the whole world?” in a lilt seemingly inspired by a lungful of helium.
The practice is so instinctual, so reflexive, there’s even a name for it: researchers call it “infant-directed speech” (IDS), or, in common parlance, “motherese.” And experts at Princeton University may have just proven that aspects of this kind of speech are universal, regardless of language or culture.
In a study published October 12 in the journal Current Biology, researchers observed and recorded 24 mothers as they switched between talking with their babies and talking with adults at the Princeton Baby Lab. They then used algorithms to establish a vocal fingerprint for each speaker, and discovered an uncannily consistent shift in timbre as the mothers switched from talking to adults to talking to their babies and vice versa.
Elise Piazza, a postdoctoral research associate with the Princeton Neuroscience Institute, and the lead author of the paper, describes timbre as “the quality of sound.” In a press release published with the study, Piazza explained timbre with a musical metaphor: “Barry White’s silky voice sounds different from Tom Waits’ gravelly one—even if they’re both singing the same note.”
The shift in timbre Piazza and her team found among mothers was so uniform, regardless of whether they spoke English or any of the nine other languages analyzed during the course of the study, that the algorithms were able to distinguish adult-directed speech from baby talk given just one second of sample audio.
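The idea of a classifier telling two speech modes apart from a "vocal fingerprint" can be illustrated with a toy sketch. Everything below is hypothetical: the synthetic feature vectors stand in for per-clip timbre summaries (in real work these would be extracted from audio, e.g. as averaged spectral envelopes), and the nearest-centroid rule is a deliberately simple stand-in for whatever classifier the study actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_clips(center, n=50, noise=0.3):
    """Generate synthetic per-clip 'timbre' feature vectors around a class center."""
    return center + noise * rng.standard_normal((n, center.size))

# Two made-up "fingerprint" centers: adult-directed vs. infant-directed speech.
adult_center = np.array([1.0, 0.2, -0.5, 0.8])
infant_center = np.array([0.2, 1.0, 0.5, -0.3])

adult_clips = make_clips(adult_center)
infant_clips = make_clips(infant_center)

# "Fit": compute one mean fingerprint (centroid) per speech mode.
centroids = np.stack([adult_clips.mean(axis=0), infant_clips.mean(axis=0)])
labels = ["adult-directed", "infant-directed"]

def classify(clip_features):
    """Assign a clip to whichever mode's centroid is nearest."""
    dists = np.linalg.norm(centroids - clip_features, axis=1)
    return labels[int(np.argmin(dists))]
```

A one-second clip, reduced to such a feature vector, would then be labeled by a single call like `classify(features)`; the point is only that once timbre is summarized numerically, very little audio is needed to place it on one side of the divide.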
Beyond providing linguistic evidence of the universal bond of motherhood, Piazza and her team may have created new metrics for speech analysis. To start, as Piazza noted, the “findings could enable speech recognition software to rapidly identify this speech mode across languages.” More broadly, algorithmically isolating this previously unquantified aspect of speech could open up other applications. As Piazza put it, the work “invites future explorations of how speakers adjust their timbre to accommodate a wide variety of audiences, such as political constituents, students and romantic partners.”