Cognitive Scientists Find Links Between Jazz, Speech and Whale Songs

October 11, 2017
From left to right, Professor Chris Kello, graduate student Butovens Médé and Professor Ramesh Balasubramaniam worked together to analyze hundreds of sounds.

Jazz musicians trading riffs, people talking and pods of killer whales all carry on interactive conversations that are remarkably similar, new research reveals.

Cognitive science researchers at UC Merced have developed a new method for analyzing and comparing the sounds of speech, music and complex animal vocalizations like whale song and bird song. The paper detailing their findings is being published today in the Journal of the Royal Society Interface.

Their method is based on the idea that these sounds are complex because they have multiple layers of structure. Every language, for instance, has individual sounds, roughly corresponding to letters, that combine to form syllables, words, phrases, sentences and so on. It’s a hierarchy that everyone understands intuitively. Musical compositions have their own temporal hierarchies, but until now there hasn’t been a way to directly compare the hierarchies of speech and music, or to test whether similar hierarchies might exist in bird song and whale song.

“Playing jazz music has been likened to a conversation among musicians, and killer whales are highly social creatures who vocalize as if they are talking to each other. But does jazz music really sound like a conversation, and do killer whales really sound like they are talking?” asked lead researcher and UC Merced Professor Chris Kello. “We know killer whales are highly social and intelligent, but it’s hard to tell that they are interacting when you listen to recordings of them. Our method shows how much their sound patterns are like people talking, but not like other, less social whales or birds.”

The researchers devised a way to measure and compare sound recordings by converting them into “barcodes” that capture clusters of sound energy, and clusters of clusters, across the levels of a hierarchy. These barcodes allowed them to directly compare temporal hierarchies in more than 200 recordings: speech of different kinds in six languages, popular and classical music of various genres, songs from four species of birds and whales, and even thunderstorms.
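The paper describes the full procedure; as a rough illustration of the general idea only, the minimal Python sketch below marks bursts of sound energy in a recording and then merges them into clusters, and clusters of clusters, at successively longer timescales. The function names, the RMS-threshold event detector and the particular gap sizes are all illustrative assumptions for this sketch, not the authors’ actual implementation.

```python
# A minimal, illustrative sketch -- NOT the published method.
# Assumptions (ours, not the paper's): sound-energy "events" are short
# frames whose RMS amplitude exceeds a simple threshold, and each barcode
# level merges events (or bars) separated by less than that level's gap.
import numpy as np

def energy_events(signal, rate, frame=0.01):
    """Return times (s) of 10 ms frames with above-median RMS energy."""
    n = int(frame * rate)
    windows = signal[: len(signal) // n * n].reshape(-1, n)
    rms = np.sqrt((windows ** 2).mean(axis=1))
    return np.nonzero(rms > np.median(rms))[0] * frame

def barcode_levels(times, gaps=(0.05, 0.2, 1.5)):
    """Merge events into clusters, then clusters of clusters.

    Each level is a list of (start, end) bars; a bar spans events or
    lower-level bars separated by less than that level's gap (seconds).
    """
    levels = []
    starts = ends = np.asarray(times, dtype=float)
    for g in gaps:
        bars, (s, e) = [], (starts[0], ends[0])
        for s2, e2 in zip(starts[1:], ends[1:]):
            if s2 - e < g:
                e = e2                # close enough: extend the bar
            else:
                bars.append((s, e))   # gap too big: start a new bar
                s, e = s2, e2
        bars.append((s, e))
        levels.append(bars)
        starts = np.array([b[0] for b in bars])
        ends = np.array([b[1] for b in bars])
    return levels

# Demo: 0.1 s tone pulses grouped into 1 s "phrases" yield a hierarchy.
rate = 16000
t = np.linspace(0, 10, 10 * rate, endpoint=False)
mask = ((t % 0.25) < 0.1) & ((t % 2.0) < 1.0)
events = energy_events(np.sin(2 * np.pi * 220 * t) * mask, rate)
for i, bars in enumerate(barcode_levels(events)):
    print(f"level {i}: {len(bars)} bars")  # 20 pulses -> 5 phrases -> 1 bar
```

In the published work the analysis operates on real recordings across many timescales; the toy signal and gap sizes above merely mimic the layered, barcode-like structure the researchers describe.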

Kello and his colleagues have been using the barcode method for several years, having first developed it in studies of conversations. The new study is the first to apply the method to music and animal vocalizations.

“The method allows us to ask questions about language and music and animal songs that we couldn’t ask without a way to see and compare patterns in all these recordings,” Kello said.

The researchers compared barcode-style visualizations of recorded sounds.

Kello, fellow UC Merced cognitive science Professor Ramesh Balasubramaniam, graduate student Butovens Médé and collaborator Professor Simone Dalla Bella also discovered that the haunting songs of huge humpback whales are remarkably similar to the beautiful songs of tiny nightingales and hermit thrushes in terms of their temporal hierarchies.

“Humpbacks, nightingales and hermit thrushes are solitary singers,” Kello said. “The barcodes show that their songs have similar layers of structure, but we don’t know what it means — yet.”

The idea for this project came from Kello’s sabbatical at the University of Montpellier in France, where he worked and discussed ideas with Dalla Bella. Balasubramaniam, who studies how music is perceived, is in the School of Social Sciences, Humanities and Arts with Kello, who studies speech and language processing. The project was a natural collaboration and is part of a growing research focus at UC Merced that was enabled by the National Science Foundation-funded CHASE summer school on Music and Language in 2014, and a Google Faculty Award to Kello.

Balasubramaniam is interested in continuing the work to better understand how brains distinguish between music and speech, while Kello said there are many different avenues to pursue.

For instance, the researchers found nearly identical temporal hierarchies for six different languages, which may suggest something universal about human speech. However, because this result was based on recordings of TED Talks — which have a common style and progression — Kello said it will be important to keep looking at other forms of speech and language.

One of his graduate students, Sara Schneider, is using the method to study the convergence of Spanish and English barcodes in bilingual conversations. Another graduate student, Adolfo Ramirez-Aristizabal, is working with Kello and Balasubramaniam to study whether the barcode method may shed light on how brains process speech and other complex sounds.

“Listening to music and speech, we can hear some of what we see in the barcodes, and the information may be useful for automatic classification of audio recordings. But that doesn’t mean that our brains process music and speech using these barcodes,” Kello said. “It’s intriguing, but we need to keep asking questions and go where the data lead us.”

Lorena Anderson

Senior Writer and Public Information Representative

Office: (209) 228-4406

Mobile: (209) 201-6255

landerson4@ucmerced.edu