The “laurel” versus “yanny” debate is everywhere this week. Even CNBC has joined the discussion by renaming market sentiment. The bulls are now team laurel and the bears joined team yanny.
Spoiler alert: the sound file is from an online pronunciation of laurel. Yup… team laurel wins. But was the sound file tweaked to cause the controversy?
“I asked my friends in my class and we all heard mixed things,” says Katie Hetzel, a freshman at Flowery Branch High School in Georgia. She then posted the audio clip to her Instagram story.
Soon, a senior at the same school, Fernando Castro, re-published the clip to his Instagram story as a poll. “She recorded it and put it on her story then I remade the video and posted it,” says Castro. “Katie and I have been going back and forth and we both agree that we had equal credit on it.”
The original from Vocabulary.com vs. the viral Twitter post
Here’s what the original laurel sounds like on vocabulary.com:
From there, the sound bite traveled from the teens’ Instagram stories to Reddit, and then onto the now-viral Twitter poll. Here’s the sound bite from that post that has made it onto almost every major news site, from CNN to The New York Times:
If you listen to the sound that is causing all the controversy, it sounds quite a bit different from the original source. At least it does to my ears…
So, was it tweaked?
Signal processing: are the two recordings the same?
An individual’s hearing may be subjective, but data is data. Since our ears aren’t helping us, let’s see if data analysis can. We turned to MATLAB and signal processing to see if the sound waves were truly different.
First, here’s what the two sound files look like. The blue sound wave is the laurel pronunciation from vocabulary.com. The orange sound wave is the widely debated Twitter version.
When these same signals are examined using time-frequency plots, they still strongly resemble each other. A time-frequency plot, shown in the second row in the image below, is a visualization of how the frequency content in signals evolves as a function of time. I’ve highlighted an area where you can see a shared characteristic.
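To make the idea of a time-frequency plot concrete, here is a minimal sketch of a short-time Fourier transform in plain Python. It is not the MATLAB analysis behind the plots in this post. The test signal (a 500 Hz tone followed by a 2000 Hz tone), the 8000 Hz sample rate, and the helper names `stft_magnitudes` and `peak_frequency` are all assumptions made for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, assumed for this illustration
FRAME_LEN = 64       # samples per analysis frame (125 Hz per frequency bin)
HOP = 32             # samples between frame starts (50% overlap)

def stft_magnitudes(samples, frame_len=FRAME_LEN, hop=HOP):
    """Return one list of DFT magnitudes (bins 0..frame_len/2) per frame."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame at bin k.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

def peak_frequency(mags, sample_rate=SAMPLE_RATE, frame_len=FRAME_LEN):
    """Frequency in Hz of the strongest bin in one frame."""
    k = max(range(len(mags)), key=lambda i: mags[i])
    return k * sample_rate / frame_len

# Hypothetical test signal: 500 Hz for the first half, 2000 Hz for the second.
signal = [math.sin(2 * math.pi * (500 if n < 512 else 2000) * n / SAMPLE_RATE)
          for n in range(1024)]

frames = stft_magnitudes(signal)
first_peak = peak_frequency(frames[0])    # dominant frequency at the start
last_peak = peak_frequency(frames[-1])    # dominant frequency at the end
print(first_peak, last_peak)
```

Each entry of `frames` is one time slice, and each value inside it is the strength of one frequency bin. Mapping those strengths to colors gives the kind of time-frequency plot described above.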
When you look at the higher frequencies in the sound clips, you can see quite a difference. The signal strength is noticeably higher (indicated with more yellow) in the Twitter version.
So the sound file on Twitter is different from the sound file from vocabulary.com, especially at the higher frequencies. Was it manipulated? Not likely. There are still far too many commonalities. It seems to be a poor-quality version of the original with extra noise and sound reflection. The sound reflection appears to have boosted the signal strength at the upper frequencies.
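One simple way to put a number on "stronger at the higher frequencies" is to measure what fraction of a signal's energy sits above a cutoff frequency. The sketch below does that in plain Python on two synthetic signals. It does not use the actual recordings, and it models the degraded copy only as the original plus a single added high-frequency tone, which is a simplification. The 1000 Hz cutoff, the tone frequencies, and the function names are assumptions chosen for illustration.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz, assumed
N = 256              # samples per signal (31.25 Hz per frequency bin)

def power_spectrum(samples):
    """One-sided power spectrum computed with a naive DFT."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def high_band_fraction(samples, sample_rate, cutoff_hz):
    """Fraction of total spectral power at or above cutoff_hz."""
    power = power_spectrum(samples)
    n = len(samples)
    total = sum(power)
    high = sum(p for k, p in enumerate(power)
               if k * sample_rate / n >= cutoff_hz)
    return high / total

# Hypothetical stand-ins for the two clips:
# "original" holds only a low tone, "degraded" adds a weaker high tone.
original = [math.sin(2 * math.pi * 250 * t / SAMPLE_RATE) for t in range(N)]
degraded = [original[t] + 0.5 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE)
            for t in range(N)]

frac_original = high_band_fraction(original, SAMPLE_RATE, 1000)
frac_degraded = high_band_fraction(degraded, SAMPLE_RATE, 1000)
print(frac_original, frac_degraded)
```

With real recordings, the same comparison would be run on the decoded audio samples, and a clearly larger fraction for one clip would match the extra yellow seen at the top of its time-frequency plot.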
The interesting thing is that people who hear predominantly lower frequencies won’t be able to tell the difference, since the lower frequencies are largely the same. But for people whose hearing focuses on higher frequencies, the two sound quite different. I must be one of those people, since the original sounds like “laurel” to me, but the Twitter one sounds like “yanny”.
So what does “laurel” versus “yanny” really look like?
Since both of the above sound files were really “laurel”, we decided to take a look at what “laurel” and “yanny” would look like when spoken by the same person into the same recording device. Here are the recordings we used:
When you listen to these sound files, it’s easy to hear each word very clearly. No debate here! The sound files even look very different in the time domain. So how do so many people confuse the two?
When we closely examine how the spectral content evolves as a function of time using a wavelet-based scalogram, we notice something interesting. One pattern is common to both sound clips in the time-frequency plots. It sits in the lower frequency range, shown in the red boxes in the image below. This shared pattern helps explain why the two words sound so similar at lower frequencies, and why so many people heard “yanny” in the Twitter poll.
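For readers curious about what a scalogram computes, here is a minimal sketch in plain Python. It correlates a signal with complex Morlet wavelets at a few chosen frequencies and keeps the magnitude. It is not the MATLAB wavelet analysis used for the plots in this post, and the two "clips" are synthetic tones that share a 50 Hz component and differ only at higher frequencies. The sample rate, frequencies, the `n_cycles` setting, and the function name `morlet_scalogram` are assumptions for illustration.

```python
import cmath
import math

SAMPLE_RATE = 1000   # Hz, assumed
N = 400              # samples per synthetic clip

def morlet_scalogram(samples, sample_rate, freqs, n_cycles=5.0):
    """Return one row of wavelet magnitudes over time per analysis frequency."""
    rows = []
    for f in freqs:
        sigma = n_cycles / (2 * math.pi * f)      # Gaussian width in seconds
        half = int(3 * sigma * sample_rate)       # truncate near 3 sigma
        # Conjugated Morlet wavelet: complex sinusoid under a Gaussian envelope.
        taps = []
        for m in range(-half, half + 1):
            t = m / sample_rate
            taps.append(cmath.exp(-2j * math.pi * f * t)
                        * math.exp(-t * t / (2 * sigma * sigma)))
        norm = sum(abs(w) for w in taps)          # makes rows comparable
        row = []
        for i in range(len(samples)):
            acc = 0j
            for j, w in enumerate(taps):
                idx = i + j - half
                if 0 <= idx < len(samples):       # treat out-of-range as zero
                    acc += samples[idx] * w
            row.append(abs(acc) / norm)
        rows.append(row)
    return rows

def tone(freq, amp=1.0):
    return [amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE) for n in range(N)]

# Hypothetical clips: same low component, different high component.
clip_a = [x + y for x, y in zip(tone(50), tone(200, 0.8))]
clip_b = [x + y for x, y in zip(tone(50), tone(100, 0.8))]

freqs = [50, 200]
scal_a = morlet_scalogram(clip_a, SAMPLE_RATE, freqs)
scal_b = morlet_scalogram(clip_b, SAMPLE_RATE, freqs)

mid = N // 2   # look at the middle of the clip, away from edge effects
low_a, high_a = scal_a[0][mid], scal_a[1][mid]
low_b, high_b = scal_b[0][mid], scal_b[1][mid]
print(low_a, low_b, high_a, high_b)
```

In this toy case the 50 Hz rows of the two scalograms nearly match while the 200 Hz rows do not, which is the same kind of shared low-frequency pattern described above.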
The blue-and-black versus white-and-gold dress, this time in sound
When the blue-and-black versus white-and-gold dress controversy blanketed the internet, we learned that sometimes you can’t trust what you see. This story does the same for sound. Fortunately, wavelet-based signal processing techniques can help explain the reason behind the confusion.
Sometimes, our ears are more sensitive to one range of frequencies than another. Depending on how your ears are tuned, you may hear different words. The quality of the recording can also complicate matters. In this case, a recording with sound reflection in higher frequencies and a noisy background had a lot of people hearing “yanny”, an acoustically similar word. Congrats, team “laurel”, you were right!