✖️ What's going on with our sonic data?

Voice-assistants, AI, vocal biomarkers & Covid-19

Nov 19, 2020

Recently, two slightly silly-sounding articles appeared in very serious publications. In September, Nature published a news feature on its website titled ‘Alexa, do I have COVID-19?’ It covered research by a company called Vocalis that aims to identify the vocal biomarkers that could tell when someone has a Covid-19 cough. Then, a few weeks ago, the MIT News Office published an article about similar research, this time on forced-cough recordings that can be used to detect even asymptomatic infections.

Let’s dig deeper

It seems that the current Covid-19 pandemic has brought this type of AI voice research to the fore. In both cases mentioned above, the researchers had longstanding projects on using vocal biomarkers and coughs to detect certain diseases. In the case of Vocalis, this concerned an app to help diagnose COPD (chronic obstructive pulmonary disease). The MIT researchers had been working on an AI framework to help diagnose early-onset Alzheimer’s disease in patients. Both sets of researchers turned to the voice because the sounds that we generate when speaking come from our vocal cords. Those sounds are then shaped by a number of structures in our heads, e.g. the nose and its cavities. Our brain and nervous system also come into play to determine how the sounds are perceived as words, music, and language. This complex configuration provides ample space for any distortion to be used as a diagnostic signal. The potential for using our voice, and vocal expressions more broadly, to aid physicians is vast.

Let’s widen the scope

To create effective AI-driven voice diagnostics tools, it’s necessary to have big datasets. In the Vocalis example above, the dataset came from volunteers. Most trials involve the cooperation of a set of people who opt in to cough into their phone or otherwise have themselves recorded.

However, there are currently 4.2 billion digital voice assistants being used in devices around the world. This number is set to grow beyond the global population by 2024. All of these voice assistants are AI-driven. While the medical research mentioned before focuses on the nitty-gritty of matching audio data to other audio data, voice assistants aim to predict what you will be saying. The AI, in other words, is mostly used to speed up the computing process that determines what a voice assistant’s answer to your query should be. In a recent article in The Correspondent, Zeno Siemens-Braga tried to find out what actually happens to your voice after you speak to Siri, Alexa, Xiaowei, Google, or Bixby:

“That voice recording gets ground down into readable data, before ending up somewhere on a server, where the unique features of your voice are connected and compared to as many other speech chunks as possible to make the whole process run even more smoothly.”

This leaves an enormous potential dataset of voices processed through smartphones, smart speakers, or any other voice-assistant-enabled device.

Where does music come in?

Not only are more and more people using voice assistants, but music-related queries are the most common among the command categories. While we may think about smart speakers first and foremost, in the US, for example, it’s actually cars that are driving the use of voice assistants the most. It’s in the home, however, where a smart speaker means not just listening to more music, but also more e-commerce transactions. It seems that everything just becomes easier when we can use our voice. In the latest Smart Audio report by Edison/NPR the importance of music in the adoption of voice assistants again became clear: 85% of users request to play music in a typical week.

From that same report:

“56% of those who use a voice assistant on a smartphone say they keep the voice-operated personal assistant on their smartphone turned on all the time.”

All the audio recorded by those smartphones gets sent to the cloud and stored for analysis. Because it’s nice to have easy access to your favourite music, it’s tempting to leave that voice assistant on alert at all times. But until we can regulate the use of that data, it may be wise to pay attention to what you use your voice assistant for and when you should turn it off. We are, perhaps, better off allowing the medical researchers from Vocalis and MIT to figure out safe ways to use our vocal cord data than we are Amazon, Google, Tencent, and co.

TECH

Bas wrote a great piece for Water & Music about the role that UX on streaming services plays in potentially pushing down per-stream royalty rates. It’s free to read until Friday, but the Water & Music Patreon is well worth signing up for if you haven’t yet.
Twitter is rolling out ‘audio spaces’. Seemingly set up to rival Clubhouse, the focus in the reporting on this has been on how Twitter will trial the functionality with exactly the groups who face the most abuse on social platforms: women and people of marginalized backgrounds. Audio spaces also seem to be a development of the voice tweets made famous by John Legend.
It’s not the most-used browser in the world, but Opera has announced an integration with Spotify, Apple Music, and YouTube Music.
The Guardian has a great piece on BTS and how they use various types of (social) tech to bring their storytelling to their fans: from livestreaming guitar practice to talking to a million fans at once, and from easter eggs to a reality-TV-like miniseries.

CORONA

Madrid is one of the few cities where the nightlife hasn’t been shut down completely. El País has the lowdown, from partying starting around 6pm to the risk of bars and clubs being closed down by the police for violating the rules.
Seated continues their series of blogs about their learnings from the many ticketed livestream events they’ve been a part of. In the second installment, the major takeaway for me is that merchandise sales outpaced ticket sales for some artists.

For instance, long time artist manager Randy Nichols, publicly shared the data from Underoath’s recent live streams, which showed that they experienced similar results. Nichols said that Underoath surpassed $750K in total sales, with more than two-thirds of revenue coming from merch presales and limited edition vinyl.

Fortune asks whether a mouthwash may be the quick-test solution to bring back live events.
Lobbying group UK Music has released its yearly Music by Numbers report. It’s worth reading the whole thing, but the main takeaways are:
1. The sector saw 11% growth from 2018 to 2019
2. 85% of live revenue will be lost in 2020
3. 65% of music creators’ income will be lost in 2020
4. Music tourism was worth GBP4.6 billion in 2019
5. Actions required include: VAT rate reduction on tickets; business rate relief for venues; clear protocols regarding testing and live events
Bandcamp is entering the livestream market. Their platform could be ideally placed for artists to monetize their livestreams as:
1. fans’ payment method is already known with Bandcamp
2. merch is already a part of what’s offered on the platform
3. there’s a chat (in the browser version only) where all merch purchases are highlighted for the performing artist to give a shoutout
4. it’s easy to cast the video to a big screen (losing the chat function)

MUSIC

The other day I heard Song to the Siren by This Mortal Coil again for the first time in a long time. This song always manages to stop me dead in my tracks, whatever it is that I’m doing at the time. It’s haunting, and if you’ve never heard it before, Liz Fraser will cast a spell on you that you will never be able to shake off again.

MUSIC x is composed by Bas Grasmayer and Maarten Walraven.
