A smart speaker is a small, keyboardless speaker with a built-in microphone that is operated solely by the user's voice, an interface very different from more traditional ones such as writing or touching. The device may or may not have a screen, but screenless models are the most popular. The smart speaker market has also been affected by the COVID-19 pandemic: the global market is estimated to grow from $4.66 billion in 2020 to $6.98 billion in 2021, a compound annual growth rate (CAGR) of 49.8%. Available in Brazilian Portuguese since 2018, these devices saw increased usage in Brazil between 2020 and 2021 compared with previous years.
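As a quick arithmetic check of the growth figure quoted above, the following minimal sketch recomputes the rate from the two market estimates. The function name `cagr` and its parameters are illustrative assumptions; only the two dollar values and the one-year interval come from the text.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Market estimates from the text, in billions of US dollars.
market_2020 = 4.66
market_2021 = 6.98

# Over a single year, CAGR reduces to the simple year-over-year growth rate.
growth = cagr(market_2020, market_2021, years=1)
print(f"{growth:.1%}")  # prints 49.8%
```

Because the interval is a single year, the compound rate and the plain year-over-year increase coincide, which is consistent with the 49.8% reported.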
We used a mixed-methods design, a strategy of inquiry that collects both numeric and textual information, either simultaneously or sequentially. We applied two methodological techniques: an online survey and an online thematic analysis. Both were needed to understand how consumers use voice assistants, to identify which journalism companies are developing content for these devices, and to analyse data from technology companies' reports.
To understand the challenges and advantages on both the consumer and the content producer side in this new world of voice interaction devices, we surveyed 112 Portuguese-speaking smart speaker users and interviewed three Portuguese-speaking content producers, two of them from the largest Portuguese-language media companies.
We cross-referenced the survey and interview data and were able to discern a pattern of news consumption through smart speakers. Our results show that this platform has gained prominence in news consumption: most of the users in our research use their smart speakers daily to inform themselves. They seek quick news summaries, with access peaking in the early morning.
Among the challenges for content producers detected by our research is the incomplete understanding of the Portuguese language by the platforms and operating systems of these devices. In our survey sample, 31% of respondents indicated that the device had failed to understand words and/or phrases. One of the interviewed brands also reported a recognition problem with smart speakers: the brand "UOL" has been confused with the English word "wall". Consumers also pointed to words ending in “R” as examples of misinterpretation or lack of understanding by the devices. These issues show that the platforms' natural language processing (NLP) systems need to be trained for the different orthoepy and prosody of Brazilian Portuguese: in the country, the phoneme /l/ at the end of words is pronounced as the semivowel /w/, and in much of the country the phoneme /r/ at the end of words is soft rather than vibrant, as the standard indicates.
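The word-final /l/ to /w/ pattern described above can be illustrated with a toy sketch. This is an assumption-laden illustration, not a model of how any platform's speech recognition works: the function `vocalize_final_l` is hypothetical, and it operates on lowercase spelling as a crude stand-in for a phonemic transcription.

```python
import re


def vocalize_final_l(text: str) -> str:
    """Toy rule: rewrite a word-final 'l' as 'w'.

    Hypothetical helper for illustration only. It works on spelling,
    not on real phonemes, and covers only the word-final case
    mentioned in the text.
    """
    # 'l' immediately followed by a word boundary, i.e. the last letter of a word.
    return re.sub(r"l\b", "w", text.lower())


print(vocalize_final_l("UOL"))               # uow
print(vocalize_final_l("jornal do Brasil"))  # jornaw do brasiw
print(vocalize_final_l("lua"))               # lua (initial 'l' is unchanged)
```

Under this simplified rule the brand name ends in a /w/-like sound, which gives a sense of why a recogniser trained mostly on English might map it to a similar-sounding English word.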
Content producers also need to break through platform bubbles to reach the user. Google works more with the consumer journey, i.e. its algorithm defines the content to be delivered, whereas Amazon prioritises commercial partners when defining this delivery. Overall, it is a market in which internet pioneers and brands with larger audiences on other platforms have an advantage, both because they are more likely to close commercial deals and because consumers are more likely to remember them and to invoke them by brand name.
In a growing market for content produced for smart speakers, the challenges for both producers and consumers are therefore considerable. They include commercial issues, differing platform strategies and, above all, the need to resolve the glitches and difficulties of the voice interface, especially for non-English speakers.