An Essay: My Notes On Voice Search

On this blog, I’ve already covered voice search and why I think it is critical to Google’s future strategy.

Let me start with a couple of moves Google has made that show, I argue, how much the company is betting on voice search. For instance, in 2018, Google invested in the San Diego-based startup KaiOS.

This is a company that created an operating system for feature phones (the dumb sibling of the smartphone). What’s so incredible about KaiOS is that it brings some smart features, like social and productivity apps, to a feature phone. KaiOS also allows feature phone owners to install Google Assistant.

That is incredible! Finally, hundreds of millions of cheap phones around the world, especially in India and Africa, can be connected to the internet and have access to apps (like Facebook, WhatsApp, and Google) that have influenced the spread of global cultures in the last decade.

If Google manages to make its voice assistant widely used on those devices, it will be able to collect massive amounts of voice data and get access to emerging markets very quickly.

Also, in January 2019, Google bought the answer engine called Superpod. It paid $60 million for it. While we don’t know for sure how Google will use it, we can imagine it will serve as a repository of answers that helps Google enhance its voice search side.

If that is not enough for you, look at the graphic below:


Source: SEMRush Sensor

This is a visualization from the SEMRush Sensor. In short, this is a tool that measures, based on a sample of searches, what percentage of them return results with Google’s advanced features.

These features include featured snippets, knowledge panels, top stories, and instant answers. This is of course only a reference point; it is not meant to say that these are the exact numbers for Google’s advanced features.
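To make the idea of "percentage of a sample" concrete, here is a minimal sketch in Python. It is not how the SEMRush Sensor works internally (I don't know that). The sample data, the feature labels, and the `feature_share` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical sample: each entry lists the advanced features
# seen on one search results page (an empty list means none).
sample_serps = [
    ["featured_snippet", "knowledge_panel"],
    [],
    ["top_stories"],
    ["featured_snippet"],
]


def feature_share(serps):
    """Return the percentage of sampled result pages showing each feature."""
    # set(page) ensures a feature is counted at most once per page.
    counts = Counter(feature for page in serps for feature in set(page))
    return {feature: 100 * n / len(serps) for feature, n in counts.items()}


shares = feature_share(sample_serps)
```

With this toy sample, featured snippets show up on two of the four pages, so their share is 50%. A real tool would do the same kind of counting over a much larger sample.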

However, they are enough for me to support the argument that those features are essential, not only because, as a business owner, you can get more visibility for your company through them, but also because they show how Google is evolving toward voice search.

Indeed, a featured snippet is an answer Google gives to specific questions, even complex ones. Below is an example:


Google shows a featured snippet when it intercepts a user intent that might be well served by one.

Other features, like knowledge panels, top stories, and instant answers, all aim to push information out onto the search results pages, so that users can find it without clicking through to the website links provided by Google and without leaving the search page.

From the image above you can appreciate how these features are cannibalizing the visibility of organic content. But it also shows how we might be going toward voice search.

Indeed, as shown on this blog, advanced features like featured snippets often also become answers in the voice assistant.

This is an example of a featured snippet from this blog, coming from the query “what is a hidden revenue business model,” that has also become an answer within Google Assistant:

Consider that in 2018, Google Assistant was already on over five hundred million devices. That justifies all the excitement around voice, and why it is so important for the future development of the web.

Beyond all the buzz, excitement, and utopian views that go with new technologies, let’s also look at the downside of voice.

Is voice search in tune with human nature?

The classic study by Albert Mehrabian pointed out that 55% of communication happens via body language, 38% via tone of voice, and 7% via words.

Whether or not those numbers are accurate, completely off track, or apocryphal, let’s think about this issue from a different perspective.

If I think about this issue rationally, I’m driven to believe that most of my communication is done via words, spoken and written. However, this is my logical side, which wants me to think it’s in charge.

On the other hand, we all know that there are dozens if not hundreds of cues that help us get through each day and guide us.

From a simple expression on a person’s face, we can infer the mood of that person. A simple gesture can hold a whole world. Thus, we can, I guess, agree that words, be they spoken or written, are only a small part of our communication.

The remaining part is body language. For that reason, voice search doesn’t seem to be the perfect candidate to replace most of our interactions with tech devices. Rather, body gestures will play a critical role.

When I’m on the street with a set of AirPods in my ears, I’ll never shout “next song.” I’d rather tap a couple of times on my left ear to jump to the next song. It does make me look like a fool, but it’s way more comfortable.

Also, as those devices are used by more and more people, gestures that might seem absurd now might become part of our daily routines. If I tap twice on my left ear, the other person, who also belongs to the “AirPods circle,” will understand I’m part of the same circle!

For instance, in a recent study, people surveyed mentioned these as primary reasons for not liking the voice experience:

  • Not comfortable shopping by voice
  • No screen (for smart speakers like Google Home)
  • Can type faster to get what is wanted
  • Do not like to say the wake words (like “Hey Google”)

It is important to notice that those devices are still at a primordial stage, and so far it’s very hard to control the experience of people using them. Also, those devices often struggle to understand us. So talking to them is not as easy as speaking to another human (yet).

In addition, humans have shown they will use technologies even when they seem counterintuitive and against our nature. I read entire books on my four-inch smartphone! For years, people have been searching for stuff on the web based on keywords, rather than using more natural language.

That was just because Google’s semantic power wasn’t able to keep up with users’ human-like searches. We didn’t change technology; we kept at it, as we didn’t know any better.

Thus, judging a technology only by how natural it feels won’t work. People use tech not just because it is useful. They also use it because they want to belong. In fact, tech entrepreneurs are well aware of the importance of network effects when building a company.

With network effects, platforms get better with each new user who joins. In addition, when a critical mass is reached, people become “socially locked-in.” When everyone around you has an app, you’ll need to have it too.

I still struggle these days to explain to my family why I don’t use WhatsApp, or why I gave up social media for years (before activating social accounts again, back in 2015, to spread the content of FourWeekMBA more easily).

Therefore, the ability of those companies to put as many devices as possible in the hands of people, with a proper distribution strategy that taps into the right channels, will be critical. Let’s do a recap.

Key takeaway

I’m a believer in voice search. And I do believe Google is investing, and will keep investing, massively in it. I’m also aware of its limitations. Voice might represent a great transition toward another, more advanced way of discovering information and interacting with the objects that surround us.

Yet I’m not sure it will be the primary way we interact with things. This, of course, might turn out to be wrong!

What’s your take on that?
