Understanding the nuances of language to build a powerful voice bot


Language and the way we communicate

Language plays an interesting role in our day-to-day lives. It is so layered and discreet that we often hardly notice it, or simply ignore it. Much of the way we interact and speak comes down to language and its pragmatics.

We speak in a certain way because we carry the baggage of our native language while speaking. For instance, since English is not my native language, I have observed that whenever I speak, write, or think in English, my native language has a considerable influence on the way I use English. It could be the way we use an article, our choice of a certain word or a phrase, or the stress we put on a particular phoneme – these elements tend to go unnoticed, but they are present in the way we communicate. 

This influence of native language on a non-native language is something that should be factored in while designing a voice bot. Let me illustrate this through our experience designing a voice bot.

Doll vs daal: Misunderstanding a user

In my role as a Voice Conversation Designer at Haptik, I recently worked on a virtual shopping assistant for a grocery brand. Here, the team dealt with customers from various parts of India who interacted with the bot by speaking their queries.

While going through the user conversations on the grocery bot, we found several customers looking for doll. It was a surprise that customers were looking for a doll on a grocery bot, so I checked the product catalog and the inventory and found that the client had recently updated the stock with toys. 

As we scrolled through more conversations, we found more customers asking for doll and being shown the correct set of products. Then we systematically went through all the conversations containing the word doll, and found that the very first conversation with the utterance doll could be traced back to a period when dolls were not in the product catalog! Another observation was that even after the inventory was updated and customers were shown the correct products for doll, they continued to ask for doll! It seemed that either customers were not satisfied with the results the bot offered, or the bot was not responding as they expected.

The next step was to debug these conversations, so I took a dump of all these ‘doll’ conversations and got to work. We checked for bot breaks during specific time periods, API failures, and server errors, and even checked whether the problem persisted for customers with certain user IDs. But there was no clear answer.

Native language and its influences

I continued investigating the conversations, and then it struck me! All the users who had asked for doll were from West Bengal and Odisha. I had checked the users’ pin codes to find their locations and understand their linguistic context.
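The pin-code check described above can be sketched in a few lines of Python. This is only an illustration, not the tooling the team actually used: the record fields (`utterance`, `pin_code`), the sample records, and the helper names are assumptions, and the prefix-to-state mapping is a rough approximation based on the first two digits of an Indian PIN code.

```python
from collections import Counter

# Assumption: approximate mapping from the first two digits of a PIN code
# to a state. A few prefixes inside these ranges belong to other regions,
# so treat this as a first-pass heuristic only.
PIN_PREFIX_TO_STATE = {
    **{str(p): "West Bengal" for p in range(70, 75)},
    **{str(p): "Odisha" for p in range(75, 78)},
}


def state_for_pin(pin_code):
    """Return the state guessed from a PIN code, or 'Other' if unmapped."""
    return PIN_PREFIX_TO_STATE.get(pin_code.strip()[:2], "Other")


def count_query_by_state(conversations, query):
    """Count, per state, the conversations whose utterance contains `query`."""
    counts = Counter()
    for conv in conversations:
        words = conv["utterance"].lower().split()
        if query in words:
            counts[state_for_pin(conv["pin_code"])] += 1
    return counts


# Hypothetical sample records, standing in for the conversation dump.
conversations = [
    {"utterance": "doll", "pin_code": "700001"},
    {"utterance": "I want doll", "pin_code": "751001"},
    {"utterance": "rice", "pin_code": "400001"},
    {"utterance": "Doll", "pin_code": "713101"},
]

doll_counts = count_query_by_state(conversations, "doll")
print(doll_counts)
```

Grouping by location is the step that surfaces a regional pattern that time periods, error logs, and user IDs do not reveal.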

These customers from West Bengal and Odisha carried the baggage of their native language while speaking Hindi. Bengali and Odia tend to round the vowel, so daal was naturally pronounced like doll. The bot’s speech recognition might have picked up this difference, but the utterance was transcribed in the Roman writing system, which couldn’t capture the difference between a retroflex ‘d’ (similar to the one in ‘drawing’ in Indian English) and a dental ‘d’ (the one in ‘they’). The closest item known to the system was doll, because it was available in the catalog, and so the bot understood the word as doll rather than daal.
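This article does not spell out the fix that was eventually applied, so the following is only one possible way to act on the observation: a minimal sketch in which the bot also considers a regional variant when the transcript and the user's state match a known pattern. The table `REGIONAL_VARIANTS` and the function `candidate_terms` are hypothetical names introduced for illustration.

```python
# Hypothetical table: a transcript that may stand for another catalog term
# when the speaker comes from one of the listed states.
REGIONAL_VARIANTS = {
    "doll": {"candidate": "daal", "states": {"West Bengal", "Odisha"}},
}


def candidate_terms(transcript, state):
    """Return the catalog terms to consider for a transcribed utterance.

    The literal transcript always comes first; a regional variant is
    appended only when the user's state matches the known pattern.
    """
    term = transcript.strip().lower()
    candidates = [term]
    variant = REGIONAL_VARIANTS.get(term)
    if variant is not None and state in variant["states"]:
        candidates.append(variant["candidate"])
    return candidates


print(candidate_terms("doll", "Odisha"))
print(candidate_terms("doll", "Maharashtra"))
```

With more than one candidate, the bot could show results for both terms or ask the customer which one they meant, rather than silently committing to the literal transcript.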

This problem, and the experience of finding a solution, show the importance of linguistics and of the influence a user’s native language has on any other language they speak.

Key Takeaways

So here’s what I learned from my experience solving this interesting problem:

  1. Data is nothing but language: It is essential to have the linguistic context to analyse a language – the data from actual speakers! A background in linguistics, the ability to have a linguistic perspective, and a linguistic approach are important in analysing user conversations on chatbots and voice bots.

  2. Have an analytical approach: Investigating user conversations requires analysing them to the core. Understand the problem in its entirety to come up with a solution that is robust, efficient, and easy to implement. Working out what is happening, why it might be happening, and which data and language may be responsible for the problem is crucial in an investigation or exhaustive analysis.

  3. Have technical context: As a Voice Conversation Designer, it may not be absolutely necessary to understand the program or code, but understanding the nuances of how a bot works is crucial. It’s important to understand how a bot understands human conversations, and how it responds to them in human language.

  4. Understand cognition: Understanding the principles of cognition can help us replicate some of that process on a broader level and translate it into a program. Without understanding the cognitive process, it is very difficult to translate that process into a program that creates an intelligent system. In the case of doll vs daal, I saw that the customers were repeatedly asking for the same item, even after viewing the correct item. This indicated the possibility of incorrect understanding. The knowledge that Bengali and Odia have a tendency to round vowels helped me reach the conclusion that customers were probably asking for daal instead of doll. The cognitive process for humans involves linking these two facts and resolving the ambiguity in the situation. Identifying the exact conditions and rules, converting these broader points into smaller steps, and then translating them into a program is only possible if we observe the cognitive process in our own heads. We need to train ourselves before training the machine, after all!
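As an example of turning one of these observations into an explicit rule, the signal "the customer keeps asking for the same item even after being shown results" can be sketched as below. The turn format (`utterance`, `results_shown`), the threshold `min_repeats`, and the function name are assumptions made for illustration, not a description of the production bot.

```python
def repeated_queries(turns, min_repeats=2):
    """Return queries the user repeated even though results were shown.

    A query counts only on turns where the bot displayed results, so a
    repeat suggests the results did not match what the user meant.
    """
    counts = {}
    for turn in turns:
        if turn["results_shown"]:
            query = turn["utterance"].strip().lower()
            counts[query] = counts.get(query, 0) + 1
    return [query for query, n in counts.items() if n >= min_repeats]


# Hypothetical session: the user asks for "doll" twice despite seeing results.
session = [
    {"utterance": "doll", "results_shown": True},
    {"utterance": "Doll", "results_shown": True},
    {"utterance": "rice", "results_shown": True},
]

flagged = repeated_queries(session)
print(flagged)
```

Queries flagged this way are candidates for a closer linguistic look, which is where knowledge of the speakers' native languages comes in.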