Limitations in Measuring the Performance of IVAs


As every successful enterprise is aware, building and offering a quality product is just the beginning. “Customer is king” is not just an age-old mantra – it’s perhaps never been as pertinent as it is today. Customers increasingly expect more from brands than just something that works. They demand personalized experiences that deliver on comfort, convenience and efficiency.

Needless to say, businesses need to match these expectations to stay competitive and relevant in the current landscape. This has led to an increased emphasis on customer experience, which is now widely considered the key driver of a brand’s perception in the market.

Over the last decade, Intelligent Virtual Assistants (IVA) have emerged to address age-old pain points faced by customers during their interactions with businesses. From simple chatbots and voice bots to specialist domain-specific agents, IVAs have gradually evolved from being just a feature to becoming products in their own right.

And as with any other product, it is crucial to measure their effectiveness when it comes to serving the needs of customers.

A number of existing metrics have been employed to track Intelligent Virtual Assistant performance. These include time to resolution (TTR), customer effort score (CES, which captures the effort customers expend in finding product information), customer satisfaction (CSAT) and net promoter score (NPS).

While these are all proven, and indeed essential, metrics, they are solely focused on measuring customer experience (CX), which is influenced by a number of factors. As a result, they neglect what should be the core performance indicator of an IVA – its intelligence.

So how do you measure the intelligence of an Intelligent Virtual Assistant? What are the key metrics and frameworks to consider to ensure that the IVA is responding correctly, the NLU (Natural Language Understanding) is actually functioning, and the overall customer satisfaction with the assistant is great?

Before we attempt to answer those questions, let us delve a little deeper into some of the existing frameworks that have been used to measure customer experience, and why they fall short when it comes to gauging IVA performance.

The Challenges of Measuring IVA Performance with Existing Frameworks

Specific business units may come up with their own metrics to gauge Intelligent Virtual Assistant performance at scale – for example, Time to Resolution, Time Spent on Website, Store Ratings, Customer Effort Score (for overall CX) and Lead Conversion, among others.

However, ultimately, the metrics that are most widely used by brands are CSAT and NPS. In addition, given the increased focus on measuring customer emotions, Sentiment Analysis is also emerging as a powerful measurement tool.

Let us examine these metrics and the challenges that they present when it comes to measuring IVA performance.


CSAT

CSAT provides what is perhaps the most accurate reflection of how a customer feels about a brand and its products/services. However, it depends on customers filling out the feedback form – something which rarely happens in the case of users interacting with IVAs. This becomes a significant limitation when it comes to using the CSAT framework to measure IVA performance, as it simply does not capture the experience of all customers using the solution.
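To make the coverage problem concrete, here is a minimal sketch in Python. It assumes a common convention (not one stated in this article) in which CSAT is the percentage of respondents who rate 4 or 5 on a 5-point scale. The function name, the threshold and the sample data are all illustrative.

```python
from typing import Optional, Sequence


def csat_summary(ratings: Sequence[Optional[int]], satisfied_threshold: int = 4) -> dict:
    """Summarize CSAT for a batch of conversations.

    `ratings` holds one entry per conversation; None means the user
    never submitted the feedback form. The 4-or-above threshold on a
    1-5 scale is an assumed convention, not a fixed standard.
    """
    answered = [r for r in ratings if r is not None]
    coverage = len(answered) / len(ratings) if ratings else 0.0
    if answered:
        satisfied = sum(1 for r in answered if r >= satisfied_threshold)
        csat_percent = 100 * satisfied / len(answered)
    else:
        csat_percent = None  # no feedback at all, so CSAT is undefined
    return {"csat_percent": csat_percent, "coverage": coverage}


# Illustrative data: ten conversations, only three with feedback.
ratings = [5, None, None, 4, None, None, None, 2, None, None]
summary = csat_summary(ratings)
print(summary)
```

In this made-up sample the score is computed from just three of ten conversations, so whatever CSAT reports says nothing about the other seven.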


NPS

NPS is the most prevalent customer experience metric across geographies and industries. Providing feedback under the NPS framework requires minimal effort on the user’s part, and thus has a better response rate as compared to CSAT. With a broader base of customer data, it is able to capture a wider spectrum of users – ranging from happy to dissatisfied.

However, much like CSAT, NPS is dependent on customers completing the feedback form. Moreover, it is not qualitative, and thus often misses detailed inputs on performance that are crucial when it comes to assessing IVAs.
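For readers who have not computed it before, NPS is conventionally derived from a single 0-10 "how likely are you to recommend us" question: respondents scoring 9-10 count as promoters, 0-6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. A minimal Python sketch, with an illustrative function name and sample data:

```python
from typing import Sequence


def net_promoter_score(scores: Sequence[int]) -> float:
    """Return NPS on the usual -100..100 scale.

    Promoters score 9-10, detractors 0-6; passives (7-8) only
    contribute to the total number of responses.
    """
    if not scores:
        raise ValueError("NPS is undefined without any responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Illustrative responses: 4 promoters, 2 passives, 2 detractors.
print(net_promoter_score([10, 9, 8, 7, 6, 3, 10, 9]))  # 25.0
```

The result is a single number. It carries no information about which part of the conversation pleased or frustrated the user, which is the qualitative gap described above.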


Automation

Automation is a metric that is more specific to chatbots and IVAs. It indicates an IVA’s ability to understand user queries and provide relevant answers. The basic rule of thumb is that the greater the level of automation, the better the performance of the IVA – ultimately, enhancing customer experience.

A major advantage of Automation as a metric is that it takes into account each and every conversation of the user. However, it also has a significant limitation: it assumes the IVA designer has accounted for all possible user queries and the fullest scope of the solution, which is often not the case. Conversations in which the IVA responds, but with an irrelevant or incorrect answer, are counted as automated. These false positives can hurt customer experience, and examining the IVA’s automation alone will not help identify them.
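The gap between "the IVA answered" and "the IVA answered correctly" can be sketched in a few lines of Python. Everything here is hypothetical: the `Conversation` record, its field names and the idea that correctness labels come from a manual audit are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Conversation:
    bot_answered: bool    # the IVA replied without falling back or handing off
    answer_correct: bool  # hypothetical label, e.g. from a manual audit


def automation_rate(conversations: Sequence[Conversation]) -> float:
    """Share of all conversations in which the IVA produced an answer."""
    if not conversations:
        return 0.0
    return sum(c.bot_answered for c in conversations) / len(conversations)


def false_positive_rate(conversations: Sequence[Conversation]) -> float:
    """Share of IVA-answered conversations whose answer was wrong."""
    answered = [c for c in conversations if c.bot_answered]
    if not answered:
        return 0.0
    return sum(not c.answer_correct for c in answered) / len(answered)


# Illustrative batch: 8 of 10 conversations answered, 2 of those wrongly.
batch = (
    [Conversation(True, True)] * 6
    + [Conversation(True, False)] * 2
    + [Conversation(False, False)] * 2
)
print(automation_rate(batch), false_positive_rate(batch))
```

On this sample the automation rate looks healthy at 80%, yet a quarter of the automated answers were wrong, and nothing in the automation figure reveals that.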

Sentiment Analysis

Sentiment Analysis is a relatively new customer experience metric. The idea behind it is to link certain key phrases with specific emotions. Owing to advancements in Machine Learning, this is largely an automated process. However, it is still limited, as language usage and comprehension vary with geography, audience and other factors. As a result, sentiment is sometimes captured incorrectly. For instance, Sentiment Analysis may miss nuances of language such as sarcasm and jokes. This leads to an inaccurate reflection of an IVA’s performance.
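The sarcasm problem is easy to reproduce with a deliberately naive, lexicon-based scorer. The sketch below is written in Python and is not how any production sentiment model works; the word list and weights are invented for illustration only.

```python
import re

# Hypothetical mini-lexicon linking key words to a sentiment weight.
LEXICON = {
    "great": 1,
    "thanks": 1,
    "love": 1,
    "terrible": -1,
    "useless": -1,
    "frustrated": -1,
}


def naive_sentiment(message: str) -> str:
    """Label a message by summing the weights of the lexicon words it contains."""
    words = re.findall(r"[a-z']+", message.lower())
    score = sum(LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(naive_sentiment("This bot is useless"))
print(naive_sentiment("Oh great, another hour on hold. Thanks a lot."))
```

The second message is plainly sarcastic, but because it contains "great" and "thanks" the scorer labels it positive. Machine-learned models do better than this toy, though the same kind of misreading is what the paragraph above warns about.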

To sum up – all of the aforementioned metrics can be effectively leveraged to some extent to assess the performance of an Intelligent Virtual Assistant. However, as we’ve discussed, each of them has certain limitations.

Which brings us back to the question we started with – How do you measure the intelligence of an Intelligent Virtual Assistant?


A New Framework



When it comes to measuring the performance of an IVA, there is a need to detect and analyze customer behavior at a more granular level, often instantaneously during a conversation. This isn’t something that is feasible with pre-existing customer experience measurement methods – creating the need for a new framework unique to IVAs.

IVAs themselves offer the perfect means of measuring the customer experience of their users – detecting and analyzing the behavior of a customer instantaneously over the course of a conversation.

The performance of an IVA largely depends on its underlying layer of Machine Learning – Intent Detection, NLU (Natural Language Understanding), and NLP (Natural Language Processing), working in sync to accurately help end-users with their queries. An IVA needs to understand the message of the user (NLU), recognize the intent of the user (Intent Detection), and process the responses appropriately (NLP). Only an IVA that does all three will be able to deliver the right customer experience and ultimately generate ROI for the brand that has deployed it.

Keeping all this in mind, Haptik has developed a new industry-first framework for measuring Intelligent Virtual Assistant performance – a simple and efficient way to comprehensively assess our solutions at scale, and to help our partners evolve effective strategies for enhancing customer experience. We call it the Intelligence Satisfaction Score or I-SAT.

Haptik’s I-SAT framework aims to measure the effectiveness of an IVA, based on its ability to understand and help the end-user.

Want to learn more about I-SAT and how it can help brands truly get the best out of their Intelligent Virtual Assistants? Then watch Is your virtual assistant really intelligent? – an on-demand webinar presented by Haptik and Opus Research. Our Co-Founder and CEO Aakrit Vaish is joined by Dan Miller (Lead Analyst & Founder, Opus Research) and Derek Top (Research Director, Opus Research), as they discuss the new framework, how it works, and how it’s poised to shake up the industry.


