Open Sourcing Chatbot NER

Chatbot?

The evolution of automated messaging, which started in 1966 with the first Chatbot, ELIZA, has now reached a stage where Chatbots are applied in several industry domains such as personal assistance, banking, e-commerce and healthcare. With early experiments showing positive results for many enterprises, the need for customized, domain-focused Chatbots for specific applications is growing rapidly.

 

Building a bot that fixes real industry pain points is a combination of design, engineering and research problems. While the platforms and frameworks available for addressing engineering and design problems have a unidirectional focus, the resources addressing research and machine learning for Chatbots are highly generic and scattered. During our journey at Haptik, we ended up building and customizing different machine learning modules for Chatbots that work in narrow domains and are targeted at the end-to-end completion of a specific task, such as making travel bookings, gift recommendation and ordering, or lead generation for different businesses.

 


 

In this blog, we will talk about one of our integral modules, Chatbot NER, i.e. Named Entity Recognition (NER) for Chatbots, which we have open-sourced specifically to add intelligence to Chatbots targeted at domains like personal assistance, e-commerce, insurance, healthcare, fitness, etc. There are many approaches (generative, retrieval-based, heuristic-based, etc.) to building conversational bots or dialogue systems, and each of these techniques makes use of NER somewhere in its pipeline, as it is one of the most important modules in building conversational bots. Apart from the functionality available in conventional NER systems, Chatbot NER contains several add-ons that specifically aid in building Chatbots.

 

 

So What Is Chatbot NER?

 

NER is a subtask of information extraction that seeks to locate named entities in text and classify them into predefined categories such as person names, locations, organizations, contact details, time expressions, quantities, monetary values and percentages.

 

For example,

```json
"Remind me to call Mainland China day after tomorrow at 6:00pm"
```

In this example:

- *Mainland China* is a named entity that belongs to the category *restaurant*
- *day after tomorrow* is a *date*
- *6:00pm* is a *time*
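
To make the example concrete, here is a minimal Python sketch of how such a message could be tagged. It is only an illustration under simple assumptions: the tiny dictionary, the regular expression and the `extract_entities` function are hypothetical and are not taken from the Chatbot NER code base.

```python
import re

# Hypothetical toy data for this example; a real system would load far larger lists.
RESTAURANTS = ["Mainland China"]
DATE_PHRASES = ["day after tomorrow", "tomorrow", "today"]  # longest phrase first
TIME_PATTERN = re.compile(r"\b\d{1,2}:\d{2}\s?(?:am|pm)\b", re.IGNORECASE)


def extract_entities(message):
    """Return a list of (text, category) pairs found in the message."""
    found = []
    lowered = message.lower()
    # Dictionary lookup for restaurant names.
    for name in RESTAURANTS:
        if name.lower() in lowered:
            found.append((name, "restaurant"))
    # Phrase lookup for relative dates; stop at the first (longest) hit.
    for phrase in DATE_PHRASES:
        if phrase in lowered:
            found.append((phrase, "date"))
            break
    # Regular expression for clock times such as 6:00pm.
    for match in TIME_PATTERN.finditer(message):
        found.append((match.group(0), "time"))
    return found


message = "Remind me to call Mainland China day after tomorrow at 6:00pm"
print(extract_entities(message))
```

The sketch checks the longest date phrase first so that "day after tomorrow" is not reported as plain "tomorrow".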

 

Chatbot NER is a heuristic-based system that uses several NLP techniques to extract the necessary entities from a chat interface. A Chatbot needs to identify several entities, and each entity has to be distinguished by its type because different entity types have different detection logic. Following is a brief hierarchical representation of the entity classification which we have used in Chatbot NER:

 

 

[Figure: entity hierarchy]

 

 

We have classified entities into four main types: *numeral*, *pattern*, *temporal* and *textual*.

 

  • **numeral**: This type contains all the entities that deal with numerals or numbers. For example, number detection, budget detection, size detection, etc.
  • **pattern**: This type contains all the detection logic where identification can be done using patterns or regular expressions. For example, email, phone_number, PNR, etc.
  • **temporal**: This type contains the detection logic for time and date.
  • **textual**: This type identifies entities by looking them up in a dictionary. It mainly covers detection of text (like cuisine, dish, restaurants, etc.), names of cities, the location of a user, etc.
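
The four types above can be pictured as four kinds of detector behind a common interface. The following Python sketch is an assumption-laden illustration, not Chatbot NER's implementation: the function names, the regular expressions and the `CUISINES` dictionary are all hypothetical and deliberately simplified.

```python
import re

# Hypothetical dictionary for the textual type.
CUISINES = {"chinese", "italian", "mexican"}


def detect_numeral(message):
    """numeral: standalone numbers that are not part of a time or a word."""
    return re.findall(r"(?<![\w:.])\d+(?:\.\d+)?(?![\w:])", message)


def detect_pattern(message):
    """pattern: regular-expression matching, here a simplified email pattern."""
    return re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", message)


def detect_temporal(message):
    """temporal: clock times such as 7:30pm."""
    return re.findall(r"\b\d{1,2}:\d{2}\s?(?:am|pm)\b", message, flags=re.IGNORECASE)


def detect_textual(message):
    """textual: look each word up in a dictionary."""
    words = re.findall(r"[a-z]+", message.lower())
    return [word for word in words if word in CUISINES]


# One detector per entity type, so callers can treat all types uniformly.
DETECTORS = {
    "numeral": detect_numeral,
    "pattern": detect_pattern,
    "temporal": detect_temporal,
    "textual": detect_textual,
}


def detect_all(message):
    """Run every detector and collect the results by entity type."""
    return {entity_type: detector(message) for entity_type, detector in DETECTORS.items()}


sample = "Book an Italian table for 4, confirm at jane@example.com by 7:30pm"
print(detect_all(sample))
```

Keeping the detectors in one mapping mirrors the idea that each entity type carries its own detection logic while the caller uses them all in the same way.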

 

For an in-depth explanation of the approach, please have a look at our Approach Documentation.

 

Why use Chatbot NER?

 

  • There are a number of NER systems available, like Stanford NER, spaCy NER, etc., but none of them is designed specifically for building Chatbots. Existing NER systems require a lot of customization, and we have done it all for you in Chatbot NER.
  • We have already added a few entities like *restaurant names, cuisine, city list, time, date, etc.* Please have a look at our Built-in Entities Documentation.
  • Consistency and normalization in the output format across different entities reduce the engineering effort required to plug it into your system.
  • You can create or update entities merely by adding data. Have a look at the documentation on How to add your own entities?
  • Follow just a few steps to host it on a different server and run it as a separate service.
  • We are actively working in this area. If you have any doubts or issues while setting up or using our service, let us, Haptik, know and we will fix them as soon as we can 🙂
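
To illustrate what consistent, normalized output across entities can look like, here is a small Python sketch. The field names (`entity_type`, `original_text`, `value`) and both functions are assumptions made for this example; they are not Chatbot NER's actual output schema.

```python
import re
from datetime import date, time, timedelta

# Hypothetical mapping from relative date phrases to day offsets.
RELATIVE_DAYS = {"today": 0, "tomorrow": 1, "day after tomorrow": 2}


def normalize_date(text, reference):
    """Resolve a relative date phrase against a reference date."""
    resolved = reference + timedelta(days=RELATIVE_DAYS[text.lower()])
    return {"entity_type": "date", "original_text": text, "value": resolved.isoformat()}


def normalize_time(text):
    """Convert a 12-hour clock string such as 6:00pm to 24-hour HH:MM."""
    match = re.fullmatch(r"(\d{1,2}):(\d{2})\s?(am|pm)", text.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized time: {text!r}")
    hour, minute = int(match.group(1)), int(match.group(2))
    meridiem = match.group(3).lower()
    # 12am maps to hour 0 and 12pm stays at hour 12.
    hour = hour % 12 + (12 if meridiem == "pm" else 0)
    value = time(hour, minute).isoformat(timespec="minutes")
    return {"entity_type": "time", "original_text": text, "value": value}


print(normalize_date("day after tomorrow", date(2017, 6, 1)))
print(normalize_time("6:00pm"))
```

Because both results share the same keys, downstream code can handle a date and a time in the same way and only needs to interpret `value`.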

Installation Steps

 

Please have a look at our installation steps to install Chatbot NER on your system.

 

Conclusion

 

We hope that Chatbot NER helps you add more utilities to your bot and eases the process of detecting entities. We welcome feedback from users so that we can improve it. Currently, Chatbot NER is heuristic based and uses several NLP techniques to extract the necessary entities from a chat interface. We will soon release an upgraded version of this repository that integrates our ML models.


Let’s hope that this repository becomes a powerful resource and contributes to the research and engineering community.

 

For any questions or support related to this repository, do leave your comments below.

 

Also, don’t forget, we are hiring high-quality engineers. If you are interested, reach out to us at hello@haptik.ai