Building an NLP Engine to Answer Analytics Questions

Phan Truong Quynh Anh
Le Quynh Giang
Ngo Huu Tri
Nguyen Thi Nha Uyen

Our project, Holistics NLP, aims to give new users of Holistics BI (Business Intelligence) a more user-friendly way to interact with the system using plain English.


Holistics is a self-service Business Intelligence system that aims to provide an easy-to-use reporting tool for non-technical business users through a drag-and-drop interface. However, this interface still has a learning curve, as users need to familiarize themselves with Holistics concepts such as datasets, dimensions/measures, and Holistics Expression. Even for users who already know these BI concepts, converting a business question into the steps needed to get the data can be confusing and time-consuming. To bridge this gap between users and Holistics, our project provides an NLP engine that converts a user's natural language query into a SQL query that fetches the required data.


Our NLP system can process simple aggregations (count, group by, sum, average, etc.), multiple filters (ranges such as from... to..., =, <, >, etc.), and inner joins between two tables. For best results, we suggest using specific keywords that have been tested. The fetched results can be displayed as a table (the default), pie chart, bar chart, or line chart. A suggestion system also assists users as they type a query. To develop the system, we applied various NLP techniques to process the input and map keywords from the query to their SQL equivalents (table names, field names, aggregators, and filter conditions), including lemmatization, token chunking, regex/string matching, and word vectorization.
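The keyword-to-SQL mapping described above can be illustrated with a minimal sketch. This is not the project's implementation: the schema, the keyword tables, and the function names (`SCHEMA`, `AGGREGATORS`, `OPERATORS`, `match_name`, `to_sql`) are all hypothetical, and a naive plural check stands in for real lemmatization. It covers only regex tokenization and string matching; chunking, word vectors, joins, and range filters are left out.

```python
import re

# Hypothetical schema and keyword tables, invented for this illustration only.
SCHEMA = {"orders": ["amount", "country", "status"]}
AGGREGATORS = {"total": "SUM", "sum": "SUM", "average": "AVG", "count": "COUNT"}
OPERATORS = {"over": ">", "above": ">", "under": "<", "below": "<"}

NUMBER = r"\d+(?:\.\d+)?"


def match_name(token, candidates):
    """Return the candidate equal to the token, ignoring a trailing plural 's'.

    A crude stand-in for lemmatization; a real system would use a proper
    lemmatizer instead.
    """
    for name in candidates:
        if token in (name, name + "s") or token + "s" == name:
            return name
    return None


def to_sql(question):
    """Translate a simple English question into a SQL string."""
    # Tokenize into lowercase words and numbers with a regex.
    tokens = re.findall(rf"[a-z_]+|{NUMBER}", question.lower())

    # The first token that matches a known table name selects the table.
    table = next((t for t in (match_name(tok, SCHEMA) for tok in tokens) if t), None)
    if table is None:
        raise ValueError("no known table mentioned in the question")
    fields = SCHEMA[table]

    select, group_by, where = [], [], []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if tok in AGGREGATORS:
            # "total amount" -> SUM(amount); "count orders" -> COUNT(*)
            field = match_name(nxt, fields)
            select.append(f"{AGGREGATORS[tok]}({field or '*'})")
        elif tok == "by":
            # "by country" -> GROUP BY country
            field = match_name(nxt, fields)
            if field:
                group_by.append(field)
        elif tok in OPERATORS and i > 0 and re.fullmatch(NUMBER, nxt):
            # "amount over 100" -> amount > 100
            field = match_name(tokens[i - 1], fields)
            if field:
                where.append(f"{field} {OPERATORS[tok]} {nxt}")

    sql = f"SELECT {', '.join(group_by + select) or '*'} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql


print(to_sql("Total amount of orders by country where amount over 100"))
```

With these assumed tables, the example question maps "total amount" to an aggregation, "by country" to a grouping, and "amount over 100" to a filter. A rule-based mapping like this also suggests why a fixed set of tested keywords gives the most reliable results: words outside the keyword tables are simply ignored.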



Applying NLP to Holistics would reduce the struggle and confusion of creating queries manually, and would attract a new group of users who do not want to spend time getting used to the drag-and-drop interface, since business questions would be converted to queries automatically. It would also boost the productivity of Holistics' business customers by reducing the risk of interrupting their thought process while brainstorming or running several ad hoc queries in a row. Additionally, the NLP system could show examples of how a business question converts to Holistics' built-in tools, so users could learn Holistics without instruction from a data analyst, and companies could save resources through shorter training time for new users (a gentler learning curve). In particular, it would reduce the workload of Data Analyst users: they would not have to instruct new business users as often as they do now, so they could focus on their main job to be done (JTBD), which is maintaining the model layer (database layer).

 


Demo Video
