What is Topic Modelling?
What is the benefit of applying techniques from Natural Language Processing to unstructured textual data? What kind of conclusions can we expect to draw? How can a company get better insights into its customer experience and profitability?
Topic Modelling is one of the tools we use to analyse text data in a structured, ordered and quantifiable manner. At the beginning of the process, the analyst is faced with a mass of unorganised documents. Post-analysis, one can expect a structured list of topics, with detailed information about the frequency, association and sentiment of each. We can glean customer insights almost instantly.
Why is Topic Modelling important?
‘What are people talking about?’ is a question that we often want to answer with regard to social media data. ‘What are people saying about my brand?’, ‘What is driving customer satisfaction?’ and ‘How can I take action to improve profitability?’ are more specific questions which an analyst may answer, in part, by collecting and monitoring data: tweets, product reviews, satisfaction questionnaires. We like to add the follow-up question ‘How are they talking about it?’, which involves sentiment analysis.
However, as text proliferates, understanding the data in any meaningful way becomes much more difficult. It is hard to process written opinions and reviews numbering in the thousands or millions. It becomes tedious, time-consuming and in some cases impossible. A human analyst is left with one of two choices. The first is generalising from a smaller sample of the data. The second is using anecdotal evidence plucked from a small number of reviews. Both kinds of analysis are clearly open to human and sample bias. The alternative is automation, in the form of topic modelling.
How does Gavagai handle Topic Modelling?
Throughout industry and academia, statistical topic models are the norm. Usually, this means an LDA (Latent Dirichlet Allocation) based model. While these models are impressive in that they work in an unsupervised manner, we generally find their results to be naive, uninterpretable and impractical. The usual output for a topic in a standard LDA model is a list of words and associated probabilities. Interpreting and labelling the output of the model must be done by the analyst. This involves extra work, and the labelling is often a difficult task.
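To make the point about LDA output concrete, here is a minimal sketch of a toy LDA fitted with collapsed Gibbs sampling in plain Python. It is an illustration only, not a production implementation and not how Gavagai works. The function name `fit_lda`, the tiny `DOCS` corpus and the hyperparameter values are assumptions made up for this example.

```python
import random

# Hypothetical, pre-tokenised mini corpus (illustrative only).
DOCS = [
    ["delivery", "late", "package", "delivery", "slow"],
    ["brand", "logo", "design", "brand", "style"],
    ["package", "arrived", "late", "delivery"],
    ["love", "brand", "design", "logo"],
]

def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA.

    Returns one list per topic of (word, probability) pairs,
    sorted from most to least probable.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    word_id = {w: i for i, w in enumerate(vocab)}

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    # Smoothed per-topic word distributions: this is the "usual output".
    topics = []
    for k in range(num_topics):
        denom = topic_total[k] + vocab_size * beta
        dist = [(w, (topic_word[k][word_id[w]] + beta) / denom) for w in vocab]
        topics.append(sorted(dist, key=lambda pair: pair[1], reverse=True))
    return topics

for index, topic in enumerate(fit_lda(DOCS, num_topics=2)):
    top_words = ", ".join(f"{word} ({prob:.2f})" for word, prob in topic[:4])
    print(f"Topic {index}: {top_words}")
```

Note that the topics come back as numbered word lists with probabilities. Nothing in the output says "delivery" or "branding"; deciding what each list means, and naming it, is left to the analyst.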
Our Explorer Topic Models are fully customisable. We offer automated topic detection, powered by actual language understanding. The results of this detection are fully adaptable, according to the viewpoint of the analyst. With our sentiment analysis, the Explorer will rapidly give you data-driven insights. You will find out what is driving satisfaction and profit in your business.
What are people talking about in your customer reviews? Suppose 86% talk about the branding of the product, while only 10% mention the delivery speed. Coupled with sentiment analysis, almost instant consumer insights can be gleaned (perhaps the 10% who talk about delivery speed are strongly negative).
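The arithmetic behind a statement like this can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Gavagai's method: topics are reduced to hypothetical keyword sets, the reviews and their sentiment scores are invented, and `topic_report` is a name made up for this example.

```python
from collections import defaultdict

# Hypothetical topics, each reduced to a set of trigger keywords.
TOPICS = {
    "branding": {"brand", "logo", "packaging"},
    "delivery speed": {"delivery", "shipping", "arrived"},
}

# Hypothetical reviews: (text, sentiment score between -1 and 1).
REVIEWS = [
    ("love the brand and the logo", 0.8),
    ("packaging looks great", 0.6),
    ("delivery was slow and the packaging was damaged", -0.7),
    ("great value for money", 0.5),
]

def topic_report(reviews, topics):
    """Return, per topic, the share of reviews mentioning it and their mean sentiment."""
    scores = defaultdict(list)
    for text, sentiment in reviews:
        tokens = set(text.lower().split())
        for name, keywords in topics.items():
            # A review counts towards a topic if it contains any of its keywords.
            if tokens & keywords:
                scores[name].append(sentiment)
    total = len(reviews)
    return {
        name: {
            "share": len(scores[name]) / total,
            "mean_sentiment": (
                sum(scores[name]) / len(scores[name]) if scores[name] else None
            ),
        }
        for name in topics
    }

for name, stats in topic_report(REVIEWS, TOPICS).items():
    print(name, stats)
```

A review can count towards more than one topic, so the shares need not add up to 100%. Reading share and mean sentiment side by side is what surfaces cases like a rarely mentioned topic that is strongly negative.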
Gavagai has native topic modelling in Azerbaijani, Albanian, Arabic, Bengali, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Farsi, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Korean, Latvian, Lithuanian, Malay, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Thai, Turkish, Ukrainian, Urdu, and Vietnamese.