A Comparative Analysis of Topic Modeling Techniques for Short Text Data

Loknath Ambati, Dakota State University

Abstract

Massive amounts of short text, such as tweets, reviews, and social media posts, are now available on the internet. Many applications depend on analyzing such texts to characterize their content and extract insights from them. However, the limited number of words in a short text makes meaningful content analysis challenging. This study investigates topic modeling for short text data through a comparative analysis of several topic modeling techniques, with a focus on efficient topic extraction. From a theoretical perspective, the research will shed light on the strengths and weaknesses of these techniques, which can inform future research aimed at improving them for short text in various application domains. From a practical perspective, the research provides guidance on the applicability of topic modeling to short text data.

 
Feb 8th, 1:00 PM


Volstorff A
