Understanding the reasons behind the emotions expressed on social media plays a key role in learning to characterize the mood of previously unseen texts. The ability to classify mood makes this technology useful in a variety of fields. In this study, Latent Dirichlet Allocation (LDA), a topic modeling algorithm, was used to determine which emotions tweets on Twitter express. The dataset consists of 4000 tweets categorized into 5 emotions: anger, fear, happiness, sadness, and surprise. Three root extraction methods were used to create the models: Zemberek, Snowball, and taking the first 5 letters of each word. The generated models were tested using the proposed n-stage LDA method, which aims to increase the model's success rate by decreasing the number of words in the dictionary. With multi-stage LDA, we achieved higher success rates (2 stages: 70.5%, 3 stages: 76.375%) than the state-of-the-art result (60.375%), which was obtained using plain LDA for 5 classes.
Keywords: Topic Modeling, Latent Dirichlet Allocation, Natural Language Processing, Emotion Analysis
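The abstract names two ideas that a short example may make more concrete: root extraction by keeping the first 5 letters of a word, and shrinking the dictionary between successive model fits. The abstract does not state how words are selected for removal, so the following is only a minimal sketch under an assumed rule (keep the highest-weighted words of each topic). All function names are hypothetical, the example words are English placeholders, and `frequency_topics` is a stand-in for a real LDA fit, not LDA itself.

```python
from collections import Counter
from typing import Callable, Dict, List, Set, Tuple

Tokens = List[str]
TopicWeights = List[Dict[str, float]]  # one word -> weight map per topic


def first_n_letters_root(word: str, n: int = 5) -> str:
    """Approximate a word's root by keeping only its first n letters."""
    return word.lower()[:n]


def prune_vocabulary(topics: TopicWeights, keep_per_topic: int) -> Set[str]:
    """Assumed pruning rule: keep the top-weighted words of every topic."""
    kept: Set[str] = set()
    for weights in topics:
        ranked = sorted(weights, key=lambda w: weights[w], reverse=True)
        kept.update(ranked[:keep_per_topic])
    return kept


def n_stage_topics(
    docs: List[Tokens],
    fit_topics: Callable[[List[Tokens]], TopicWeights],
    stages: int,
    keep_per_topic: int,
) -> Tuple[List[Tokens], TopicWeights]:
    """Fit `stages` times, shrinking the dictionary before each refit."""
    topics = fit_topics(docs)
    for _ in range(stages - 1):
        vocab = prune_vocabulary(topics, keep_per_topic)
        docs = [[tok for tok in doc if tok in vocab] for doc in docs]
        topics = fit_topics(docs)
    return docs, topics


def frequency_topics(docs: List[Tokens]) -> TopicWeights:
    """Stand-in for an LDA fit: a single 'topic' of relative word frequencies."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values()) or 1
    return [{word: c / total for word, c in counts.items()}]


# Illustrative input only; the study's tweets are not reproduced here.
tweets = ["happiness everywhere today", "happily surprised today", "today fearful"]
docs = [[first_n_letters_root(w) for w in t.split()] for t in tweets]
pruned_docs, final_topics = n_stage_topics(
    docs, frequency_topics, stages=2, keep_per_topic=2
)
```

In practice `fit_topics` would wrap an actual LDA implementation that returns per-topic word weights; the staged loop itself does not depend on which one is used.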
Primary Language | English |
---|---|
Subjects | Engineering |
Section | Articles |
Authors | |
Publication Date | 28 September 2019 |
Submission Date | 12 September 2018 |
Published in Issue | Year 2019 Volume: 7 Issue: 3 |