Pengaruh Metode Penyeimbangan Kelas Terhadap Tingkat Akurasi Analisis Sentimen pada Tweets Berbahasa Indonesia

Husada, Ivan Nathaniel and Toba, Hapnes (2020) Pengaruh Metode Penyeimbangan Kelas Terhadap Tingkat Akurasi Analisis Sentimen pada Tweets Berbahasa Indonesia. Jurnal Teknik Informatika dan Sistem Informasi, 6 (2). pp. 400-413. ISSN 2443-2229


Abstract

Internet access is becoming increasingly easy to obtain, and as a result almost all internet users have social media accounts. Users rely on social media to voice opinions, to make complaints, and to discuss topics with other users. Among the existing platforms, Twitter is one of the most popular for these activities, and this user activity makes sentiment analysis on Twitter possible. In this research, the authors explore sentiment analysis of tweets from Indonesian Twitter users using bag-of-words and Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction. The data obtained are imbalanced, so a method is required to address the imbalance. The method used is a resampling approach that combines over-sampling and under-sampling strategies. The accuracy of sentiment analysis with Naïve Bayes and neural networks is compared before and after resampling the input data. The Naïve Bayes methods used are Multinomial Naïve Bayes and Complement Naïve Bayes, while the neural network architectures used for comparison are Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Convolutional Neural Networks, and a combination of Convolutional Neural Networks and Long Short-Term Memory. The experiments yield the following F1 scores (the harmonic mean of precision and recall) for the sentiment analysis models: Multinomial Naïve Bayes 55.48, Complement Naïve Bayes 51.33, Recurrent Neural Network 75.70, Long Short-Term Memory 78.36, Gated Recurrent Unit 77.96, Convolutional Neural Network 76.12, and the combination of Convolutional Neural Networks and Long Short-Term Memory 81.14.
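The abstract does not spell out the resampling procedure, so the following is only a minimal sketch of one way to combine over-sampling and under-sampling, using the Python standard library. The helper name `resample_to_target`, the default target (the rounded mean class size), and the toy tweets are all assumptions made for illustration. They are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def resample_to_target(texts, labels, target=None, seed=0):
    """Resample so that every class ends up with exactly `target` items.

    Hypothetical helper: `target` defaults to the rounded mean class size,
    which is an assumption and not a setting reported in the paper.
    """
    rng = random.Random(seed)

    # Group the texts by their sentiment label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    if target is None:
        target = round(len(texts) / len(by_class))

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Under-sampling: keep a random subset, without replacement.
            chosen = rng.sample(items, target)
        else:
            # Over-sampling: keep every original, then add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_texts.extend(chosen)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Toy, made-up examples (not from the paper's dataset).
texts = ["bagus", "mantap", "keren", "suka", "oke", "top",
         "jelek", "buruk", "biasa"]
labels = ["pos"] * 6 + ["neg"] * 2 + ["neu"]

new_texts, new_labels = resample_to_target(texts, labels)
print(Counter(labels))      # class counts before resampling
print(Counter(new_labels))  # class counts after resampling
```

With nine items in three classes, the default target is three, so the majority class is cut down and the two minority classes are padded with duplicates. In practice, resampling of this kind is usually applied to the training split only, so that the evaluation data keep their original class distribution.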

Item Type: Article

Contributors: