Bunyamin, Hendra and Heriyanto, Heriyanto and Novianti, Stevani and Sulistiani, Lisan (2019) Topic Clustering and Classification on Final Project Reports: a Comparison of Traditional and Modern Approaches. IAENG International Journal of Computer Science, 46 (3). pp. 506-511. ISSN 1819-656X
Abstract
Text clustering and classification have been studied extensively in the machine learning literature. For clustering text, topic modeling algorithms are statistical methods that discover hidden structure in archives of documents. Equally important, Convolutional Neural Networks (ConvNets) have been successfully applied to text classification without requiring information about the syntactic and semantic aspects of a language. In this paper, we use both clustering and classification algorithms to organize and classify topics from final project reports. In the clustering task, we examine two techniques: Latent Dirichlet Allocation (LDA) functioning as a unigram model, and LDA supported by a Skip-gram model. Our results show that the topical word distributions found by both techniques truly represent keywords from every topic; in particular, the Skip-gram model working hand in hand with LDA is suitable for acquiring topical words from the final report topics. For the classification task, we analyze the application of ConvNets, artificial neural networks with ReLU activation functions, and traditional algorithms. Concretely, our findings suggest that selecting the parts of a report that contain essential information is very important for ConvNets to learn. Additionally, traditional algorithms are preferable to neural network-based algorithms if the size of the dataset is less than 20,000; as a result, our traditional algorithms, specifically the Ridge classifier, Passive-Aggressive, and Support Vector Machines, significantly outperform the neural network-based algorithms.
Item Type: | Article
---|---
Uncontrolled Keywords: | convolutional neural networks, deep learning, final project report, latent dirichlet allocation, machine learning, skipgram model, text classification, topic mode
Subjects: | T Technology > T Technology (General)
Divisions: | Faculty of Information Technology > 72 Information Technology Department
Depositing User: | Perpustakaan Maranatha
Date Deposited: | 28 Mar 2025 10:12
Last Modified: | 28 Mar 2025 10:12
URI: | http://repository.maranatha.edu/id/eprint/33648