Topic Clustering and Classification on Final Project Reports: a Comparison of Traditional and Modern Approaches

Bunyamin, Hendra and Heriyanto, Heriyanto and Novianti, Stevani and Sulistiani, Lisan (2019) Topic Clustering and Classification on Final Project Reports: a Comparison of Traditional and Modern Approaches. IAENG International Journal of Computer Science, 46 (3). pp. 506-511. ISSN 1819-656X

Official URL: https://www.iaeng.org/IJCS/issues_v46/issue_3/

Abstract

Text clustering and classification have been studied extensively in the machine learning literature. For clustering text, topic modeling algorithms are statistical methods that discover hidden structures in archives of documents. Equally important, Convolutional Neural Networks (ConvNets) have been successfully applied to classifying text without any knowledge of the syntactic and semantic aspects of a language. In this paper, we use both clustering and classification algorithms to organize and classify topics from final project reports. In the clustering task, we examine two techniques: Latent Dirichlet Allocation (LDA) functioning as a unigram model, and LDA supported by a Skip-gram model. Our results show that the topical word distributions found by both techniques truly represent keywords from every topic; in particular, a Skip-gram model working hand in hand with LDA is suitable for acquiring topical words from the final report topics. For our classification task, we analyze the application of ConvNets, artificial neural nets with ReLU activation functions, and traditional algorithms. Concretely, our findings suggest that selecting the parts of a report that contain essential information is very important for ConvNets to learn. Additionally, traditional algorithms are preferable to neural-net-based algorithms if the dataset size is less than 20,000; as a result, our traditional algorithms, specifically Ridge classifier, Passive-Aggressive, and Support Vector Machines, outperform neural-net-based algorithms significantly.
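
The abstract names Passive-Aggressive among the traditional classifiers that performed well on a small dataset. As a rough illustration of why such linear models are cheap to train on little data, the following is a minimal sketch of the standard PA-I update rule on bag-of-words features. It is not the authors' implementation: the function names (`featurize`, `train_pa`, `predict`), the whitespace tokenizer, the hyperparameters, and the toy report titles are all assumptions made for this example, and the paper's actual features and settings are not given in this record.

```python
from collections import Counter

# Hypothetical toy data: report titles labeled +1 (machine learning topic)
# or -1 (information systems topic). Not taken from the paper's dataset.
TOY_EXAMPLES = [
    ("neural network image recognition", 1),
    ("convolutional network for image classification", 1),
    ("inventory information system for retail store", -1),
    ("web based payroll information system", -1),
]

def featurize(text):
    """Bag-of-words term counts using a simple whitespace tokenizer."""
    return Counter(text.lower().split())

def score(weights, features):
    """Dot product between the sparse weight vector and the features."""
    return sum(weights.get(term, 0.0) * count for term, count in features.items())

def train_pa(examples, epochs=5, C=1.0):
    """Binary Passive-Aggressive (PA-I) training with labels in {-1, +1}."""
    weights = {}
    for _ in range(epochs):
        for text, label in examples:
            features = featurize(text)
            # Hinge loss: zero when the example is classified with margin >= 1.
            loss = max(0.0, 1.0 - label * score(weights, features))
            if loss > 0.0:
                norm_sq = sum(count * count for count in features.values())
                # PA-I step size: just large enough to fix the margin, capped at C.
                tau = min(C, loss / norm_sq)
                for term, count in features.items():
                    weights[term] = weights.get(term, 0.0) + tau * label * count
    return weights

def predict(weights, text):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if score(weights, featurize(text)) >= 0.0 else -1

weights = train_pa(TOY_EXAMPLES)
print(predict(weights, "image recognition with neural network"))
print(predict(weights, "payroll system"))
```

The model is "passive" when an example already has a margin of at least 1 and "aggressive" otherwise, updating only the weights of terms that occur in the misclassified document. That keeps the number of effective parameters small, which is one plausible reason such models cope well with datasets of the size the abstract describes.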

Item Type: Article
Contributors:
Contribution | Contributors | NIDN/NIDK | Email
Author | Bunyamin, Hendra | UNSPECIFIED | UNSPECIFIED
Author | Heriyanto, Heriyanto | UNSPECIFIED | UNSPECIFIED
Author | Novianti, Stevani | UNSPECIFIED | UNSPECIFIED
Author | Sulistiani, Lisan | UNSPECIFIED | UNSPECIFIED
Uncontrolled Keywords: convolutional neural networks, deep learning, final project report, latent dirichlet allocation, machine learning, skipgram model, text classification, topic modeling
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Information Technology > 72 Information Technology Department
Depositing User: Perpustakaan Maranatha
Date Deposited: 28 Mar 2025 10:12
Last Modified: 28 Mar 2025 10:12
URI: http://repository.maranatha.edu/id/eprint/33648
