Utilizing Indonesian Universal Language Model Fine-tuning for Text Classification

Bunyamin, Hendra (2020) Utilizing Indonesian Universal Language Model Fine-tuning for Text Classification. Journal of Information Technology and Computer Science, 5 (3). pp. 325-337. ISSN 2540-9824

Abstract

Inductive transfer learning has had a major impact on computer vision. Computer vision applications, including object recognition, segmentation, and classification, are seldom trained from scratch; instead, they are fine-tuned from pretrained models that were learned from very large datasets. In contrast, state-of-the-art natural language processing models are still generally trained from the ground up. This research therefore investigates the adoption of transfer learning for natural language processing. Specifically, we apply a transfer learning technique named Universal Language Model Fine-tuning (ULMFiT) to an Indonesian news text classification task. The dataset for constructing the language model was collected from several news providers from January to December 2017, whereas the dataset for the text classification task comes from news articles provided by the Agency for the Assessment and Application of Technology (BPPT). To examine the impact of ULMFiT, we provide a baseline: a vanilla neural network with two hidden layers. Although ULMFiT's accuracy on the validation set is lower than that of our baseline, we find that ULMFiT significantly reduces overfitting in the classification task: the difference between training and validation accuracy drops from 4% to nearly zero.
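
The overfitting measure used in the abstract, the difference between training and validation accuracy, can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names `accuracy` and `overfitting_gap`, the category labels, and the toy predictions are all hypothetical and are not taken from the paper's experiments.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that equal their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def overfitting_gap(train_accuracy, validation_accuracy):
    """Training accuracy minus validation accuracy.

    A large positive gap suggests the model fits the training set
    better than it generalizes; a gap near zero suggests little overfitting.
    """
    return train_accuracy - validation_accuracy


# Toy, made-up labels and predictions (NOT results from the paper).
train_labels = ["sports", "economy", "health", "sports"]
train_preds = ["sports", "economy", "health", "sports"]
val_labels = ["economy", "health", "sports", "economy"]
val_preds = ["economy", "sports", "sports", "economy"]

train_acc = accuracy(train_preds, train_labels)
val_acc = accuracy(val_preds, val_labels)
print(f"train={train_acc:.2f} val={val_acc:.2f} "
      f"gap={overfitting_gap(train_acc, val_acc):.2f}")
```

Under this definition, a model with a lower validation accuracy can still have the smaller gap, which is the trade-off the abstract reports for ULMFiT against the baseline.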

Item Type: Article
Contributors:
Contribution: Author
Contributor: Bunyamin, Hendra
NIDN/NIDK: UNSPECIFIED
Email: UNSPECIFIED
Uncontrolled Keywords: Learning, model, classification, training
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Information Technology > 72 Information Technology Department
Depositing User: Perpustakaan Maranatha
Date Deposited: 28 Mar 2025 10:33
Last Modified: 28 Mar 2025 10:33
URI: http://repository.maranatha.edu/id/eprint/33653
