Pakistan Science Abstracts
The accuracy comparison among Word2Vec, GloVe, and FastText towards convolution neural network (CNN) text classification
Author(s):
1. FORD LUMBAN GAOL: Computer Science Department, BINUS Graduate Program – Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
2. HARCO LESLIE HENDRIC SPITS WARNARS: Computer Science Department, BINUS Graduate Program – Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
3. BENFANO SOEWITO: Computer Science Department, BINUS Graduate Program – Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
4. EDDY MUNTINA DHARMA: Computer Science Department, BINUS Graduate Program – Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
Abstract:
Feature extraction in text processing, or natural language processing (NLP), poses particular challenges because text is unstructured, so the choice of feature extraction method can affect classification performance. This study compares the accuracy of three word embedding methods, Word2Vec, GloVe, and FastText, for text classification with a convolutional neural network (CNN). These three methods were chosen because they can capture semantic and syntactic information, sequences, and even the context around words. Their accuracy was compared on news classification using a data set from the UCI KDD Archive that contains 19,977 news stories grouped into 20 news topics. The results show that word embedding with FastText achieves the highest accuracy in the classification process. However, the differences in accuracy among the three methods are not substantial, so the authors conclude that the choice of method depends on the data set at hand.
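As a rough illustration of the kind of pipeline the abstract describes (word embeddings fed into a convolutional text classifier), the following minimal sketch looks up token vectors, slides a single convolution filter over them, and applies max-over-time pooling. It is not the authors' implementation: the embedding table `TOY_EMBEDDINGS`, the helper names `embed` and `conv1d_max_pool`, and all numeric values are hypothetical and chosen only for illustration. In a setup like the one studied, the table would presumably be replaced by vectors from Word2Vec, GloVe, or FastText.

```python
from typing import Dict, List

Vector = List[float]

# Hypothetical 2-dimensional embedding table (toy values, not trained vectors).
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "team": [1.0, 0.0],
    "wins": [0.5, 0.5],
    "game": [1.0, 1.0],
}
UNKNOWN: Vector = [0.0, 0.0]  # assumption: out-of-vocabulary tokens map to zeros


def embed(tokens: List[str], table: Dict[str, Vector]) -> List[Vector]:
    """Map each token to its vector; unknown tokens get the zero vector."""
    return [table.get(token, UNKNOWN) for token in tokens]


def conv1d_max_pool(sequence: List[Vector], kernel: List[Vector]) -> float:
    """Slide one filter over the sequence, apply ReLU, and keep the maximum."""
    width = len(kernel)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the kernel width")
    activations = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Dot product between the filter and the current window of word vectors.
        total = sum(
            weight * value
            for kernel_vec, word_vec in zip(kernel, window)
            for weight, value in zip(kernel_vec, word_vec)
        )
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # max-over-time pooling


if __name__ == "__main__":
    vectors = embed(["team", "wins", "game"], TOY_EMBEDDINGS)
    feature = conv1d_max_pool(vectors, [[1.0, -1.0], [0.5, 0.5]])
    print(feature)
```

A full classifier would use many such filters and pass the pooled features to a softmax layer over the 20 topics. Comparing embedding methods then amounts to swapping the lookup table while keeping the rest of the network fixed.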
Page(s): 349-359
DOI: not available
Published: Journal: Journal of Theoretical and Applied Information Technology, Volume: 100, Issue: 2, Year: 2022
Keywords:
Word embedding, FastText, Text classification, Word2Vec, Convolution neural network, GloVe