Pakistan Science Abstracts
A hybrid word embedding model based on admixture of Poisson-Gamma Latent Dirichlet Allocation model and distributed word-document-topic representation
Author(s):
1. IBRAHIM BAKARI BALA: Universal Basic Education Commission, Wuse Zone 4, Abuja, Nigeria; Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja 86400, Johor, Malaysia
2. MOHD ZAINURI SARINGAT: Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja 86400, Johor, Malaysia
3. AIDA MUSTAPHA: Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja 86400, Johor, Malaysia
Abstract:
This paper proposes a hybrid Poisson-Gamma Latent Dirichlet Allocation (PGLDA) model that captures word dependencies in order to support the semantic representation of words. The new model keeps complexity manageable by using LDA as the baseline model while also adequately capturing the contextual correlation between words. The Poisson document-length distribution is replaced with a Poisson-Gamma admixture to model word correlation when a hub word connects words and topics. Furthermore, the distributed representations of documents (Doc2Vec) and topics (Topic2Vec) are averaged to form new word-representation vectors, which are then combined with the topics of largest likelihood from PGLDA. Model estimation was achieved by combining the Laplacian approximation of the log-likelihood for PGLDA with the Feed-Forward Neural Network (FFN) approaches of Doc2Vec and Topic2Vec. The proposed hybrid method was evaluated for precision, recall, and F1 score on the 20 Newsgroups and AG's News datasets. Comparative analysis of F1 scores showed that the proposed hybrid model outperformed the other methods.
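The vector-averaging step and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `average_vectors` and `precision_recall_f1` are hypothetical, the vectors are toy values rather than trained Doc2Vec or Topic2Vec embeddings, and the abstract does not say how the averaged vectors are combined with the PGLDA topics, so that step is omitted.

```python
from typing import List, Sequence, Tuple


def average_vectors(doc_vec: Sequence[float], topic_vec: Sequence[float]) -> List[float]:
    """Element-wise mean of a document vector and a topic vector.

    Assumption: both vectors live in the same embedding space and
    have the same dimensionality.
    """
    if len(doc_vec) != len(topic_vec):
        raise ValueError("vectors must have the same dimensionality")
    return [(d + t) / 2.0 for d, t in zip(doc_vec, topic_vec)]


def precision_recall_f1(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy two-dimensional vectors, purely illustrative.
    print(average_vectors([1.0, 3.0], [3.0, 5.0]))
    # Toy labels, not drawn from 20 Newsgroups or AG's News.
    print(precision_recall_f1(["a", "a", "b", "b"], ["a", "b", "a", "b"], "a"))
```

For a multi-class dataset such as 20 Newsgroups, the per-class scores would still need to be aggregated (for example by macro-averaging); the abstract does not state which aggregation was used.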
Page(s): 1446-1456
DOI: not available
Published: Journal: Journal of Theoretical and Applied Information Technology, Volume: 98, Issue: 9, Year: 2020
Keywords:
Word2vec, Topic2Vec, Doc2vec, Poisson-Gamma Distribution