Abstract:
COVID-19 has become one of the most widely discussed topics of recent times. Guided by international recommendations, countries have taken many measures to prevent the spread of the virus, which has led to considerable debate about wearing a face mask as a preventive measure. This study assesses and compares the overall accuracy, macro precision, macro recall, and macro F-measure of different decision models applied to sentiment analysis of COVID-19 mask-wearing practices. Tweets are labeled, and text pre-processing is applied: stemming, normalization, tokenization, and stop-word removal. The tweets are then transformed into master feature vectors using various feature extraction, feature representation, feature selection, and word embedding techniques, and five supervised machine learning decision models are used to predict mask-wearing practices as expressed in tweets. The highest macro F-measure and macro precision are obtained with hybrid-grams for feature extraction, TF-IDF for feature representation, and the Chi-Squared Test for feature selection; the highest macro recall is obtained with BOW for feature extraction, TF-IDF for feature representation, and the ANOVA F-value for feature selection. The study concludes that the Naive Bayes (NB) algorithm, applied to master feature vectors, outperforms the other decision models and also outperforms the word embedding techniques.
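The macro-averaged measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the common convention in which precision, recall, and F-measure are computed per class and then averaged with equal weight per class; the abstract does not state which convention the study uses. The function name `macro_scores` and the three sentiment labels in the example are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro_precision, macro_recall, macro_f1).

    Assumption: macro F-measure is the unweighted mean of per-class F1
    scores. A class with no predictions (or no true samples) contributes
    0.0 for the undefined ratio.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical labels for six tweets (illustration only).
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]
accuracy, macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={macro_p:.4f} "
      f"recall={macro_r:.4f} f1={macro_f1:.4f}")
```

Because each class carries equal weight, macro averaging lets a poorly predicted minority sentiment class lower the score as much as a majority class would, which is one reason it is often preferred over plain accuracy when classes are imbalanced.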
Page(s):
116-126
Published:
Journal: Quaid-e-Awam University Research Journal of Engineering, Science and Technology, Volume: 18, Issue: 2, Year: 2020
Keywords:
machine learning, Sentiment analysis, natural language processing, supervised decision models, feature engineering, maskwearing practices, COVID19 pandemic