Pakistan Science Abstracts
Article details & metrics
Urdu news content classification using machine learning algorithms
Author(s):
1. Umair Arshad: Department of Computer Science, University of Lahore Sargodha Campus, Sargodha, Pakistan
2. Khawar Iqbal Malik: Department of Computer Science, University of Lahore Sargodha Campus, Sargodha, Pakistan
3. Hira Arooj: Department of Mathematics and Statistics, University of Lahore Sargodha Campus, Sargodha, Pakistan
Abstract:
The world has become a global village, and the flow of news has increased in both volume and speed. Engaging computing machines to assist people in dealing with this massive data is necessary. The Internet serves as a source of many types of information for billions of users. Millions of people in our subcontinent speak and understand Urdu. Several classification techniques are available and have been applied to classify English news into categories such as politics, education, and medicine, and the literature shows that plenty of research has been done in multiple languages. However, due to a lack of resources, Urdu has received comparatively little attention. This research evaluates the performance of twelve (12) machine learning classifiers for the Urdu news text classification problem. The analysis was performed on a relatively extensive and recent Urdu text collection containing over 0.15 million (153,050) labelled instances of eight different classes. In addition, the TF-IDF weighting technique was adopted after applying pre-processing techniques for feature selection and data extraction. After evaluating various machine learning methods, the SVM outperforms the other eleven algorithms with an accuracy of 91.37%. We also compare its results with different classifiers like linear SVM, logistic regression, SGD, Naïve Bayes, ridge regression, and others.
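The abstract names TF-IDF weighting but does not say which variant the authors used. As a rough illustration only, the sketch below computes the plain textbook form, term frequency multiplied by log(N / document frequency), over a hypothetical toy corpus of already tokenized documents. The function name `tf_idf` and the English placeholder tokens are assumptions made for this example and are not taken from the paper; library implementations often use smoothed or normalized variants that give different numbers.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses the unsmoothed textbook form: tf = count / document length,
    idf = log(N / df). This is an assumed variant, not the paper's.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus; real input would be pre-processed Urdu tokens.
docs = [
    ["cricket", "match", "team"],
    ["election", "vote", "team"],
    ["cricket", "score", "match"],
]
weights = tf_idf(docs)
```

In this toy corpus, "election" appears in only one document, so it receives a higher weight in the second document than "team", which appears in two. Vectors of this kind are what a classifier such as an SVM would take as input.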
Page(s): 13-20
Published: Journal: Lahore Garrison University Research Journal of Computer Science and Information Technology, Volume: 6, Issue: 1, Year: 2022
Keywords:
Machine learning, TFIDF, Urdu News classification