Pakistan Science Abstracts
Article details & metrics
A Basic Parts of Speech (POS) Tagset for morphological, syntactic and lexical annotations of Saraiki language
Author(s):
1. Farrukh Javed Saleemi: Institute of Southern Punjab Multan, Pakistan
2. Muhammad Nabeel Asghar: Department of Computer Science, Bahauddin Zakariya University Multan, Pakistan
3. Sajid Iqbal: Department of Computer Science, Bahauddin Zakariya University Multan, Pakistan
4. Muhammad Umar Chaudhry: Ai-Hawks, Multan, Pakistan
5. Muhammad Yasir: Department of Computer Science, University of Engineering and Technology Lahore, Faisalabad Campus, Pakistan
6. Sibghat Ullah Bazai: Cyber Security Lab, School of Natural and Computational Sciences, Massey University, Auckland, New Zealand
7. Muhammad Qasim Khan: SKKU, South Korea
Abstract:
Annotated text corpora are an important resource for Natural Language Processing (NLP) applications such as machine translation, information retrieval, and text mining. Corpus annotation requires parts of speech (POS) tags, which mark parts of a text with grammatical annotations in order to identify the linguistic properties of a word, sentence, or discourse. The marking of text items is based on two main features: 1) grammatical category and 2) the context of the text (word, sentence, or discourse), i.e. its relationship with adjacent and related text. Saraiki, although one of the oldest languages, remains resource-scarce both in recorded literature and in computational contexts. According to our study, no tagset has yet been defined for the Saraiki language. This work presents the first hierarchical POS (MPOST) tag set for the Saraiki language, designed for use in morphological, syntactic, and lexical annotation of Saraiki language corpora.
Page(s): 77-84
Published: Journal: Journal of Applied and Emerging Sciences, Volume: 11, Issue: 1, Year: 2021
Keywords:
Saraiki, Tagging, Corpora, Parts of Speech (POS), Tag set