This study has developed a novel Sindhi corpus and performed supervised classification on the annotated corpus to assess how accurately traditional machine learning approaches solve NLP problems for the Sindhi language. The supervised machine learning methods are evaluated with 10-fold cross validation. The Sindhi annotated corpus is segmented into an 80% training dataset and a 20% test dataset, and the machine is trained on the 80% training dataset. Each fold of cross validation partitions the corpus into subsets, analyzing the training set and validating on the test set. All cross-validation partitioning is done randomly. Based on the obtained results, the study observes that the RF machine learning method performs better than the non-linear SVM.
[2] “Language Technology Tools and Resources for a Resource-Poor Language: Sindhi”, pp. 51-58, 2016.
[3] Mahar, J.A., G.Q., (Science Series), 1, pp. 43-47, 2012.
[4] Mahar, J.A., and Memon, G.Q., “Rule Based Part of
Speech Tagging of Sindhi Language”, IEEE International
Conference on Signal Acquisition and Processing,
pp. 101-106, 2010.
[5] Mahar, J.A., Shaikh, H., and Memon, G.Q., “A Model
for Sindhi Text Segmentation into Word Tokens”, Sindh
University Research Journal (Science Series),
Volume 44, No. 1, pp. 43-47, Jamshoro, Pakistan.
[6] Mahar, J.A., and Memon, G.Q., “Sindhi Part of Speech
Tagging System using WordNet”, International Journal
of Computer Theory and Engineering, Volume 2, No. 4,
pp. 538, 2010.
[7] Dootio, M.A., and Wagan, A.I., “Syntactic Parsing and
Supervised Analysis of Sindhi Text”, Journal of King
Saud University – Computer and Information Sciences,
[DOI:10.1016/j.jksuci.2017.10.004].
[8] Motlani, R., Lalwani, H., Shrivastava, M., and Sharma,
D.M., “Developing Part-of-Speech Tagger for a Resource
Poor Language: Sindhi”, Proceedings of 7th Conference
on Language and Technology, Poznan, Poland.
[9] Motlani, R., Tyers, F.M., and Sharma, D.M., “A Finite-
State Morphological Analyzer for Sindhi”, Proceedings
of 10th International Conference on Language Resources
and Evaluation, 2016.
[10] Siraj, “Sindhi Boli”, 2nd Edition, Sindhi Language
Authority, Hyderabad, Sindh, Pakistan.
[11] Bag, M.K., “Sindhi Vyakaran”, Sindhi Adabi Board,
Jamshoro, Sindh, Pakistan, 2015.
[12] Bag, M.K., “Sindhi Vyakaran”, Sindhi Adabi Board,
Jamshoro, Sindh, Pakistan, 2015.
[13] Taylor, A., Marcus, M., and Santorini, B., “The Penn
Treebank: An Overview”, Treebanks, pp. 5-22, Springer,
Dordrecht, 2003.
[14] Sarker, A., and Gonzalez, G., “Portable Automatic Text
Classification for Adverse Drug Reaction Detection via
Multi-Corpus Training”, Journal of Biomedical
Informatics, Volume 53, pp. 196-207.
[15] Onan, A., Korukoğlu, S., and Bulut, H., “Ensemble of
Keyword Extraction Methods and Classifiers in Text
Classification”, Expert Systems with Applications,
Volume 57, pp. 232-247, 2016.