Abstract:
Automatic speaker identification (ASI) remains a challenging problem for researchers. ASI is the process of recognizing a speaker automatically by comparing a voice sample with that speaker's previously recorded voice samples. Machine learning approaches have gained popularity for ASI in recent years, including Convolutional Neural Networks (CNN) [14,15,16], Deep Neural Networks (DNN) [10,11,12,13], and Artificial Neural Networks (ANN) [17,18]. This research aims to build an automatic speaker identification system for Assamese, a low-resource language spoken in the North-Eastern part of India. To date, cosine similarity and parallel processing have not been used for speaker identification in the Assamese language, which constitutes the novelty of the current work. In the first phase, the model uses Mel-frequency cepstral coefficients (MFCC) to extract important features of speakers' voices and build a training sample set. The speaker's absolute MFCC feature vectors are used directly, without any averaging, in order to retain and exploit the speaker's unique characteristics. In the second phase, the features in the training sample set are compared with real-time test voice samples using cosine similarity to identify the speaker automatically. Parallel processing is used to compare all the coefficients in the test voice sample with those in the training voice samples, which makes the system faster. The effectiveness of the proposed method is evaluated in terms of precision, recall, F1 score, and accuracy. The model achieved an accuracy of 91% for speaker identification in the Assamese language.
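The second phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the scoring rule (for each test frame, take the best-matching training frame, then average), the use of a thread pool for the parallel comparison, and the synthetic 13-dimensional vectors standing in for real MFCC frames are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def cosine_similarity_matrix(a, b, eps=1e-12):
    """Pairwise cosine similarity between rows of a (m, d) and rows of b (n, d)."""
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a_unit @ b_unit.T


def speaker_score(test_frames, train_frames):
    """Assumed scoring rule: best training-frame match per test frame, averaged."""
    sims = cosine_similarity_matrix(test_frames, train_frames)
    return float(sims.max(axis=1).mean())


def identify_speaker(test_frames, enrolled, max_workers=4):
    """Score the test sample against every enrolled speaker in parallel.

    enrolled maps a speaker name to an array of per-frame feature vectors
    (kept frame by frame, with no averaging across frames).
    """
    names = list(enrolled)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(
            pool.map(lambda name: speaker_score(test_frames, enrolled[name]), names)
        )
    best = int(np.argmax(scores))
    return names[best], dict(zip(names, scores))


# Synthetic stand-ins for MFCC frames (hypothetical data, 13 coefficients each).
rng = np.random.default_rng(0)
centers = {
    "speaker_a": np.linspace(1.0, 13.0, 13),
    "speaker_b": np.linspace(13.0, 1.0, 13),
}
enrolled = {
    name: center + 0.1 * rng.normal(size=(50, 13))
    for name, center in centers.items()
}
test_frames = centers["speaker_b"] + 0.1 * rng.normal(size=(20, 13))

predicted, scores = identify_speaker(test_frames, enrolled)
print(predicted, scores)
```

In practice the frame matrices would come from an MFCC extractor applied to recorded speech, and the number of workers would be tuned to the hardware. The thread pool here only shows one way the per-speaker comparisons could run concurrently.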
Page(s):
6552-6560
DOI:
DOI not available
Published:
Journal: Journal of Theoretical and Applied Information Technology, Volume: 100, Issue: 21, Year: 2022
Keywords:
Assamese, Cosine Similarity, Mel Frequency Cepstral Coefficient (MFCC), Automatic Speaker Identification (ASI), Speaker Identification