A novel approach for automatic speaker identification of Assamese language using cosine similarity and absolute MFCC feature matrix | [Journal of Theoretical and Applied Information Technology • 2022]

Author(s):

1. ANKUMON SARMAH: DIBRUGARH UNIVERSITY, Centre for Computer Science and Applications, India

2. RIZWAN REHMAN: DIBRUGARH UNIVERSITY, Centre for Computer Science and Applications, India

3. PRIYAKSHI MAHANTA: DIBRUGARH UNIVERSITY, Centre for Computer Science and Applications, India

4. KANKANA DUTTA: DIBRUGARH UNIVERSITY, Centre for Computer Science and Applications, India

5. KAUSTUVMONI BORDOLOI: DIBRUGARH UNIVERSITY, Centre for Computer Science and Applications, India

6. KIMASHA BORAH: DIBRUGARH UNIVERSITY, Centre for Computer Science and Applications, India

7. HARJINDER SINGH: D. H. S. K. COLLEGE, Department of BCA and Computer Science, India

Abstract:

Automatic Speaker Identification (ASI) remains a challenging task for researchers. ASI is the process of automatically recognizing a speaker from a voice sample by comparing it with previously recorded samples of that speaker's voice. Machine learning approaches to ASI have gained popularity in recent years; those used include Convolutional Neural Networks (CNN) [14,15,16], Deep Neural Networks (DNN) [10,11,12,13], and Artificial Neural Networks (ANN) [17,18]. This research aims to build an automatic speaker identification system for Assamese, a low-resource language spoken in the north-eastern part of India. To date, cosine similarity and parallel processing have not been used for speaker identification in Assamese, which is the novelty of the current work. In the first phase, the model uses Mel-frequency cepstral coefficients (MFCC) to extract important features of speakers' voices and create a training sample set. In the present approach, we use each speaker's absolute MFCC feature vectors directly, without any averaging, in order to retain and exploit the speaker's unique characteristics. In the second phase, the features in the training sample set are compared with real-time test voice samples using cosine similarity to identify the speaker automatically. Parallel processing is used to compare all the coefficients in the test voice sample with those in the training voice samples, which makes the system faster. The effectiveness of the proposed method is established in terms of precision, recall, F1 score, and accuracy. The model achieved an accuracy of 91% for speaker identification in Assamese.
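The second phase described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how per-frame similarities are combined into a speaker decision, so the scoring rule here (best match per test frame, averaged over frames) is an assumption, as are the function names `cosine_similarity`, `speaker_score`, and `identify`. MFCC extraction is assumed to have happened already, and the short three-element vectors are toy stand-ins for real MFCC frames.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero frames
    return dot / (norm_u * norm_v)


def speaker_score(test_frames, train_frames):
    """Assumed scoring rule: best match per test frame, averaged over frames.

    Frames are kept individually (no averaging of the feature matrix).
    """
    best = [
        max(cosine_similarity(t, r) for r in train_frames)
        for t in test_frames
    ]
    return sum(best) / len(best)


def identify(test_frames, enrolled):
    """Score every enrolled speaker concurrently and return the best one.

    enrolled: dict mapping a speaker name to a list of frame vectors.
    """
    names = list(enrolled)
    with ThreadPoolExecutor() as pool:
        scores = list(
            pool.map(lambda n: speaker_score(test_frames, enrolled[n]), names)
        )
    best_index = max(range(len(names)), key=scores.__getitem__)
    return names[best_index], dict(zip(names, scores))


# Toy data standing in for MFCC frames (hypothetical values).
enrolled = {
    "speaker_a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "speaker_b": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}
test_frames = [[2.0, 0.1, 0.0], [1.0, 0.0, 0.1]]

predicted, scores = identify(test_frames, enrolled)
print(predicted, scores)
```

The thread pool only shows where the per-speaker comparisons could run in parallel. In CPython, pure-Python arithmetic like this gains little from threads, so a real speedup would more likely come from a process pool or a vectorized array library.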

Page(s): 6552-6560

DOI: not available

Published: Journal: Journal of Theoretical and Applied Information Technology, Volume: 100, Issue: 21, Year: 2022

Keywords:

Assamese, Cosine Similarity, Mel Frequency Cepstral Coefficient (MFCC), Automatic Speaker Identification (ASI), Speaker Identification

References:

References are not available for this document.

