Improving the accuracy of the GMM model in the form of the GMM-VSM system in the application of speech language recognition

Subject Areas : General

Fahimeh Ghasemian ¹*, Mohamad Mahdi Homaion por ²


Received: 2010-04-09 Accepted : 2010-04-09 Published : 2011-05-03

Abstract :

The Gaussian mixture model (GMM) is one of the most widely used and successful models in automatic language recognition. This article presents a new model called Adapted Weight-GMM (AW-GMM). It is similar to a GMM, except that, within the GMM-VSM system, the weight of each component is determined by how well that component discriminates one language from the others. In addition, because the GMM-VSM system becomes computationally expensive when 2-component sequences are considered, a technique for constructing 2-component sequences is presented, which can also be used to construct higher-order sequences. Evaluations on four languages (English, Persian, French and German) from the OGI data show the effectiveness of the presented techniques.
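The abstract does not give formulas, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that each speech frame is labeled with its most likely diagonal-covariance Gaussian component, that an utterance is then described by the relative frequencies of single components or 2-component sequences, and that a component's weight can be taken from the ratio of its usage in the target language to its average usage in the other languages. All function names and the ratio-based weighting rule are hypothetical.

```python
import numpy as np

def tokenize_frames(frames, means, variances, weights):
    """Label each frame with the index of its most likely Gaussian component.

    frames: (T, D), means and variances: (K, D), weights: (K,).
    Diagonal covariances are an assumption of this sketch.
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    return np.argmax(log_prob + np.log(weights), axis=1)     # (T,)

def ngram_vector(tokens, n_components, order=1):
    """Relative frequencies of component unigrams (order=1) or bigrams (order=2)."""
    if order == 1:
        counts = np.bincount(tokens, minlength=n_components)
    else:
        # Encode each consecutive pair (a, b) as a single id a * K + b.
        pair_ids = tokens[:-1] * n_components + tokens[1:]
        counts = np.bincount(pair_ids, minlength=n_components ** 2)
    return counts / max(counts.sum(), 1)

def discriminative_weights(target_freq, other_freqs, eps=1e-8):
    """Hypothetical weighting: favor components used more in the target language.

    target_freq: (K,) component frequencies in the target language.
    other_freqs: (L, K) component frequencies in the other languages.
    """
    ratio = (target_freq + eps) / (np.mean(other_freqs, axis=0) + eps)
    return ratio / ratio.sum()

# Toy example with two well-separated components in two dimensions.
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
weights = np.array([0.5, 0.5])
frames = np.array([[0.1, 0.0], [4.9, 5.0], [5.0, 5.2], [0.0, 0.1]])

tokens = tokenize_frames(frames, means, variances, weights)
unigrams = ngram_vector(tokens, n_components=2, order=1)
bigrams = ngram_vector(tokens, n_components=2, order=2)
new_weights = discriminative_weights(np.array([0.8, 0.2]),
                                     np.array([[0.5, 0.5], [0.5, 0.5]]))
```

The bigram vector has K² entries for K components, which illustrates why 2-component sequences quickly become expensive as the number of mixture components grows.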

