Certain explorations on speech enhancement techniques for automatic speaker recognition in noisy environment

Sumithra M G

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/9823

Title:	Certain explorations on speech enhancement techniques for automatic speaker recognition in noisy environment
Researcher:	Sumithra M G
Guide(s):	Thanushkodi K
Keywords:	Automatic Speaker Recognizer; Time adaptive discrete wavelet thresholding; Speech enhancement technique
Upload Date:	11-Jul-2013
University:	Anna University
Completed Date:	01/06/2011
Abstract:	This thesis presents a detailed study of speech enhancement algorithms that make the Automatic Speaker Recognizer (ASR) robust in real-life noisy conditions. The main objective is to attenuate the noise component of noisy speech using wavelet-based algorithms, so that speech processing devices deliver better quality and greater robustness under noise, and to carry out a comprehensive evaluation and comparison of these algorithms on a speaker recognition task. Two new single-channel wavelet-based speech enhancement methods and a noise-robust automatic speaker recognition system are developed and reported. First, a technique using Time Adaptive Discrete Wavelet Thresholding (TADWT) based on the Bionic Wavelet Transform (BWT) is proposed. In this approach, the discrete BWT is used for speech enhancement; the adaptive nature of the BWT is captured by introducing a time-varying linear factor at each scale, and a modified soft thresholding function is used for denoising. The method also provides a good auditory representation (sufficient frequency resolution), good perceptual speech quality, and low computational load. Basic spectral subtraction (SS), iterative Wiener filtering (IWF), Ephraim-Malah filtering (EMF), bionic wavelet based thresholding (BWT), and the perceptual wavelet packet transform (PWPT) serve as baseline methods in the speech enhancement tests. The proposed methods are evaluated objectively using segmental signal-to-noise ratio (SSNR), signal-to-noise ratio (SNR), the Itakura-Saito (IS) distance measure, and minimum mean square error (MMSE). Using the speech enhancement (SE) methods as a preprocessor improves the system's average recognition accuracy relative to the recognition rate obtained for degraded speech.
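The abstract mentions soft thresholding of wavelet coefficients and the SNR and segmental SNR measures. As a rough illustration only, the sketch below shows the standard soft thresholding rule and one common way to compute the two measures. It is not the thesis's method: the thesis uses a modified soft thresholding function and a time-adaptive BWT, neither of which is reproduced here. The function names, the frame length of 256 samples, and the clamping range of -10 dB to 35 dB are assumptions chosen for the example.

```python
import math


def soft_threshold(coeffs, thr):
    """Standard soft thresholding: shrink each coefficient toward zero by thr.

    Assumption: this is the textbook rule, not the thesis's modified variant.
    """
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def snr_db(clean, enhanced):
    """Global SNR in dB of an enhanced signal against a clean reference."""
    signal = sum(x * x for x in clean)
    noise = sum((x - y) ** 2 for x, y in zip(clean, enhanced))
    if noise == 0.0:
        return float("inf")   # enhanced signal matches the reference exactly
    if signal == 0.0:
        return float("-inf")  # silent reference, nonzero error
    return 10.0 * math.log10(signal / noise)


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clamped to [lo, hi] dB.

    frame_len, lo and hi are illustrative defaults, not values from the thesis.
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        if sum(x * x for x in c) == 0.0:
            continue  # skip silent reference frames
        values.append(min(max(snr_db(c, e), lo), hi))
    return sum(values) / len(values) if values else float("nan")
```

Clamping the per-frame values keeps a few near-silent or near-perfect frames from dominating the average, which is why segmental SNR is often reported alongside global SNR.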
Pagination:	xxviii, 177p.
URI:	http://hdl.handle.net/10603/9823
Appears in Departments:	Faculty of Information and Communication Engineering

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	13.98 kB	Adobe PDF
02_certificates.pdf		530.9 kB	Adobe PDF
03_abstract.pdf		19.8 kB	Adobe PDF
04_acknowledgement.pdf		14.21 kB	Adobe PDF
05_contents.pdf		68.16 kB	Adobe PDF
06_chapter 1.pdf		86.94 kB	Adobe PDF
07_chapter 2.pdf		83.02 kB	Adobe PDF
08_chapter 3.pdf		5.51 MB	Adobe PDF
09_chapter 4.pdf		7.83 MB	Adobe PDF
10_chapter 5.pdf		1.67 MB	Adobe PDF
11_chapter 6.pdf		28.15 kB	Adobe PDF
12_references.pdf		42.02 kB	Adobe PDF
13_publications.pdf		24.51 kB	Adobe PDF
14_vitae.pdf		13.18 kB	Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET