Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/446763
Title: A Metaheuristic Deep Learning Based Robust Speaker Identification in Noisy Environments
Researcher: Thomas, Abraham J V
Guide(s): Nayeemulla Khan
Keywords: Computer Science
Computer Science Artificial Intelligence
Engineering and Technology
University: Vellore Institute of Technology (VIT) University
Completed Date: 2022
Abstract: A biometric system is one in which an individual's behavioral or physiological traits, or both, are given as input, and the system analyzes them to identify the individual as a genuine or malicious user. With advances in technology, different types of biometric systems are used in many day-to-day applications. Among all biometric systems, voice recognition is the most convenient and preferred form of biometric identification among users. Automatic Speaker Recognition (ASR) is a process in which the person is identified (speaker identification) or the claim made by the person is verified (speaker verification). While a voice-based biometric such as ASR offers an additional layer of security to protect users, implementing such systems in the real world faces several challenges. An ASR system performs well with clean speech signals, but its performance suffers significantly with noisy speech. In a real-world scenario, speech signal distortion is unavoidable. Channel mismatch and environmental noise are the two most prominent causes of distortion to the voice signal, and both are unpredictable in a real-world scenario. Noise affecting an ASR system could be background noise, reverberation, babble noise, etc.
Therefore, the objective of this thesis is to study speaker identification in real-world environments, as it is not feasible to guarantee a noise-free environment. Developing a robust speaker identification system is a difficult task, and such a system must address every form of distortion. To achieve robustness in ASR, we have proposed an efficient feature set and classification model.
To start with, to improve the quality of the distorted speech signal obtained in real-world scenarios, a two-stage speech enhancement algorithm is proposed. In the first stage, Empirical Mode Decomposition (EMD) is applied to obtain an improved signal by eliminating noise-affected Intrinsic Mode Functions (IMFs). Wavelet Denoising (WD) is then applied to this improved signal in the
Pagination: i-xii, 108
URI: http://hdl.handle.net/10603/446763
Appears in Departments: School of Computing Science and Engineering VIT-Chennai

Files in This Item:
File                   Description    Size       Format
01_title.pdf           Attached File  128.97 kB  Adobe PDF
02_prelim pages.pdf                   284.88 kB  Adobe PDF
03_content.pdf                        79.44 kB   Adobe PDF
04_abstract.pdf                       67.49 kB   Adobe PDF
05_chapter 1.pdf                      196.64 kB  Adobe PDF
06_chapter 2.pdf                      717.75 kB  Adobe PDF
07_chapter 3.pdf                      2.01 MB    Adobe PDF
08_chapter 4.pdf                      1.97 MB    Adobe PDF
09_chapter 5.pdf                      2.57 MB    Adobe PDF
10_chapter 6.pdf                      4.63 MB    Adobe PDF
11_chapter 7.pdf                      67.54 kB   Adobe PDF
12_annexure.pdf                       134.84 kB  Adobe PDF
80_recommendation.pdf                 151.07 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
