Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/446763
Title: | A Metaheuristic Deep Learning Based Robust Speaker Identification in Noisy Environments |
Researcher: | Thomas, Abraham J V |
Guide(s): | Nayeemulla Khan |
Keywords: | Computer Science; Computer Science Artificial Intelligence; Engineering and Technology |
University: | Vellore Institute of Technology (VIT) University |
Completed Date: | 2022 |
Abstract: | A biometric system is one in which an individual's behavioral or physiological traits, or both, are given as input, and the system analyzes them to identify the individual as a genuine or malicious user. With the advancement of technology, different types of biometric systems are used in many day-to-day applications. Among all biometric systems, voice recognition is the most convenient and preferred form of biometric identification among users. Automatic Speaker Recognition (ASR) is a process in which the person is identified (speaker identification) or the claim made by the person is verified (speaker verification). While a voice-based biometric such as ASR offers an additional layer of security to protect users, implementing such systems in the real world faces several challenges. An ASR system performs well with clean speech signals, but its performance suffers significantly with noisy speech. In a real-world scenario, speech signal distortion is unavoidable. Channel mismatch and environmental noise are the two most prominent issues that distort the voice signal, and both are unpredictable in a real-world scenario. Noise affecting an ASR system could be background noise, reverberation, babble noise, etc. Therefore, the objective of this thesis is to study speaker identification in real-world environments, as it is not feasible to provide a noise-free environment. Developing a robust speaker identification system is a difficult task, and such a system must address every form of distortion. To achieve robustness in ASR, we have proposed an efficient feature set and classification model.
To start with, to improve the quality of the distorted speech signal obtained in real-world scenarios, a two-stage speech enhancement algorithm is proposed. In the first stage, Empirical Mode Decomposition (EMD) is applied to obtain an improved signal by eliminating noise-affected Intrinsic Mode Functions (IMFs). Wavelet Denoising (WD) is then applied to this improved signal in the |
Pagination: | i-xii, 108 |
URI: | http://hdl.handle.net/10603/446763 |
Appears in Departments: | School of Computing Science and Engineering VIT-Chennai |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 128.97 kB | Adobe PDF | View/Open |
02_prelim pages.pdf | | 284.88 kB | Adobe PDF | View/Open |
03_content.pdf | | 79.44 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 67.49 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 196.64 kB | Adobe PDF | View/Open |
06_chapter 2.pdf | | 717.75 kB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 2.01 MB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 1.97 MB | Adobe PDF | View/Open |
09_chapter 5.pdf | | 2.57 MB | Adobe PDF | View/Open |
10_chapter 6.pdf | | 4.63 MB | Adobe PDF | View/Open |
11_chapter 7.pdf | | 67.54 kB | Adobe PDF | View/Open |
12_annexure.pdf | | 134.84 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 151.07 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).