Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/423552
Title: | Robust detection of vowels in speech signal with application to children's ASR |
Researcher: | Kumar, Avinash |
Guide(s): | Pradhan, Gayadhar |
Keywords: | Engineering; Engineering and Technology; Engineering Electrical and Electronic |
University: | National Institute of Technology Patna |
Completed Date: | 2019 |
Abstract: | This thesis proposes acoustic modeling and signal processing approaches for robustly detecting vowels and their corresponding onset points (VOPs) and offset points (VEPs) in a given speech signal. The VOP and VEP are defined as the instants at which a vowel starts and ends, respectively. The knowledge of vowel and non-vowel regions is then exploited for non-uniformly suppressing the pitch-induced mismatch in children's automatic speech recognition (ASR) systems. At first, using mel-frequency cepstral coefficients (MFCCs) as the front-end features, three-class classifiers (vowels, non-vowels and silences) are developed using recently reported state-of-the-art acoustic modeling methods for the task of detecting vowels, VOPs and VEPs in a given speech signal. Among the explored acoustic modeling techniques, the best performance is observed for pre-trained deep neural networks (DNNs). To further enhance the performance, a novel front-end feature exploiting the temporal and spectral characteristics of the excitation source information in the speech signal is proposed. The use of the proposed feature results in the detection of vowel regions that are quite different from those obtained through the MFCCs. Exploiting the differences in the evidence obtained from the two kinds of features, a technique to combine the evidence is also proposed. The statistical learning based approaches provide significantly improved performance when compared with the explicit signal processing methods reported in the literature. However, the performance of statistical classifiers degrades significantly when the speech signal is corrupted by ambient noise. In order to enhance the robustness towards ambient noise, a signal processing approach based on non-local means (NLM) estimation is then proposed for the detection of vowels, VOPs and VEPs. In the NLM algorithm, the signal value at each sample point is estimated as the weighted sum of signal values at other sample points within a search neighborhood. |
Pagination: | xxxi, 155p. |
URI: | http://hdl.handle.net/10603/423552 |
Appears in Departments: | Electronics and Communications Engineering |
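The NLM estimation step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `nlm_estimate`, the parameters `search_radius`, `patch_radius` and `h`, the Gaussian patch-similarity weighting, the edge clamping, and the inclusion of the sample itself in the weighted sum are all assumptions borrowed from the generic non-local means formulation.

```python
import math


def nlm_estimate(x, search_radius=10, patch_radius=2, h=0.5):
    """Non-local means estimate of a 1-D signal (illustrative sketch only).

    All parameter names and defaults are hypothetical, not from the thesis.
    """
    n = len(x)

    def sample(k):
        # Clamp indices so patches near the edges stay inside the signal.
        return x[min(max(k, 0), n - 1)]

    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        # Search neighborhood around sample i, truncated at the signal edges.
        lo = max(0, i - search_radius)
        hi = min(n - 1, i + search_radius)
        for j in range(lo, hi + 1):
            # Squared distance between the patches centred at i and j.
            d2 = sum(
                (sample(i + k) - sample(j + k)) ** 2
                for k in range(-patch_radius, patch_radius + 1)
            )
            # Assumed weighting: similar patches get weights close to 1.
            w = math.exp(-d2 / (h * h))
            num += w * x[j]
            den += w
        # Normalised weighted sum; den >= 1 because j == i has weight 1.
        out.append(num / den)
    return out


# Usage: a slow sinusoid with an alternating +/-0.1 disturbance.
noisy = [math.sin(0.3 * k) + (0.1 if k % 2 else -0.1) for k in range(50)]
smoothed = nlm_estimate(noisy)
```

Because each output sample is a normalised weighted average, it always lies between the minimum and maximum of the input. The bandwidth `h` controls how strongly dissimilar patches are down-weighted, and would need tuning for real speech data.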
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 102.09 kB | Adobe PDF | View/Open |
02_prelim pages.pdf | | 121.95 kB | Adobe PDF | View/Open |
03_content.pdf | | 242.54 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 51.43 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 126.85 kB | Adobe PDF | View/Open |
06_chapter 2.pdf | | 181.46 kB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 436.7 kB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 997.74 kB | Adobe PDF | View/Open |
09_chapter 5.pdf | | 2.87 MB | Adobe PDF | View/Open |
10_chapter 6.pdf | | 351.27 kB | Adobe PDF | View/Open |
11_chapter 7.pdf | | 102.54 kB | Adobe PDF | View/Open |
12_annexures.pdf | | 144.09 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 158.35 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).