Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/547934
Title: Some Investigations on Attention Mechanism based Deep Learning Models for Speech Enhancement
Researcher: Sivaramakrishna, Yechuri
Guide(s): Dayal, V Sunny
Keywords: Adaptive Wiener Gain; NMF; Speech Enhancement
University: Vellore Institute of Technology (VIT-AP)
Completed Date: 2024
Abstract: Noise is all around us. When individuals speak, excessive environmental noise creates transmission issues and severely degrades speech intelligibility and quality. To address this, speech enhancement methods are used to extract clean speech from environmental disturbances.

In the first part, we propose a novel single-channel speech enhancement algorithm using iterative constrained non-negative matrix factorization (NMF) based adaptive Wiener gain for non-stationary noise. The Wiener filter's performance depends on the value of the adaptive gain factor (α); keeping it constant regardless of noise type and signal-to-noise ratio (SNR) degrades enhancement performance. To overcome this, the adaptive factor is computed with a genetic algorithm (GA), which adjusts the adaptive Wiener gain according to the noise type and SNR level. The GA-based adaptive Wiener gain minimizes Wiener filter estimation errors and improves speech quality by adjusting the basis-vector weights of noise and speech. Additionally, the iterative constrained NMF (IC-NMF) method is used to calculate the priors from noisy speech magnitudes. We select the Erlang, Inverse Gamma, Student's-t, and Inverse Nakagami distributions for the speech priors and Gaussian distributions for the noise priors, since noise and speech samples are well correlated with those distributions.

In the second part, we propose a U-Net with a gated recurrent unit (GRU) and an efficient channel attention (ECA) mechanism for real-time speech enhancement. The proposed U-Net model uses skip connections to improve information flow. A novel cross-channel interaction is implemented via the ECA module without dimensionality reduction. In module testing, choosing an adaptive kernel size for the ECA improved network performance significantly. Additionally, the U-Net architecture uses gated recurrent units, which yields a causal system suitable for real-world use; the GRU is used to learn long-range dependencies.

In the third part, the advanced improv
Pagination: xvi, 151
URI: http://hdl.handle.net/10603/547934
Appears in Departments: Department of Electronics Engineering
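The GA-tuned adaptive Wiener gain described in the first part of the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the parametric gain form G = S/(S + αN), the MSE-based fitness, the toy elitist GA, and all names and hyperparameters (`wiener_gain`, `ga_tune_alpha`, population size, mutation scale, α bounds) are choices made for the sketch. The thesis additionally estimates the speech and noise spectra via IC-NMF with the listed priors, which is omitted here.

```python
import numpy as np

def wiener_gain(speech_psd, noise_psd, alpha):
    # Parametric Wiener gain: alpha scales the noise power estimate.
    # alpha = 1 recovers the classical Wiener filter.
    return speech_psd / (speech_psd + alpha * noise_psd)

def fitness(alpha, noisy_psd, clean_psd, noise_psd):
    # Negative mean squared error between the filtered spectrum and the
    # clean reference (higher is better, so the GA maximizes it).
    est = wiener_gain(clean_psd, noise_psd, alpha) * noisy_psd
    return -np.mean((est - clean_psd) ** 2)

def ga_tune_alpha(noisy_psd, clean_psd, noise_psd,
                  pop_size=20, generations=30, seed=0):
    # Toy genetic algorithm over the scalar adaptive factor alpha:
    # elitist selection of the best half plus Gaussian mutation.
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.1, 4.0, pop_size)  # initial population of alphas
    for _ in range(generations):
        scores = np.array([fitness(a, noisy_psd, clean_psd, noise_psd)
                           for a in pop])
        elite = pop[np.argsort(scores)[-pop_size // 2:]]      # keep best half
        children = elite + rng.normal(0.0, 0.1, elite.size)   # mutate copies
        pop = np.concatenate([elite, np.clip(children, 0.05, 8.0)])
    scores = np.array([fitness(a, noisy_psd, clean_psd, noise_psd) for a in pop])
    return float(pop[np.argmax(scores)])
```

In this toy setting the GA simply searches a bounded scalar; the point it illustrates is the abstract's claim that α should adapt to the noise/SNR condition rather than stay fixed.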
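The ECA module from the second part can likewise be sketched in plain NumPy. This is a hedged illustration, not the thesis's network: the adaptive kernel-size rule follows the published ECA-Net heuristic (kernel size derived from log2 of the channel count, rounded to an odd integer), the uniform 1-D kernel stands in for trained convolution weights, and the function names and parameters here are hypothetical.

```python
import numpy as np

def adaptive_kernel_size(channels, gamma=2, b=1):
    # ECA-Net heuristic: kernel size grows with log2 of the channel count,
    # forced to the next odd integer so the 1-D conv stays centered.
    k = int(abs((np.log2(channels) + b) / gamma))
    return k if k % 2 else k + 1

def eca(feature_map, gamma=2, b=1):
    # feature_map: array of shape (channels, height, width).
    # 1) Squeeze: global average pool per channel.
    c = feature_map.shape[0]
    squeezed = feature_map.mean(axis=(1, 2))
    # 2) Local cross-channel interaction: 1-D convolution along the channel
    #    axis with an adaptive kernel size -- no dimensionality reduction,
    #    which is the point of ECA versus SE-style bottleneck attention.
    k = adaptive_kernel_size(c, gamma, b)
    kernel = np.full(k, 1.0 / k)  # untrained stand-in for learned weights
    mixed = np.convolve(np.pad(squeezed, k // 2, mode="edge"),
                        kernel, mode="valid")
    # 3) Excitation: sigmoid gate, broadcast back over the spatial dims.
    gate = 1.0 / (1.0 + np.exp(-mixed))
    return feature_map * gate[:, None, None]
```

The sigmoid gate rescales each channel by a weight in (0, 1), and because the kernel only mixes neighboring channels, the interaction stays cheap regardless of channel count, which is what makes ECA attractive for the real-time setting the abstract targets.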
Files in This Item:
File | Size | Format
---|---|---
01_title.pdf | 69.86 kB | Adobe PDF
02_prelim pages.pdf | 202.31 kB | Adobe PDF
03_content.pdf | 51.52 kB | Adobe PDF
04_abstract.pdf | 87.25 kB | Adobe PDF
05_chapter_1.pdf | 344.19 kB | Adobe PDF
06_chapter_2.pdf | 1.06 MB | Adobe PDF
07_chapter_3.pdf | 663.26 kB | Adobe PDF
08_chapter_4.pdf | 659.03 kB | Adobe PDF
09_chapter_5.pdf | 960.1 kB | Adobe PDF
10_chapter_6.pdf | 1.42 MB | Adobe PDF
11_chapter_7.pdf | 1.44 MB | Adobe PDF
12_references and publications.pdf | 221.13 kB | Adobe PDF
80_recommendation.pdf | 46.48 kB | Adobe PDF
Items in Shodhganga are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Licence (CC BY-NC-SA 4.0).