Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/547934
Title: Some Investigations on Attention Mechanism based Deep Learning Models for Speech Enhancement
Researcher: Sivaramakrishna, Yechuri
Guide(s): Dayal, V Sunny
Keywords: Adaptive Wiener Gain
NMF
Speech Enhancement
University: Vellore Institute of Technology (VIT-AP)
Completed Date: 2024
Abstract: Noise is all around us. When individuals speak, excessive environmental noise creates transmission issues and severely degrades speech intelligibility and quality. To address this issue, speech enhancement methods are used to extract clean speech from environmental disturbances.

In the first part, we propose a novel single-channel speech enhancement algorithm using iterative constrained non-negative matrix factorization (NMF) based adaptive Wiener gain for non-stationary noise. The Wiener filter's performance depends on the value of the adaptive gain factor (α); if that value is held constant regardless of noise type and signal-to-noise ratio (SNR), enhancement performance suffers. To overcome this, the adaptive factor is calculated using a genetic algorithm (GA), which adjusts the adaptive Wiener gain according to the noise type and SNR level. The GA-based adaptive Wiener gain minimizes Wiener filter estimation errors and improves speech quality by adjusting the basis-vector weights of noise and speech. Additionally, the iterative constrained NMF (IC-NMF) method calculates the priors from noisy speech magnitudes. We select the Erlang, Inverse Gamma, Student's-t, and Inverse Nakagami distributions for speech priors and Gaussian distributions for noise priors, since noise and speech samples correlate well with those distributions.

In the second part, we propose a U-Net with a gated recurrent unit and an efficient channel attention (ECA) mechanism for real-time speech enhancement. The proposed U-Net model uses skip connections to improve information flow. A novel cross-channel interaction is implemented via the ECA module without dimensionality reduction. In module testing, choosing an adaptable kernel size for the ECA improved network performance significantly. Additionally, the U-Net architecture uses gated recurrent units (GRU) to learn long-range dependencies, which yields a causal system suitable for real-world use.

In the third part, the advanced improv
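As an illustration of the first part, the sketch below shows a parametric Wiener gain and a toy GA search over the adaptive factor α. The gain form G = P_s / (P_s + α·P_n), the fitness interface, and all function names here are assumptions for illustration only; the thesis's exact IC-NMF formulation and GA objective are not reproduced.

    # Minimal sketch, assuming the standard parametric Wiener gain
    # G = |S|^2 / (|S|^2 + alpha * |N|^2), with |S| and |N| taken from
    # NMF-based speech and noise magnitude estimates.
    import numpy as np

    def adaptive_wiener_gain(speech_mag, noise_mag, alpha):
        """Parametric Wiener gain; alpha trades noise suppression for distortion."""
        ps = speech_mag ** 2                   # estimated speech power
        pn = noise_mag ** 2                    # estimated noise power
        return ps / (ps + alpha * pn + 1e-12)  # small constant avoids divide-by-zero

    def enhance(noisy_stft, speech_mag, noise_mag, alpha):
        """Apply the gain to the noisy STFT and return the enhanced STFT."""
        return adaptive_wiener_gain(speech_mag, noise_mag, alpha) * noisy_stft

    def ga_tune_alpha(fitness, pop_size=20, generations=30,
                      bounds=(0.1, 4.0), seed=0):
        """Toy genetic algorithm over the scalar alpha.

        `fitness` is a hypothetical callable scoring a candidate alpha
        (e.g. an estimated SNR of the enhanced signal); the thesis's
        actual objective is not specified here.
        """
        rng = np.random.default_rng(seed)
        pop = rng.uniform(*bounds, size=pop_size)
        for _ in range(generations):
            scores = np.array([fitness(a) for a in pop])
            parents = pop[np.argsort(scores)[-pop_size // 2:]]  # keep fittest half
            children = parents + rng.normal(0, 0.1, size=parents.shape)  # mutate
            pop = np.clip(np.concatenate([parents, children]), *bounds)
        return pop[np.argmax([fitness(a) for a in pop])]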
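For the second part, the sketch below implements an efficient channel attention module in the form published in the ECA-Net literature: a global average pool followed by a 1-D convolution across channels, with no dimensionality reduction and a kernel size adapted to the channel count. Whether the thesis uses this exact kernel-size rule is an assumption.

    # Minimal ECA sketch in PyTorch, assuming the ECA-Net kernel-size rule
    # k = odd(|log2(C)/gamma + b/gamma|) with gamma=2, b=1.
    import math
    import torch
    import torch.nn as nn

    class ECA(nn.Module):
        """Efficient channel attention: cross-channel interaction via 1-D conv,
        without the dimensionality reduction used in squeeze-and-excitation."""
        def __init__(self, channels, gamma=2, b=1):
            super().__init__()
            t = int(abs((math.log2(channels) + b) / gamma))
            k = t if t % 2 else t + 1          # force an odd kernel size
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):                   # x: (B, C, H, W)
            y = self.pool(x)                    # (B, C, 1, 1) channel descriptors
            y = y.squeeze(-1).transpose(-1, -2) # (B, 1, C)
            y = self.conv(y)                    # local cross-channel interaction
            y = self.sigmoid(y).transpose(-1, -2).unsqueeze(-1)  # (B, C, 1, 1)
            return x * y                        # channel-wise reweighting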
Pagination: xvi, 151
URI: http://hdl.handle.net/10603/547934
Appears in Departments:Department of Electronics Engineering

Files in This Item:
File                                      Size        Format
01_title.pdf                              69.86 kB    Adobe PDF
02_prelim pages.pdf                       202.31 kB   Adobe PDF
03_content.pdf                            51.52 kB    Adobe PDF
04_abstract.pdf                           87.25 kB    Adobe PDF
05_chapter_1.pdf                          344.19 kB   Adobe PDF
06_chapter_2.pdf                          1.06 MB     Adobe PDF
07_chapter_3.pdf                          663.26 kB   Adobe PDF
08_chapter_4.pdf                          659.03 kB   Adobe PDF
09_chapter_5.pdf                          960.1 kB    Adobe PDF
10_chapter_6.pdf                          1.42 MB     Adobe PDF
11_chapter_7.pdf                          1.44 MB     Adobe PDF
12_references and publications.pdf        221.13 kB   Adobe PDF
80_recommendation.pdf                     46.48 kB    Adobe PDF


Items in Shodhganga are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International licence (CC BY-NC-SA 4.0).
