Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/429864
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.date.accessioned | 2022-12-22T05:31:34Z | - |
dc.date.available | 2022-12-22T05:31:34Z | - |
dc.identifier.uri | http://hdl.handle.net/10603/429864 | - |
dc.description.abstract | Speech signals possess a rich time-varying spectral content, which makes their analysis a challenging signal processing problem. Developing methods for accurate speech analysis has a direct impact on applications such as speech synthesis, speaker recognition, speech recognition, voice morphing, etc. A widely used tool to visualize the time-varying spectral content is the spectrogram, which represents the spectral content of the signal in the joint time-frequency plane. A spectrogram can be viewed as a collection of several localized spectrotemporal patches. By analyzing the structure of two-dimensional (2-D) patterns in the spectrogram, we propose modeling it using 2-D amplitude-modulated and frequency-modulated (AM-FM) sinusoids. The justification for the 2-D AM-FM model for speech can be provided based on the physical process behind its generation. From a speech production perspective, the AM and FM components correspond to the vocal-tract smooth envelope and excitation signal, respectively. We demonstrate that analyzing speech jointly in time and frequency reveals several important characteristics, which are otherwise not evident either in purely time-domain or frequency-domain analysis. The central problem in this dissertation is 2-D demodulation of a speech spectrogram, which yields 2-D AM and FM components. We advocate the use of the Riesz transform, which is a 2-D extension of the Hilbert transform, to demodulate narrowband and pitch adaptive spectrograms. Interestingly, the 2-D AM and FM components obtained as a result of demodulation have potential benefits for speech analysis. We demonstrate the impact of the proposed modeling technique for vocal tract filter estimation, voiced/unvoiced component separation, pitch tracking, speech synthesis, and periodic/aperiodic decomposition of speech signals. The accuracy of the estimated speech parameters is validated considering the task of speech reconstruction. The first part of the thesis is focused on theoretical developments related to 2-D modeling. We con... | - |
dc.language | English | - |
dc.rights | university | - |
dc.title | Spectrotemporal Processing of Speech Signals Using the Riesz Transform | - |
dc.creator.researcher | Dhiman, Jitendra Kumar | - |
dc.subject.keyword | Engineering | - |
dc.subject.keyword | Engineering and Technology | - |
dc.subject.keyword | Engineering Electrical and Electronic | - |
dc.contributor.guide | Seelamantula, Chandra Sekhar | - |
dc.publisher.place | Bangalore | - |
dc.publisher.university | Indian Institute of Science Bangalore | - |
dc.publisher.institution | Electrical Engineering | - |
dc.date.completed | 2021 | - |
dc.date.awarded | 2022 | - |
dc.format.accompanyingmaterial | None | - |
dc.source.university | University | - |
dc.type.degree | Ph.D. | - |
Appears in Departments: | Electrical Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 3.07 MB | Adobe PDF | View/Open |
02_prelim pages.pdf | | 1.12 MB | Adobe PDF | View/Open |
03_table of contents.pdf | | 279.36 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 169.5 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 5.01 MB | Adobe PDF | View/Open |
06_chaper 2.pdf | | 14.78 MB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 12.05 MB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 4.24 MB | Adobe PDF | View/Open |
09_chapter 5.pdf | | 3.17 MB | Adobe PDF | View/Open |
11_annexure.pdf | | 772.21 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 4.99 MB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).