Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/426532
Title: Acoustic Articulatory Mapping Analysis and Improvements with Neural Network Learning Paradigms
Researcher: Illa, Aravind
Guide(s): Ghosh, Prasanta Kumar
Keywords: Engineering; Engineering and Technology; Engineering Electrical and Electronic
University: Indian Institute of Science Bangalore
Completed Date: 2021
Abstract: Human speech is one of many acoustic signals we perceive, and it carries both linguistic and paralinguistic (e.g., speaker identity, emotional state) information. Speech acoustics are produced by different temporally overlapping gestures of the speech articulators (such as the lips, tongue tip, tongue body, tongue dorsum, velum, and larynx), each of which regulates a constriction in a different part of the vocal tract. Estimating speech acoustic representations from articulatory movements is known as articulatory-to-acoustic forward (AAF) mapping, i.e., articulatory speech synthesis, while estimating articulatory movements back from the speech acoustics is known as acoustic-to-articulatory inverse (AAI) mapping. These acoustic-articulatory mapping functions are known to be complex and nonlinear. The complexity of the mapping depends on a number of factors, including the kind of representations used in the acoustic and articulatory spaces. Typically, these representations capture both linguistic and paralinguistic aspects of speech; how each of these aspects contributes to the complexity of the mapping is unknown. The representations, and in turn the acoustic-articulatory mapping, are also affected by the speaking rate. The nature and quality of the mapping vary across speakers, so the complexity of the mapping also depends on the amount of data from a speaker as well as the number of speakers used in learning the mapping function. Further, how language variations impact the mapping requires detailed investigation. This thesis analyzes several of these factors in detail and develops neural-network-based models to learn mapping functions robust to many of them. Electromagnetic articulography (EMA) sensor data has been used directly in the past as the articulatory representation for learning the acoustic-articulatory mapping function. In this thesis, ...
Pagination: 163
URI: http://hdl.handle.net/10603/426532
Appears in Departments: Electrical Engineering
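To make the abstract's notion of acoustic-to-articulatory inverse (AAI) mapping concrete, the following is a minimal, hypothetical sketch of a nonlinear mapping from acoustic feature frames (e.g., MFCCs) to articulatory trajectories (e.g., EMA sensor coordinates). All dimensions, names, and the single-hidden-layer architecture are illustrative assumptions, not the models actually developed in the thesis.

```python
import numpy as np

def aai_forward(acoustic, w1, b1, w2, b2):
    """Sketch of an AAI mapping: acoustic frames -> articulatory frames.

    A single hidden layer with a tanh nonlinearity stands in for the
    complex, nonlinear mapping function described in the abstract.
    """
    h = np.tanh(acoustic @ w1 + b1)   # (T, hidden) hidden representation
    return h @ w2 + b2                # (T, n_ema) predicted trajectories

rng = np.random.default_rng(0)
T, n_mfcc, hidden, n_ema = 100, 13, 64, 12  # 12 = 6 EMA sensors x (x, y), assumed
mfcc = rng.standard_normal((T, n_mfcc))     # dummy acoustic features, one row per frame

# Untrained random weights; in practice these would be learned from
# parallel acoustic-EMA recordings of one or more speakers.
w1 = 0.1 * rng.standard_normal((n_mfcc, hidden))
b1 = np.zeros(hidden)
w2 = 0.1 * rng.standard_normal((hidden, n_ema))
b2 = np.zeros(n_ema)

ema_pred = aai_forward(mfcc, w1, b1, w2, b2)
print(ema_pred.shape)  # (100, 12): one articulatory vector per acoustic frame
```

The forward (AAF) direction would simply swap the input and output spaces, mapping articulatory frames back to acoustic representations.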
Files in This Item:

| File | Size | Format |
|---|---|---|
| 01_title.pdf | 2.42 MB | Adobe PDF |
| 02_prelim pages.pdf | 369.24 kB | Adobe PDF |
| 03_table of content.pdf | 148.61 kB | Adobe PDF |
| 04_abstract.pdf | 125.23 kB | Adobe PDF |
| 05_chapter 1.pdf | 3.73 MB | Adobe PDF |
| 06_chapter 2.pdf | 1.05 MB | Adobe PDF |
| 07_chapter 3.pdf | 6.28 MB | Adobe PDF |
| 08_chapter 4.pdf | 494.89 kB | Adobe PDF |
| 09_chapter 5.pdf | 761.13 kB | Adobe PDF |
| 10_chapter 6.pdf | 978.94 kB | Adobe PDF |
| 11_chapter 7.pdf | 456.87 kB | Adobe PDF |
| 12_annexure.pdf | 2.71 MB | Adobe PDF |
| 80_recommendation.pdf | 2.61 MB | Adobe PDF |
Items in Shodhganga are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Licence (CC BY-NC-SA 4.0).