Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/426532
Title: Acoustic Articulatory Mapping Analysis and Improvements with Neural Network Learning Paradigms
Researcher: Illa, Aravind
Guide(s): Ghosh, Prasanta Kumar
Keywords: Engineering
Engineering and Technology
Engineering Electrical and Electronic
University: Indian Institute of Science Bangalore
Completed Date: 2021
Abstract: Human speech is one of many acoustic signals we perceive, and it carries linguistic and paralinguistic (e.g., speaker identity, emotional state) information. Speech acoustics are produced as a result of different temporally overlapping gestures of speech articulators (such as lips, tongue tip, tongue body, tongue dorsum, velum, and larynx), each of which regulates constriction in different parts of the vocal tract. Estimating speech acoustic representations from articulatory movements is known as articulatory-to-acoustic forward (AAF) mapping, i.e., articulatory speech synthesis, while estimating articulatory movements back from the speech acoustics is known as acoustic-to-articulatory inverse (AAI) mapping. These acoustic-articulatory mapping functions are known to be complex and nonlinear. The complexity of this mapping depends on a number of factors. These include the kind of representations used in the acoustic and articulatory spaces. Typically these representations capture both linguistic and paralinguistic aspects of speech. How each of these aspects contributes to the complexity of the mapping is unknown. These representations and, in turn, the acoustic-articulatory mapping are affected by the speaking rate as well. The nature and quality of the mapping vary across speakers. Thus, the complexity of the mapping also depends on the amount of data from a speaker as well as the number of speakers used in learning the mapping function. Further, how language variations impact the mapping requires detailed investigation. This thesis analyzes a few such factors in detail and develops neural-network-based models to learn mapping functions robust to many of these factors. Electromagnetic articulography (EMA) sensor data has been used directly in the past as articulatory representations for learning the acoustic-articulatory mapping function. In this thesis, ...
Pagination: 163
URI: http://hdl.handle.net/10603/426532
Appears in Departments: Electrical Engineering

Files in This Item:
File                      Description    Size       Format
01_title.pdf              Attached File  2.42 MB    Adobe PDF
02_prelim pages.pdf                      369.24 kB  Adobe PDF
03_table of content.pdf                  148.61 kB  Adobe PDF
04_abstract.pdf                          125.23 kB  Adobe PDF
05_chapter 1.pdf                         3.73 MB    Adobe PDF
06_chapter 2.pdf                         1.05 MB    Adobe PDF
07_chapter 3.pdf                         6.28 MB    Adobe PDF
08_chapter 4.pdf                         494.89 kB  Adobe PDF
09_chapter 5.pdf                         761.13 kB  Adobe PDF
10_chapter 6.pdf                         978.94 kB  Adobe PDF
11_chapter 7.pdf                         456.87 kB  Adobe PDF
12_annexure.pdf                          2.71 MB    Adobe PDF
80_recommendation.pdf                    2.61 MB    Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
