Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/306355
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.coverage.spatial | ||
dc.date.accessioned | 2020-11-09T11:13:14Z | - |
dc.date.available | 2020-11-09T11:13:14Z | - |
dc.identifier.uri | http://hdl.handle.net/10603/306355 | - |
dc.description.abstract | The performance of an automatic speech recognition (ASR) system degrades when there is a mismatch between the training and operating environments. Expressive (emotional) speech is one such mismatch, since the majority of ASR systems are trained on neutral speech. The emotional state of the speaker changes the characteristics of the speech and affects ASR performance in practical scenarios. The goal of this thesis is to improve the performance of ASR systems under these emotional conditions. The key challenge in addressing this research problem is the lack of resources: the existing emotional databases are limited in size and in the number of speakers. The main focus of this thesis is to create the infrastructure required to study this challenging problem for Telugu, a low-resource language, and to present exploratory studies that evaluate the accuracy of Telugu ASR systems. The thesis investigates several techniques, applied at various stages of the recognition process, that are suitable for building an emotionally robust ASR system. In the first study, prosody modification is employed at the pre-processing level of the speech recognizer. Model-based and feature-space adaptation approaches are also analyzed for improving ASR systems. These emotion adaptation strategies were studied using various deep neural network (DNN) architectures and were shown to be effective in comparison with baseline Gaussian mixture models (GMMs). The experiments are conducted using the IIT Kharagpur simulated emotion speech corpus (IITKGP-SESC) and the IIIT-Hyderabad Telugu naturalistic emotional speech corpus (IIIT-H TNESC). | |
dc.format.extent | ||
dc.language | English | |
dc.relation | ||
dc.rights | university | |
dc.title | Towards Building a Robust Telugu ASR System for Emotional Speech | |
dc.title.alternative | ||
dc.creator.researcher | Vishnu Vidyadhara Raju V | |
dc.subject.keyword | Computer Science | |
dc.subject.keyword | Computer Science Information Systems | |
dc.subject.keyword | Engineering and Technology | |
dc.description.note | ||
dc.contributor.guide | Anil Kumar Vuppala | |
dc.publisher.place | Hyderabad | |
dc.publisher.university | International Institute of Information Technology, Hyderabad | |
dc.publisher.institution | Electronics and Communication Engineering | |
dc.date.registered | 2015 | |
dc.date.completed | 2020 | |
dc.date.awarded | 2020 | |
dc.format.dimensions | ||
dc.format.accompanyingmaterial | None | |
dc.source.university | University | |
dc.type.degree | Ph.D. | |
Appears in Departments: | Department of Electronic and Communication Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
80_recommendation.pdf | Attached File | 71.68 kB | Adobe PDF | View/Open |
abstract1.pdf | | 42.41 kB | Adobe PDF | View/Open |
certificate11.pdf | | 26.28 kB | Adobe PDF | View/Open |
chapte2_new.pdf | | 509.17 kB | Adobe PDF | View/Open |
chapter1_new.pdf | | 141.3 kB | Adobe PDF | View/Open |
chapter3_new.pdf | | 118.83 kB | Adobe PDF | View/Open |
chapter4_new.pdf | | 2.67 MB | Adobe PDF | View/Open |
chapter5_new.pdf | | 1.05 MB | Adobe PDF | View/Open |
chapter6_new.pdf | | 501.51 kB | Adobe PDF | View/Open |
titlepage1.pdf | | 89.91 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).