Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/380777
Title: | Diverse Multilingual and Mixed lingual Emotion Recognition using Perception based Speech Analysis |
Researcher: | Lalitha S |
Guide(s): | Deepa Gupta |
Keywords: | Electronics and Communication Engineering; Arousal; Valence; speech analysis; mixed-language; speech emotion recognition; SER; Artificial Neural Networks; Deep Neural Networks; Speech Technology; Cepstrum; Emotion detection; corpus; Engineering and Technology |
University: | Amrita Vishwa Vidyapeetham University |
Completed Date: | 2021 |
Abstract: | This thesis explores, investigates, and analyses perception-based speech features for emotion recognition in diverse and mixed-language environments across discrete and dimensional emotion spaces. Most existing approaches in the literature on multilingual speech emotion recognition (SER) have centred on exploring new speech features and expanding existing speech feature vectors for effective emotion recognition. Subtle emotions such as disgust and boredom, which have few samples in most databases, are usually recognized less accurately. In addition, cross-corpus SER systems usually rely on various preprocessing techniques, large speech feature vectors, and feature selection mechanisms. For these systems to be applicable in countries like India, where the population communicates in a mix of diverse languages, they must be enhanced further, since existing cross-corpus SER works have mostly dealt with samples from 2 to 3 languages at a time during the training-testing process. Also, most emotion recognition works have targeted either discrete or dimensional emotion spaces. The thesis aims to address these shortcomings and limitations of prevailing works. The main focus of SER system design in this work is identifying a vital, compact set of features through speech analysis for efficient emotion recognition. From the exhaustive literature survey and initial SER studies performed by the author, it is found that human emotions are better perceived through cepstral feature analysis. The initial research work in this thesis searched for an effective combination of cepstral speech features for a monolingual SER system. Experimentation showed that cepstral features derived from the Mel and Bark scales were quite significant for emotion discrimination across both emotion spaces. Artificial Neural Networks (ANN) and Deep Neural Networks (DNN) were chosen for classification. Next, the proposed monolingual SER system... |
Pagination: | xviii, 171 |
URI: | http://hdl.handle.net/10603/380777 |
Appears in Departments: | Department of Electronics & Communication Engineering (Amrita School of Engineering) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 134.07 kB | Adobe PDF | View/Open |
02_certificate.pdf | | 239.59 kB | Adobe PDF | View/Open |
03_preliminary pages.pdf | | 199.66 kB | Adobe PDF | View/Open |
04_chapter 1.pdf | | 124.46 kB | Adobe PDF | View/Open |
05_chapter 2.pdf | | 212.45 kB | Adobe PDF | View/Open |
06_chapter 3.pdf | | 793.8 kB | Adobe PDF | View/Open |
07_chapter 4.pdf | | 370.92 kB | Adobe PDF | View/Open |
08_chapter 5.pdf | | 778.73 kB | Adobe PDF | View/Open |
09_chapter 6.pdf | | 141.17 kB | Adobe PDF | View/Open |
10_bibliography.pdf | | 115.47 kB | Adobe PDF | View/Open |
11_publications.pdf | | 50.81 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 274.8 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).