Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/324398
Title: | An Aggrandized Framework for Improving Large Vocabulary Continuous Speech Recognition LVCSR of Lecture Speech in Indian English |
Researcher: | Disha Kaur Phull |
Guide(s): | Bharadwaja Kumar, G |
Keywords: | Computer Science; Computer Science Interdisciplinary Applications; Engineering and Technology |
University: | VIT University |
Completed Date: | 2020 |
Abstract: | Automatic Speech Recognition (ASR) is concerned with converting spoken utterances in an audio signal into text. It has a wide variety of applications, such as voice user interfaces that support voice dialing, domotic appliance control and voice search. During the past few decades, drastic developments have been reported in ASR for many languages such as English, Finnish and German. However, the development of Indian English (IE) speech recognition models seems to have been largely neglected. IE is one of the varieties of English spoken in the Indian subcontinent, and it differs in pronunciation, vocabulary, dialect and accent from the English spoken in other parts of the world. Also, there is a dearth of resources in terms of transcribed speech data and spoken language corpora in IE. In spite of these challenges, an aggrandized framework has been proposed in this thesis to improve the speech recognition accuracy. In this work, (i) an Indian English acoustic model has been developed that gives a 35% lower Word Error Rate (WER) than existing English acoustic models such as HUB4; (ii) an acoustic-phonetic analysis of vowels and consonants has been carried out to understand the characteristic differences within Indian English varieties; (iii) an effective methodology has been proposed that uses a Wikipedia dump corpus along with Google search to interpolate and adapt the language models closer to the topic of the spoken lecture, which has further reduced the WER to 14% with a major decrease in the perplexity of the language model; (iv) an effective retrieval method is proposed that uses the Elasticsearch framework to retrieve documents using the key phrases identified from the ASR output in order to create a domain-specific language model, reducing the search time by more than 90% in comparison to a conventional search and retrieval mechanism.
Finally, from the reported results, one can conclude that the proposed framework drastically reduces the perplexity as well as the WER and improves the performance of speech recognition of Indian English |
Pagination: | i-ix, 1-111 |
URI: | http://hdl.handle.net/10603/324398 |
Appears in Departments: | School of Computing Science and Engineering -VIT-Chennai |
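The abstract reports its results as Word Error Rate (WER). As a reading aid, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example sentences are hypothetical and are not taken from the thesis, which may use a different scoring tool or text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative only: whitespace tokenization, no case folding or
    punctuation handling (assumptions, not the thesis's scoring setup).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    reference = "the language model is adapted to the lecture topic"
    hypothesis = "the language model is adopted to lecture topic"
    print(f"WER = {word_error_rate(reference, hypothesis):.3f}")
```

Because insertions are counted against the reference length, WER computed this way can exceed 100% when the recognizer output is much longer than the reference.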
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
01_title page.pdf | Attached File | 154.86 kB | Adobe PDF |
02_declaration & certificate.pdf | | 277.72 kB | Adobe PDF |
03_abstract.pdf | | 85.88 kB | Adobe PDF |
04_acknowledgement.pdf | | 65.22 kB | Adobe PDF |
05_table of contents.pdf | | 224.4 kB | Adobe PDF |
06_list of figures.pdf | | 104.08 kB | Adobe PDF |
07_list of tables.pdf | | 72.29 kB | Adobe PDF |
08_list of terms and abbreviations.pdf | | 108.34 kB | Adobe PDF |
09_chapter_01.pdf | | 922.87 kB | Adobe PDF |
10_chapter_02.pdf | | 724.37 kB | Adobe PDF |
11_chapter_03.pdf | | 1.49 MB | Adobe PDF |
12_chapter_04.pdf | | 1.06 MB | Adobe PDF |
13_chapter_05.pdf | | 1.26 MB | Adobe PDF |
14_chapter_06.pdf | | 1.34 MB | Adobe PDF |
15_chapter_07.pdf | | 346.64 kB | Adobe PDF |
16_references.pdf | | 1.13 MB | Adobe PDF |
17_list of publications.pdf | | 64.87 kB | Adobe PDF |
80_recommendation.pdf | | 501.88 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).