An Aggrandized Framework for Improving Large Vocabulary Continuous Speech Recognition LVCSR of Lecture Speech in Indian English

Disha Kaur Phull

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/324398

Title:	An Aggrandized Framework for Improving Large Vocabulary Continuous Speech Recognition LVCSR of Lecture Speech in Indian English
Researcher:	Disha Kaur Phull
Guide(s):	Bharadwaja Kumar, G
Keywords:	Computer Science; Computer Science Interdisciplinary Applications; Engineering and Technology
University:	VIT University
Completed Date:	2020
Abstract:	Automatic Speech Recognition (ASR) is concerned with converting spoken utterances in an audio signal into text. It has a wide variety of applications, such as voice user interfaces that support voice dialing, domotic appliance control, and voice search. Over the past few decades, substantial progress has been reported in ASR for many languages, such as English, Finnish, and German. However, the development of Indian English (IE) speech recognition models has received little attention. IE is one of the varieties of English spoken in the Indian subcontinent, and it differs in pronunciation, vocabulary, dialect, and accent from the English spoken in other parts of the world. There is also a dearth of resources in terms of transcribed speech data and spoken language corpora in IE. To address these challenges, this thesis proposes an aggrandized framework to improve speech recognition accuracy. In this work, (i) an Indian English acoustic model has been developed that gives a 35% lower Word Error Rate (WER) than existing English acoustic models such as HUB4; (ii) an acoustic-phonetic analysis of vowels and consonants has been carried out to understand the characteristic differences within Indian English varieties; (iii) an effective methodology has been proposed that uses a Wikipedia dump corpus along with Google search to interpolate and adapt the language models closer to the topic of the spoken lecture, which has further reduced the Word Error Rate to 14% with a major decrease in the perplexity of the language model; (iv) an effective retrieval method has been proposed that uses the Elasticsearch framework to retrieve documents using the key phrases identified from the ASR output in order to create a domain-specific language model, which reduces the search time by more than 90% in comparison to a conventional search and retrieval mechanism.
Finally, from the reported results, one can conclude that the proposed framework drastically reduces the perplexity as well as the WER and improves the performance of speech recognition of Indian English
Pagination:	i-ix, 1-111
URI:	http://hdl.handle.net/10603/324398
Appears in Departments:	School of Computing Science and Engineering -VIT-Chennai
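
The abstract reports its results as Word Error Rate (WER). As a reading aid, the sketch below shows the standard definition of WER: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. This is a minimal illustration, not the thesis's own scoring tool; the function name word_error_rate and the example sentences are hypothetical, and tokenization is assumed to be plain whitespace splitting.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; assumes whitespace tokenization
    and case-sensitive word comparison.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one substitution and one deletion over six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.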

Files in This Item:

File	Description	Size	Format
01_title page.pdf	Attached File	154.86 kB	Adobe PDF
02_declaration & certificate.pdf		277.72 kB	Adobe PDF
03_abstract.pdf		85.88 kB	Adobe PDF
04_acknowledgement.pdf		65.22 kB	Adobe PDF
05_table of contents.pdf		224.4 kB	Adobe PDF
06_list of figures.pdf		104.08 kB	Adobe PDF
07_list of tables.pdf		72.29 kB	Adobe PDF
08_list of terms and abbreviations.pdf		108.34 kB	Adobe PDF
09_chapter_01.pdf		922.87 kB	Adobe PDF
10_chapter_02.pdf		724.37 kB	Adobe PDF
11_chapter_03.pdf		1.49 MB	Adobe PDF
12_chapter_04.pdf		1.06 MB	Adobe PDF
13_chapter_05.pdf		1.26 MB	Adobe PDF
14_chapter_06.pdf		1.34 MB	Adobe PDF
15_chapter_07.pdf		346.64 kB	Adobe PDF
16_references.pdf		1.13 MB	Adobe PDF
17_list of publications.pdf		64.87 kB	Adobe PDF
80_recommendation.pdf		501.88 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET