Towards Building an Automatic Speech Recognition Systems in Indian Context using Deep Learning

Sai Ganesh, Mirishkar

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/535770

Title:	Towards Building an Automatic Speech Recognition Systems in Indian Context using Deep Learning
Researcher:	Sai Ganesh, Mirishkar
Guide(s):	Anil Kumar, Vuppala
Keywords:	Engineering; Engineering and Technology; Engineering Electrical and Electronic
University:	International Institute of Information Technology, Hyderabad
Completed Date:	2023
Abstract:	Automatic Speech Recognition (ASR) systems are increasingly prevalent in our daily lives, with commercial applications such as Siri, Alexa, and Google Assistant. However, the focus of these systems has been largely angled towards English, leaving a considerable portion of non-English speakers underserved. This is particularly evident in India, a linguistically diverse country with many languages classified as low-resource in the context of ASR due to the scarcity of annotated speech data. This thesis aims to bridge this gap, focusing on enhancing ASR systems for Indian languages using deep learning methodologies. India is a land of language diversity. There are approximately 2000 languages spoken around, and among those officially registered are 23. Of those, very few have ASR capability. This is because building an ASR system requires thousands of hours of annotated speech data, a vast amount of text, and a lexicon that can span all the words in the languages. The necessity for a comprehensive presence in the diverse Indian markets demands the development of multilingual Automatic Speech Recognition (ASR) systems. It's a common scenario where ASR systems for Indian languages have to be implemented in low-resourced contexts. Furthermore, the complexity of the linguistic landscape is amplified due to the high prevalence of bilingualism in the Indian population, leading to frequent instances of code-switching and linguistic borrowing between languages. Operating concurrent ASR systems that can handle code-switching in the Indian context presents a considerable challenge. This predicament has spurred our research endeavors, driving us to focus on constructing a large corpus for one language and leveraging its phonetic space on other language families in monolingual and multilingual ASR scenarios.
This thesis incorporates a crowd-sourcing strategy to collect an extensive speech corpus, particularly for Telugu. Using this approach, around 2000 hours of Telugu speech data, capturing regional variations through three mo
Pagination:	142
URI:	http://hdl.handle.net/10603/535770
Appears in Departments:	Department of Electronic and Communication Engineering

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	45.4 kB	Adobe PDF
02_prelim_pages.pdf		183.42 kB	Adobe PDF
03_content.pdf		39.61 kB	Adobe PDF
04_abstract.pdf		38.64 kB	Adobe PDF
05_chapter_1.pdf		103.97 kB	Adobe PDF
06_chapter_2.pdf		311.7 kB	Adobe PDF
07_chapter_3.pdf		1.56 MB	Adobe PDF
08_chapter_4.pdf		165.25 kB	Adobe PDF
09_chapter_5.pdf		402.07 kB	Adobe PDF
10_chapter_6.pdf		131.74 kB	Adobe PDF
11_chapter_7.pdf		308.37 kB	Adobe PDF
12_chapter_8.pdf		82.52 kB	Adobe PDF
13_annexures.pdf		82.01 kB	Adobe PDF
80_recommendation.pdf		95.28 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET