Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/535770
Title: | Towards Building an Automatic Speech Recognition Systems in Indian Context using Deep Learning |
Researcher: | Sai Ganesh, Mirishkar |
Guide(s): | Anil Kumar, Vuppala |
Keywords: | Engineering; Engineering and Technology; Engineering, Electrical and Electronic |
University: | International Institute of Information Technology, Hyderabad |
Completed Date: | 2023 |
Abstract: | Automatic Speech Recognition (ASR) systems are increasingly prevalent in our daily lives, with commercial applications such as Siri, Alexa, and Google Assistant. However, the focus of these systems has been largely on English, leaving a considerable portion of non-English speakers underserved. This is particularly evident in India, a linguistically diverse country with many languages classified as low-resource in the context of ASR due to the scarcity of annotated speech data. This thesis aims to bridge this gap, focusing on enhancing ASR systems for Indian languages using deep learning methodologies. India is a land of language diversity. Approximately 2000 languages are spoken there, of which 23 are officially registered. Of those, very few have ASR capability. This is because building an ASR system requires thousands of hours of annotated speech data, a vast amount of text, and a lexicon that spans all the words in the language. Reaching the diverse Indian markets demands the development of multilingual ASR systems. ASR systems for Indian languages commonly have to be implemented in low-resource contexts. Furthermore, the linguistic landscape is complicated by the high prevalence of bilingualism in the Indian population, which leads to frequent code-switching and linguistic borrowing between languages. Operating concurrent ASR systems that can handle code-switching in the Indian context presents a considerable challenge. This predicament motivated our research, which focuses on constructing a large corpus for one language and leveraging its phonetic space for other language families in monolingual and multilingual ASR scenarios.
This thesis incorporates a crowd-sourcing strategy to collect an extensive speech corpus, particularly for Telugu. Using this approach, around 2000 hours of Telugu speech data, capturing regional variations through three mo |
Pagination: | 142 |
URI: | http://hdl.handle.net/10603/535770 |
Appears in Departments: | Department of Electronic and Communication Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 45.4 kB | Adobe PDF | View/Open |
02_prelim_pages.pdf | | 183.42 kB | Adobe PDF | View/Open |
03_content.pdf | | 39.61 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 38.64 kB | Adobe PDF | View/Open |
05_chapter_1.pdf | | 103.97 kB | Adobe PDF | View/Open |
06_chapter_2.pdf | | 311.7 kB | Adobe PDF | View/Open |
07_chapter_3.pdf | | 1.56 MB | Adobe PDF | View/Open |
08_chapter_4.pdf | | 165.25 kB | Adobe PDF | View/Open |
09_chapter_5.pdf | | 402.07 kB | Adobe PDF | View/Open |
10_chapter_6.pdf | | 131.74 kB | Adobe PDF | View/Open |
11_chapter_7.pdf | | 308.37 kB | Adobe PDF | View/Open |
12_chapter_8.pdf | | 82.52 kB | Adobe PDF | View/Open |
13_annexures.pdf | | 82.01 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 95.28 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).