Algorithms for keyword spotting with application to speech recognition

VIJAYENDRA DESAI

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/300700

Title:	Algorithms for keyword spotting with application to speech recognition
Researcher:	VIJAYENDRA DESAI
Guide(s):	Vishvjit K. Thakar
Keywords:	Engineering; Engineering and Technology; Engineering Electrical and Electronic
University:	Gujarat Technological University
Completed Date:	2017
Abstract:	Speech recognition systems are very useful for interaction between humans and machines. Language is one of the barriers that hinder human-to-human interaction. In situations of armed conflict or natural disaster, we need to communicate with speakers of less prevalent languages. Hence, it is important and useful to develop a speech recognition system for a low-resource language like Gujarati. Applications of local-language speech recognition include agriculture, automatic telephone systems, and voice-operated services. Creating language and acoustic resources for any given spoken language is typically a costly task. For example, a large amount of time and money is required to create annotated speech corpora for Automatic Speech Recognition (ASR) and domain-specific text corpora for Language Modelling (LM). A speech corpus is a database of speech audio files and text transcriptions of those files, in a format that can be used to create acoustic models. For the system to work properly, the spoken words must be identified in the given speech input, so keyword spotting plays a crucial role. In this thesis, our work focuses on an in-ear microphone, compared with a conventional microphone system, to minimize the effects of background noise. In addition, we implement and test endpoint detection algorithms to separate the keywords from silences and other unwanted noise. For feature extraction, we use Real Cepstral Coefficients (RC) and Mel Frequency Cepstral Coefficients (MFCC). We also configured neural networks with two and three layers and tested them for word recognition. For Gujarati speech database generation, various factors are considered, such as the speakers' age (e.g., child, young, old), gender (e.g., male, female), and accent (Kathiyawadi, Surti, Ahmedawadi). In future, our keyword spotting algorithm can be used to drive a robotic arm; hence, the speech database has a vocabulary of ten isolated Gujarati words.
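The abstract does not say which endpoint detection algorithm the thesis uses. As an illustration only, the general idea of separating a keyword from surrounding silence can be sketched with a short-time energy threshold, one common approach. The function names, the frame length of 80 samples, and the threshold ratio of 0.1 are assumptions made for this sketch, not values taken from the thesis.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_endpoints(samples, frame_len=80, threshold_ratio=0.1):
    """Return (start_sample, end_sample) of the active region, or None.

    A frame counts as active when its energy exceeds threshold_ratio
    times the peak frame energy (an assumed, simple decision rule).
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        return None
    peak = max(energies)
    if peak == 0:
        return None  # the whole signal is silent
    threshold = threshold_ratio * peak
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    # End index is exclusive: one past the last active frame.
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Synthetic stand-in for a recorded word: silence, a 440 Hz tone, silence.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
signal = [0.0] * 400 + tone + [0.0] * 400
print(detect_endpoints(signal))
```

A fixed ratio of the peak energy keeps the sketch short; with real recordings, background noise usually calls for a threshold estimated from leading noise-only frames, often combined with a zero-crossing-rate check.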
Pagination:
URI:	http://hdl.handle.net/10603/300700
Appears in Departments:	Electronics & Telecommunication Engineering

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	154 kB	Adobe PDF
02_declaration.pdf		142.17 kB	Adobe PDF
03_certificate.pdf		222.15 kB	Adobe PDF
04_abstract.pdf		140.05 kB	Adobe PDF
05_acknowledgement.pdf		135.41 kB	Adobe PDF
06_content.pdf		124.35 kB	Adobe PDF
07_abbreviations.pdf		110.12 kB	Adobe PDF
08_figures.pdf		137.1 kB	Adobe PDF
09_table.pdf		123.98 kB	Adobe PDF
10_chapter_1.pdf		170.43 kB	Adobe PDF
11_chapter_2.pdf		1.26 MB	Adobe PDF
12_chapter_3.pdf		1.26 MB	Adobe PDF
13_chapter_4.pdf		842.03 kB	Adobe PDF
14_chapter_5.pdf		411.03 kB	Adobe PDF
15_chapter_6.pdf		488.74 kB	Adobe PDF
16_chapter_7.pdf		1.83 MB	Adobe PDF
17_conclusion.pdf		174.97 kB	Adobe PDF
18_refrences.pdf		179.64 kB	Adobe PDF
19_publication.pdf		84.39 kB	Adobe PDF
80_recommendation.pdf		3.08 MB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET