Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/300700
Title: Algorithms for keyword spotting with application to speech recognition
Researcher: VIJAYENDRA DESAI
Guide(s): Vishvjit K. Thakar
Keywords: Engineering
Engineering and Technology
Engineering Electrical and Electronic
University: Gujarat Technological University
Completed Date: 2017
Abstract: Speech recognition systems are very useful for interaction between humans and machines. Language is one of the barriers that hinder human-to-human interaction. In scenarios of armed conflict or natural disaster, we need to communicate with speakers of less prevalent languages. Hence, it is important and useful to develop a speech recognition system for a low-resource language like Gujarati. Applications of local-language speech recognition include agriculture, automatic telephone systems, and voice-operated services. The creation of language and acoustic resources for any given spoken language is typically a costly task. For example, a large amount of time and money is required for the proper creation of annotated speech corpora for Automatic Speech Recognition (ASR) and domain-specific text corpora for Language Modelling (LM). A speech corpus is a database of speech audio files and text transcriptions of these audio files in a format that can be used to create acoustic models. For the system to work properly, it must identify the spoken words in the given speech input, i.e., keyword spotting plays a crucial role. In this thesis, our work focuses on an in-ear microphone, compared with a conventional microphone system, to minimize the effects of background noise. In addition, we implemented and tested endpoint detection algorithms to separate the keywords from silences and other unwanted noise. For feature extraction, we use Real Cepstral Coefficients (RC) and Mel Frequency Cepstral Coefficients (MFCC). We also configured two- and three-layer neural networks and tested them for word recognition. For Gujarati speech database generation, various factors are considered, such as speakers of various ages (e.g., child, young, old), gender (e.g., male, female), and accent (kathiyawadi, sortie, ahmedawadi). In future, our keyword spotting algorithm can be used to drive a robotic arm; hence the speech database has a vocabulary of ten isolated Gujarati words.
Pagination: 
URI: http://hdl.handle.net/10603/300700
Appears in Departments: Electronics & Telecommunication Engineering

Files in This Item:
File                     Description     Size        Format
01_title.pdf             Attached File   154 kB      Adobe PDF
02_declaration.pdf                       142.17 kB   Adobe PDF
03_certificate.pdf                       222.15 kB   Adobe PDF
04_abstract.pdf                          140.05 kB   Adobe PDF
05_acknowledgement.pdf                   135.41 kB   Adobe PDF
06_content.pdf                           124.35 kB   Adobe PDF
07_abbreviations.pdf                     110.12 kB   Adobe PDF
08_figures.pdf                           137.1 kB    Adobe PDF
09_table.pdf                             123.98 kB   Adobe PDF
10_chapter_1.pdf                         170.43 kB   Adobe PDF
11_chapter_2.pdf                         1.26 MB     Adobe PDF
12_chapter_3.pdf                         1.26 MB     Adobe PDF
13_chapter_4.pdf                         842.03 kB   Adobe PDF
14_chapter_5.pdf                         411.03 kB   Adobe PDF
15_chapter_6.pdf                         488.74 kB   Adobe PDF
16_chapter_7.pdf                         1.83 MB     Adobe PDF
17_conclusion.pdf                        174.97 kB   Adobe PDF
18_refrences.pdf                         179.64 kB   Adobe PDF
19_publication.pdf                       84.39 kB    Adobe PDF
80_recommendation.pdf                    3.08 MB     Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
