Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/561178
Title: Dialect Classification and Multi Dialect Speech Recognition
Researcher: Rashmi Kethireddy
Guide(s): Suryakanth V Gangashetty
Keywords: Computer Science
Computer Science Artificial Intelligence
Engineering and Technology
University: International Institute of Information Technology, Hyderabad
Completed Date: 2024
Abstract: Keywords: dialect classification; zero-time windowing; single frequency filtering; frequency domain linear prediction; convolution neural network; ECAPA-TDNN; deepspeech; multi-dialect automatic speech recognition; Indian English ASR

The major goal of this thesis is to study dialectal variations and to improve the performance of speech recognition using embeddings derived from an improved dialect classification system. Initial studies focused on improving a dialect classification system for three major dialects of English (AU: Australian, UK: British, and US: American).

To improve the performance of the dialect classification system, and based on an analysis of dialectal variations, advanced signal processing approaches were investigated for dialect classification with a traditional i-vector system. Features that provide high spectral resolution help to capture subtle differences between dialects. This thesis therefore proposes single frequency filtering (SFF) and zero-time windowing (ZTW) based features, which provide high spectral resolution without compromising temporal resolution. Along with frame-level spectral resolution, longer temporal context contributes to dialect classification. Approaches that enhance the temporal context of the proposed features (SFF and ZTW), such as delta and double-delta coefficients (Δ+ΔΔ) and shifted delta coefficients (SDCs), were therefore tested. The dialect classification system gave promising performance with the proposed features when temporal context was provided by Δ+ΔΔ and SDCs. Further, signal processing approaches that can provide long temporal summarization, such as frequency domain linear prediction (FDLP), are proposed for dialect classification.
Experiments with FDLP based features show that the long temporal summarization they provide is advantageous for discriminating dialects. So, both the signal processing approaches that provide high spectral resolution (SFF and ZTW) and long temporal sum
URI: http://hdl.handle.net/10603/561178
Appears in Departments: Computer Science and Engineering

Files in This Item:
File                     Description     Size         Format
80_recommendation.pdf    Attached File   113.37 kB    Adobe PDF
abstract.pdf                             70.8 kB      Adobe PDF
annexures.pdf                            103.64 kB    Adobe PDF
chapter_1.pdf                            1.67 MB      Adobe PDF
chapter_2.pdf                            292.84 kB    Adobe PDF
chapter_3.pdf                            3.7 MB       Adobe PDF
chapter_4.pdf                            163.29 kB    Adobe PDF
chapter_5.pdf                            3.74 MB      Adobe PDF
chapter_6.pdf                            327.19 kB    Adobe PDF
chapter_7.pdf                            589.53 kB    Adobe PDF
chapter_8.pdf                            1.21 MB      Adobe PDF
chapter_9.pdf                            69.52 kB     Adobe PDF
content.pdf                              76.5 kB      Adobe PDF
preliminary pages.pdf                    241.1 kB     Adobe PDF
title page.pdf                           45.72 kB     Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
