Dialect Classification and Multi Dialect Speech Recognition

Rashmi Kethireddy

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/561178

Title:	Dialect Classification and Multi Dialect Speech Recognition
Researcher:	Rashmi Kethireddy
Guide(s):	Suryakanth V Gangashetty
Keywords:	Computer Science; Computer Science Artificial Intelligence; Engineering and Technology
University:	International Institute of Information Technology, Hyderabad
Completed Date:	2024
Abstract:	Keywords: dialect classification; zero-time windowing; single frequency filtering; frequency domain linear prediction; convolutional neural network; ECAPA-TDNN; DeepSpeech; multi-dialect automatic speech recognition; Indian English ASR. The major goal of this thesis is to study dialectal variations and to improve the performance of speech recognition with embeddings derived from an improved dialect classification system. Initial studies focused on improving a dialect classification system for three major dialects of English (AU: Australian, UK: British, and US: American). To improve the performance of the dialect classification system, and based on an analysis of dialectal variations, advanced signal processing approaches were investigated for dialect classification with a traditional i-vector system. Features that provide high spectral resolution help to capture subtle differences between dialects. So, this thesis proposes single frequency filtering (SFF) and zero-time windowing (ZTW) based features, which provide high spectral resolution without compromising temporal resolution. Along with frame-level spectral resolution, longer temporal context also contributes to dialect classification. So, approaches that enhance the temporal context of the proposed features (SFF and ZTW), such as delta and double-delta coefficients (Δ+ΔΔ) and shifted delta coefficients (SDCs), were evaluated. The dialect classification system gave promising performance with the proposed features when temporal context was provided by Δ+ΔΔ and SDCs. Further, signal processing approaches that provide long temporal summarization, such as frequency domain linear prediction (FDLP), are proposed for dialect classification.
From experiments with FDLP-based features, it is observed that the long temporal summarization they provide is advantageous for discriminating dialects. So, both the signal processing approaches that provide high spectral resolution (SFF and ZTW) and long temporal sum
Pagination:
URI:	http://hdl.handle.net/10603/561178
Appears in Departments:	Computer Science and Engineering
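
The abstract names delta and double-delta coefficients (Δ+ΔΔ) and shifted delta coefficients (SDCs) as ways of adding temporal context to frame-level features. The sketch below illustrates the textbook definitions of both on a generic (frames × dimensions) feature matrix. It is not taken from the thesis: the function names `delta` and `shifted_delta`, the regression width, the SDC parameters (d=1, P=3, k=7), and the random input standing in for SFF or ZTW features are all assumptions chosen for illustration.

```python
import numpy as np


def delta(feats, width=2):
    """Regression-based delta of a (frames, dims) array.

    Edges are handled by repeating the first and last frame.
    `width` is an assumed setting, not a value from the thesis.
    """
    n_frames = feats.shape[0]
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(feats, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def shifted_delta(feats, d=1, P=3, k=7):
    """Shifted delta coefficients: for each frame t, stack k simple deltas
    c[t + i*P + d] - c[t + i*P - d] for i = 0..k-1.

    d, P, k are assumed illustrative values.
    """
    n_frames = feats.shape[0]
    padded = np.pad(feats, ((d, (k - 1) * P + d), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        center = i * P + d  # position of frame t + i*P (for t = 0) in `padded`
        ahead = padded[center + d : center + d + n_frames]
        behind = padded[center - d : center - d + n_frames]
        blocks.append(ahead - behind)
    return np.hstack(blocks)


# Placeholder features: 100 frames of 13 coefficients (random, for shape only).
feats = np.random.default_rng(0).normal(size=(100, 13))
d1 = delta(feats)
d2 = delta(d1)
static_delta = np.hstack([feats, d1, d2])  # static + delta + double-delta
sdc = shifted_delta(feats[:, :7])          # k stacked delta blocks per frame
```

Double-delta is obtained here by applying the same regression to the delta stream. SDCs cover a longer span than Δ+ΔΔ because each frame's vector includes deltas taken up to (k-1)·P frames ahead.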

Files in This Item:

File	Description	Size	Format
80_recommendation.pdf	Attached File	113.37 kB	Adobe PDF
abstract.pdf		70.8 kB	Adobe PDF
annexures.pdf		103.64 kB	Adobe PDF
chapter_1.pdf		1.67 MB	Adobe PDF
chapter_2.pdf		292.84 kB	Adobe PDF
chapter_3.pdf		3.7 MB	Adobe PDF
chapter_4.pdf		163.29 kB	Adobe PDF
chapter_5.pdf		3.74 MB	Adobe PDF
chapter_6.pdf		327.19 kB	Adobe PDF
chapter_7.pdf		589.53 kB	Adobe PDF
chapter_8.pdf		1.21 MB	Adobe PDF
chapter_9.pdf		69.52 kB	Adobe PDF
content.pdf		76.5 kB	Adobe PDF
preliminary pages.pdf		241.1 kB	Adobe PDF
title page.pdf		45.72 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET