Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/561178
Title: | Dialect Classification and Multi Dialect Speech Recognition |
Researcher: | Rashmi Kethireddy |
Guide(s): | Suryakanth V Gangashetty |
Keywords: | Computer Science; Computer Science Artificial Intelligence; Engineering and Technology |
University: | International Institute of Information Technology, Hyderabad |
Completed Date: | 2024 |
Abstract: | Keywords: dialect classification; zero-time windowing; single frequency filtering; frequency domain linear prediction; convolutional neural network; ECAPA-TDNN; DeepSpeech; multi-dialect automatic speech recognition; Indian English ASR. The major goal of this thesis is to study dialectal variations and to improve the performance of speech recognition using embeddings derived from an improved dialect classification system. Initial studies focused on improving a dialect classification system for three major dialects of English (AU: Australian, UK: British, and US: American). To improve the performance of the dialect classification system, and based on an analysis of dialectal variations, advanced signal processing approaches were investigated for dialect classification with a traditional i-vector system. Features that provide high spectral resolution help capture subtle differences between dialects. The thesis therefore proposes features based on single frequency filtering (SFF) and zero-time windowing (ZTW), which provide high spectral resolution without compromising temporal resolution. Along with frame-level spectral resolution, longer temporal context also contributes to dialect classification. Approaches that enhance the temporal context of the proposed features (SFF and ZTW), such as delta and double-delta coefficients (Δ+ΔΔ) and shifted delta coefficients (SDCs), were therefore tested. The dialect classification system showed promising performance with the proposed features when temporal context was provided by Δ+ΔΔ and SDCs. Further, signal processing approaches that can provide long temporal summarization, such as frequency domain linear prediction (FDLP), are proposed for dialect classification. Experiments with FDLP-based features show that the long temporal summarization they provide is advantageous for discriminating dialects. So, both the signal processing approaches that provide high spectral resolution (SFF and ZTW) and long temporal sum |
Pagination: | |
URI: | http://hdl.handle.net/10603/561178 |
Appears in Departments: | Computer Science and Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
80_recommendation.pdf | Attached File | 113.37 kB | Adobe PDF | View/Open |
abstract.pdf | | 70.8 kB | Adobe PDF | View/Open |
annexures.pdf | | 103.64 kB | Adobe PDF | View/Open |
chapter_1.pdf | | 1.67 MB | Adobe PDF | View/Open |
chapter_2.pdf | | 292.84 kB | Adobe PDF | View/Open |
chapter_3.pdf | | 3.7 MB | Adobe PDF | View/Open |
chapter_4.pdf | | 163.29 kB | Adobe PDF | View/Open |
chapter_5.pdf | | 3.74 MB | Adobe PDF | View/Open |
chapter_6.pdf | | 327.19 kB | Adobe PDF | View/Open |
chapter_7.pdf | | 589.53 kB | Adobe PDF | View/Open |
chapter_8.pdf | | 1.21 MB | Adobe PDF | View/Open |
chapter_9.pdf | | 69.52 kB | Adobe PDF | View/Open |
content.pdf | | 76.5 kB | Adobe PDF | View/Open |
preliminary pages.pdf | | 241.1 kB | Adobe PDF | View/Open |
title page.pdf | | 45.72 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).