Implicit system for spoken language diarization

Mishra, Jagabandhu

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/589012

Title:	Implicit system for spoken language diarization
Researcher:	Mishra, Jagabandhu
Guide(s):	Mahadeva Prasanna, S R
Keywords:	Engineering and Technology; Engineering; Engineering Electrical and Electronic; Explicit language representation; Implicit language representation; Language change detection; Language discrimination; Self-supervised implicit representation; Speaker diarization; Spoken language diarization (LD)
University:	Indian Institute of Technology Dharwad
Completed Date:	2024
Abstract:	Spoken language diarization (LD) is the task of automatically segmenting and labeling the monolingual segments present in a given code-switched (CS) test utterance. Language information can be modeled using either an implicit or an explicit framework. Most of the work available in the literature uses the explicit framework. However, for generalizing to low- and zero-resource languages, implicit frameworks are preferable to explicit ones. The acoustic similarity between languages uttered by a single speaker makes it challenging to obtain a discriminative language representation implicitly. This challenge is analyzed through a human subjective study. Motivated by the outcome of that study, larger neighborhood information is incorporated through the analysis window duration, and a priori language knowledge through computational models, to derive implicit language representations from speech signals. The performance of language change detection (LCD) using the derived implicit representations is on par with that of the explicit representations.

A fixed segmentation-based LD framework is initially proposed to perform the LD task. Because confusion is observed in the boundary regions, a change point-based LD framework is then proposed. LD performance improves when change point information is included during segmentation. Because the secondary-language segments are short, LD performance degrades drastically on the practical dataset. A self-supervised implicit language representation extraction framework is proposed to obtain better language discrimination over short durations. The self-supervised implicit representation resolves this issue, reducing the Jaccard error rate (JER) from 54.74 to 33.24. Further, using change point-based segmentation for LD improves the performance to 28.82 JER.
Pagination:	xxix, 171 p.
URI:	http://hdl.handle.net/10603/589012
Appears in Departments:	Department of Electrical Engineering

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	292.53 kB	Adobe PDF
02_prelim page.pdf		416.07 kB	Adobe PDF
03_content.pdf		71.45 kB	Adobe PDF
04_abstract.pdf		102.11 kB	Adobe PDF
05_chapter 1.pdf		1.91 MB	Adobe PDF
06_chapter 2.pdf		612.56 kB	Adobe PDF
07_chapter 3.pdf		2.53 MB	Adobe PDF
08_chapter 4.pdf		1.6 MB	Adobe PDF
09_chapter 5.pdf		2.68 MB	Adobe PDF
10_chapter 6.pdf		75.59 kB	Adobe PDF
11_annexures.pdf		445.51 kB	Adobe PDF
80_recommendation.pdf		304.87 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET