Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/589012
Title: Implicit system for spoken language diarization
Researcher: Mishra, Jagabandhu
Guide(s): Mahadeva Prasanna, S R
Keywords: Engineering and Technology
Engineering Electrical and Electronic
Explicit language representation
Implicit language representation
Language change detection
Language discrimination
Self-supervised implicit representation
Speaker diarization
Spoken language diarization (LD)
University: Indian Institute of Technology Dharwad
Completed Date: 2024
Abstract: Spoken language diarization (LD) is the task of automatically segmenting and labeling the monolingual segments present in a given code-switched (CS) test utterance. Language information can be modeled using an implicit or an explicit framework. Most of the work available in the literature uses the explicit framework. However, to generalize to low- and zero-resource languages, implicit frameworks are preferable to explicit ones. The acoustic similarity between languages uttered by a single speaker poses a challenge to obtaining discriminative language representations implicitly. This challenge is analyzed through a human subjective study. Motivated by the outcome of the subjective study, the requirement for larger neighborhood information is incorporated through the analysis window duration, and a priori language knowledge is incorporated through computational models, to derive implicit language representations from speech signals. The performance of language change detection (LCD) using the derived implicit representations is on par with that of the explicit representations.
A fixed segmentation-based LD framework is initially proposed to perform the LD task. Observing the confusion in the boundary regions, a change point-based LD framework is then proposed. LD performance improves when change point information is included during segmentation. Because the secondary-language segments are short, LD performance degrades drastically on the practical dataset. A self-supervised implicit language representation extraction framework is proposed to obtain better language discrimination over short durations. The self-supervised implicit representation resolves this issue and improves the LD performance from 54.74 to 33.24 Jaccard error rate (JER). Further, the use of change point-based segmentation improves the LD performance to 28.82 JER.
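The abstract reports LD performance as Jaccard error rate (JER). As a rough illustration of what that number measures, the following is a minimal frame-level sketch, assuming the reference and the system output are given as equal-length sequences of per-frame language labels. The function name `jaccard_error_rate` and the labels `"hi"` and `"en"` are hypothetical, and the thesis may score differently (for example with segment-level timing or boundary collars), so this is not the author's evaluation code.

```python
from typing import Sequence


def jaccard_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Frame-level JER in percent, averaged over languages in the reference.

    Assumption: ref and hyp hold one language label per frame and are
    aligned frame by frame. This is an illustrative simplification.
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same number of frames")
    languages = sorted(set(ref))
    if not languages:
        raise ValueError("reference must contain at least one frame")

    errors = []
    for lang in languages:
        ref_frames = {i for i, label in enumerate(ref) if label == lang}
        hyp_frames = {i for i, label in enumerate(hyp) if label == lang}
        union = ref_frames | hyp_frames  # non-empty: lang occurs in ref
        intersection = ref_frames & hyp_frames
        # Jaccard error for this language: 1 - |intersection| / |union|
        errors.append(1.0 - len(intersection) / len(union))

    return 100.0 * sum(errors) / len(errors)


# Hypothetical 10-frame utterance with a short secondary-language segment.
reference = ["hi"] * 6 + ["en"] * 2 + ["hi"] * 2
system = ["hi"] * 7 + ["en"] * 1 + ["hi"] * 2
print(f"JER = {jaccard_error_rate(reference, system):.2f}")
```

In this toy case a single misplaced boundary frame costs the two-frame secondary-language segment half of its overlap, giving a JER of about 30.56. This is consistent with the abstract's observation that short secondary-language segments weigh heavily on LD performance.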
Pagination: xxix, 171 p.
URI: http://hdl.handle.net/10603/589012
Appears in Departments: Department of Electrical Engineering

Files in This Item:
File                   Description    Size       Format
01_title.pdf           Attached File  292.53 kB  Adobe PDF
02_prelim page.pdf                    416.07 kB  Adobe PDF
03_content.pdf                        71.45 kB   Adobe PDF
04_abstract.pdf                       102.11 kB  Adobe PDF
05_chapter 1.pdf                      1.91 MB    Adobe PDF
06_chapter 2.pdf                      612.56 kB  Adobe PDF
07_chapter 3.pdf                      2.53 MB    Adobe PDF
08_chapter 4.pdf                      1.6 MB     Adobe PDF
09_chapter 5.pdf                      2.68 MB    Adobe PDF
10_chapter 6.pdf                      75.59 kB   Adobe PDF
11_annexures.pdf                      445.51 kB  Adobe PDF
80_recommendation.pdf                 304.87 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
