Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/589012
Title: | Implicit system for spoken language diarization |
Researcher: | Mishra, Jagabandhu |
Guide(s): | Mahadeva Prasanna, S R |
Keywords: | Engineering and Technology; Engineering Electrical and Electronic; Explicit language representation; Implicit language representation; Language change detection; Language discrimination; Self-supervised implicit representation; Speaker diarization; Spoken language diarization (LD) |
University: | Indian Institute of Technology Dharwad |
Completed Date: | 2024 |
Abstract: | Spoken language diarization (LD) is the task of automatically segmenting and labeling the monolingual segments present in a given code-switched (CS) test utterance. Language information can be modeled using either an implicit or an explicit framework. Most of the work available in the literature uses the explicit framework. However, to generalize to low- and zero-resource languages, implicit frameworks are preferable to explicit ones. The acoustic similarity between languages uttered by a single speaker poses a challenge to obtaining discriminative language representations implicitly. This is analyzed through a human subjective study. Motivated by the outcome of the subjective study, the requirement for larger neighborhood information is incorporated through the analysis window duration, and a priori language knowledge is incorporated through computational models, to derive implicit language representations from speech signals. The performance of language change detection (LCD) using the derived implicit representations is on par with that of the explicit representations.
A fixed segmentation-based LD framework is initially proposed to perform the LD task. Following the observation of confusion in the boundary regions, a change point-based LD framework is then proposed. LD performance is observed to improve when change point information is included during segmentation. Because segments of the secondary language are short, LD performance degrades drastically on the practical dataset. A self-supervised implicit language representation extraction framework is proposed to obtain better language discrimination over short durations. The self-supervised implicit representation resolves the issue and improves LD performance, reducing the Jaccard error rate (JER) from 54.74 to 33.24. Further, using change point-based segmentation improves the LD performance to 28.82 JER. |
Pagination: | xxix, 171 p. |
URI: | http://hdl.handle.net/10603/589012 |
Appears in Departments: | Department of Electrical Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 292.53 kB | Adobe PDF | View/Open |
02_prelim page.pdf | | 416.07 kB | Adobe PDF | View/Open |
03_content.pdf | | 71.45 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 102.11 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 1.91 MB | Adobe PDF | View/Open |
06_chapter 2.pdf | | 612.56 kB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 2.53 MB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 1.6 MB | Adobe PDF | View/Open |
09_chapter 5.pdf | | 2.68 MB | Adobe PDF | View/Open |
10_chapter 6.pdf | | 75.59 kB | Adobe PDF | View/Open |
11_annexures.pdf | | 445.51 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 304.87 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).