Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/299275
Title: | Generative model driven representation learning with discriminative classifier for environmental audio scene and sound event recognition |
Researcher: | Jayalakshmi S L |
Guide(s): | Chandrakala S |
Keywords: | Engineering and Technology; Engineering; Engineering Electrical and Electronic; audio scene; sound event recognition |
University: | Anna University |
Completed Date: | 2019 |
Abstract: | The analysis of sound information is useful in multimedia information retrieval, audio surveillance, audio tagging, and forensic applications. Environmental Audio Scene Recognition (EASR) and Sound Event Recognition (SER) are the principal tasks related to audio surveillance systems. EASR refers to the process of recognizing the context or environment of an audio stream, with applications in devices that require contextual awareness. SER aims to recognize the occurrence of a monophonic event in a specific environment. Both tasks are challenging because of multiple sound sources, background noise, and overlapping or polyphonic contexts. In the EASR task, an environmental audio scene is typically long, ranging from a few seconds to a few tens of seconds, and scenes of the same class may differ in duration. One important issue is that when the data of different classes are highly confusable, generative model-based classifiers such as Gaussian Mixture Models (GMMs) are not appropriate, since the model for each class is built using only the samples of that class. This motivates a discriminative model-based approach to recognizing environmental audio scenes. Discriminative model-based classifiers such as support vector machines (SVMs) focus on modeling the decision boundaries between classes. Another issue is that SVMs can handle only fixed-dimensional data, whereas audio scenes vary in duration. In this thesis, these issues are addressed by proposing a hybrid framework that learns model-driven representations for environmental audio scenes and sound events with the help of generative models. |
Pagination: | xxi, 126p. |
URI: | http://hdl.handle.net/10603/299275 |
Appears in Departments: | Faculty of Information and Communication Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
01_title.pdf | Attached File | 40.38 kB | Adobe PDF |
02_certificates.pdf | | 359.12 kB | Adobe PDF |
03_abstracts.pdf | | 44.65 kB | Adobe PDF |
04_acknowledgements.pdf | | 41.91 kB | Adobe PDF |
05_contents.pdf | | 30.27 kB | Adobe PDF |
06_listofabbreviations.pdf | | 49.83 kB | Adobe PDF |
07_chapter1.pdf | | 149.79 kB | Adobe PDF |
08_chapter2.pdf | | 166.41 kB | Adobe PDF |
09_chapter3.pdf | | 120.78 kB | Adobe PDF |
10_chapter4.pdf | | 267.18 kB | Adobe PDF |
11_chapter5.pdf | | 468.4 kB | Adobe PDF |
12_chapter6.pdf | | 441.53 kB | Adobe PDF |
13_conclusion.pdf | | 61.47 kB | Adobe PDF |
14_references.pdf | | 100.71 kB | Adobe PDF |
15_listofpublications.pdf | | 58 kB | Adobe PDF |
80_recommendation.pdf | | 94.03 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).