Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/394299
Title: | Design and development of a framework for visual attention based automatic image description generation with game theoretic optimization |
Researcher: | Sreela S R |
Guide(s): | Suman Mary Idicula |
Keywords: | Automatic image description generation; Computer Science; Computer Science Interdisciplinary Applications; Engineering and Technology |
University: | Cochin University of Science and Technology |
Completed Date: | 2022 |
Abstract: | This thesis deals with the development of an automatic image description system and its applications. We consider an encoder-decoder architecture with a visual attention mechanism for image description generation. The system uses a densely connected convolutional neural network (DenseNet) as the encoder and a bidirectional LSTM as the decoder. The generated caption is further optimized using a cooperative game-theoretic search. Two applications are designed and implemented with the proposed architecture: an aid for visually impaired people and medical image captioning.
An enduring vision of Artificial Intelligence is to build robots that can recognize and learn the visual world and speak about it in natural language. Object recognition has advanced significantly in recent years. Automatic image description generation is a demanding problem in Computer Vision and Natural Language Processing. Sentence-level annotation of images and video enhances image indexing, searching, and retrieval, which is vital for Content-Based Image Retrieval (CBIR). Image description generation systems have applications in biomedicine, the military, commerce, digital libraries, education, and web searching. A description should cover the scene, actions, objects, and so on.
The automatic image description generation system proposed in this thesis consists of several phases: image feature extraction, visual attention, caption generation, and caption optimization. Image feature extraction is evaluated with multiple CNNs, such as VGG, ResNet, and DenseNet. Of these, DenseNet gives better performance for the automatic image description generation system.
The visual attention model is used to find salient regions in the image. Spatial attention, channel-wise attention, and layer-wise attention are designed and developed. |
Pagination: | 150 |
URI: | http://hdl.handle.net/10603/394299 |
Appears in Departments: | Department of Computer Science |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
01_title.pdf | Attached File | 53.19 kB | Adobe PDF |
02_declaration.pdf | Attached File | 40.84 kB | Adobe PDF |
03_certificate.pdf | Attached File | 41.75 kB | Adobe PDF |
04_acknowledgement.pdf | Attached File | 44.6 kB | Adobe PDF |
05_content.pdf | Attached File | 44.8 kB | Adobe PDF |
06_list of graph and table.pdf | Attached File | 44.08 kB | Adobe PDF |
07_abstract.pdf | Attached File | 43.87 kB | Adobe PDF |
08_chapter1.pdf | Attached File | 62.65 kB | Adobe PDF |
09_chapter2.pdf | Attached File | 422.02 kB | Adobe PDF |
10_chapter3.pdf | Attached File | 136.69 kB | Adobe PDF |
11_chapter4.pdf | Attached File | 345.89 kB | Adobe PDF |
12_chapter5.pdf | Attached File | 195.63 kB | Adobe PDF |
13_chapter6.pdf | Attached File | 309.74 kB | Adobe PDF |
14_chapter7.pdf | Attached File | 120.31 kB | Adobe PDF |
15_chapter8.pdf | Attached File | 1.4 MB | Adobe PDF |
16_chapter9.pdf | Attached File | 288.2 kB | Adobe PDF |
17_chapter10.pdf | Attached File | 46.12 kB | Adobe PDF |
18_reference.pdf | Attached File | 92.72 kB | Adobe PDF |
80_recommendation.pdf | Attached File | 59.4 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).