Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/308474
Title: | Learning Representations for Word Images |
Researcher: | Krishnan Praveen |
Guide(s): | Jawahar C.V. |
Keywords: | Computer Science; Computer Science Information Systems; Engineering and Technology |
University: | International Institute of Information Technology, Hyderabad |
Completed Date: | 2020 |
Abstract: | Representation learning has been a key area of investigation in pattern recognition. The primary goal of this thesis is to learn efficient representations for word images from scanned document images. An ideal representation should be invariant to multiple fonts and handwriting styles, and less sensitive to noise and degradations. In this work, we choose the paradigm of learning from data using deep neural networks. The first contribution of this thesis is a simple technique to generate large amounts of synthetic data, useful for pre-training deep neural networks. This led to the creation of the IIIT-HWS dataset, which is now widely used in the document community. The other major contributions of this thesis are: (a) the design of a deep convolutional architecture (named HWNet) for learning an efficient holistic representation for word images, (b) a joint embedding scheme to project words and textual strings onto a common subspace, and (c) a novel form of word image representation which respects the word form along with its semantic meaning. The learned representations are evaluated on the tasks of word spotting and word recognition. We report state-of-the-art performance on popular datasets for both modern/historical and handwritten/printed document images while keeping the representation compact. Finally, in order to validate the proposed representations, we present some interesting use cases such as (i) finding the similarity between a pair of handwritten document images, (ii) searching for keywords in online lecture videos, and (iii) building a word retrieval system for Indic scripts. |
Pagination: | |
URI: | http://hdl.handle.net/10603/308474 |
Appears in Departments: | Computer Science and Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
80_recommendation.pdf | Attached File | 202.94 kB | Adobe PDF |
certificate.pdf | | 44.19 kB | Adobe PDF |
chapter1.pdf | | 2.96 MB | Adobe PDF |
chapter2.pdf | | 7.08 MB | Adobe PDF |
chapter3.pdf | | 4.28 MB | Adobe PDF |
chapter4.pdf | | 3.68 MB | Adobe PDF |
chapter5.pdf | | 3.28 MB | Adobe PDF |
chapter6.pdf | | 8.85 MB | Adobe PDF |
preliminary_pages.pdf | | 876.05 kB | Adobe PDF |
title.pdf | | 75.69 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).