Aligning Textual and Visual Data Towards Scalable Multimedia Retrieval

Kompalli Pramod Sankar

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/125750

Full metadata record

DC Field	Value	Language
dc.coverage.spatial	Computer Vision Pattern Recognition
dc.date.accessioned	2017-01-18T05:42:44Z	-
dc.date.available	2017-01-18T05:42:44Z	-
dc.identifier.uri	http://hdl.handle.net/10603/125750	-
dc.description.abstract	The search and retrieval of images and videos from large multimedia repositories is acknowledged as a hard challenge. With existing solutions, one cannot obtain a detailed, semantic description of a given multimedia document. Towards addressing this challenge, we observe that several multimedia collections contain similar parallel information. For example, the content of a news broadcast is also available in the form of newspaper articles. If a correspondence could be obtained between the videos and such parallel information, one could access one medium using the other. Different (multimedia, parallel information) pairs require different alignment techniques, depending on the granularity at which entities can be matched across them. We choose four such pairs, with the parallel information obtained in the text domain. The framework that we propose begins with the assumption that the multimedia and the text can be segmented into meaningful entities that correspond to each other. The problem, then, is to identify features and learn to match a text entity to a multimedia segment and vice versa. Such a matching scheme can be refined using additional constraints, such as temporal ordering and occurrence statistics. We build algorithms that align (i) movies with scripts and (ii) document images with a lexicon. Further, we relax the above assumption, so that the segmentation of the multimedia is not available a priori. The problem now is to perform a joint inference of segmentation and annotation. A large number of putative segmentations are matched against the information extracted from the parallel text, with the joint inference achieved through dynamic programming. This approach was successfully demonstrated on (i) cricket videos with commentaries and (ii) word images using the text equivalent of the word.
As a consequence of the approaches proposed in this thesis, we were able to demonstrate text-based retrieval systems over large multimedia collections.
dc.format.extent	xxviii,167
dc.language	English
dc.relation	171
dc.rights	self
dc.title	Aligning Textual and Visual Data Towards Scalable Multimedia Retrieval
dc.title.alternative
dc.creator.researcher	Kompalli Pramod Sankar
dc.subject.keyword	Document Recognition and Retrieval
dc.subject.keyword	Multimedia Retrieval
dc.subject.keyword	Video Annotation
dc.description.note
dc.contributor.guide	Prof Jawahar C.V.
dc.publisher.place	Hyderabad
dc.publisher.university	International Institute of Information Technology, Hyderabad
dc.publisher.institution	Computer Science and Engineering
dc.date.registered	30-7-2004
dc.date.completed	13/05/2015
dc.date.awarded	31/07/2015
dc.format.dimensions
dc.format.accompanyingmaterial	None
dc.source.university	University
dc.type.degree	Ph.D.
Appears in Departments:	Computer Science and Engineering
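
The abstract describes jointly inferring segmentation and annotation through dynamic programming, with temporal ordering as a constraint. The sketch below is only an illustration of that general idea, not the thesis's actual algorithm: it assumes a hypothetical cost matrix `cost[k][t]` (mismatch between text entity `k` and multimedia frame `t`) and a hypothetical function `align_segments` that labels every frame with one text entity, so that labels never decrease over time and every entity receives at least one frame.

```python
from typing import List, Sequence


def align_segments(cost: Sequence[Sequence[float]]) -> List[int]:
    """Assign each frame to a text entity in temporal order.

    cost[k][t] is an assumed mismatch score between text entity k and frame t.
    Returns one label per frame; labels are non-decreasing and each entity
    covers a contiguous, non-empty run of frames.
    """
    num_entities = len(cost)
    num_frames = len(cost[0])
    if num_entities > num_frames:
        raise ValueError("need at least one frame per text entity")

    inf = float("inf")
    # best[k][t]: minimal cost of labeling frames 0..t when frame t has label k.
    best = [[inf] * num_frames for _ in range(num_entities)]
    # stay[k][t]: True if frame t-1 also has label k on the best path.
    stay = [[False] * num_frames for _ in range(num_entities)]
    best[0][0] = cost[0][0]

    for t in range(1, num_frames):
        for k in range(min(t + 1, num_entities)):
            same = best[k][t - 1]
            prev = best[k - 1][t - 1] if k > 0 else inf
            if same <= prev:
                best[k][t] = same + cost[k][t]
                stay[k][t] = True
            else:
                best[k][t] = prev + cost[k][t]

    # Backtrack from the last frame, which must carry the last entity.
    labels = [0] * num_frames
    k = num_entities - 1
    for t in range(num_frames - 1, -1, -1):
        labels[t] = k
        if t > 0 and not stay[k][t]:
            k -= 1
    return labels


# Toy example: two text entities, five frames.
toy_cost = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0],
]
toy_labels = align_segments(toy_cost)
```

In this toy case the segment boundary falls where the cheaper entity changes, so the segmentation and the annotation come out of the same pass. A real system would derive the costs from learned features rather than a hand-written table.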

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	70.21 kB	Adobe PDF
02_certificate.pdf		39.86 kB	Adobe PDF
03_acknowledgements.pdf		63.65 kB	Adobe PDF
04_abstract.pdf		118.16 kB	Adobe PDF
05_contents.pdf		168.15 kB	Adobe PDF
06_list_of_tables_figures.pdf		650.61 kB	Adobe PDF
07_chapter 1.pdf		2.48 MB	Adobe PDF
08_chapter 2.pdf		8.11 MB	Adobe PDF
09_chapter 3.pdf		21.72 MB	Adobe PDF
10_chapter 4.pdf		21.4 MB	Adobe PDF
11_chapter 5.pdf		6.45 MB	Adobe PDF
12_chapter 6.pdf		2.99 MB	Adobe PDF
13_chapter 7.pdf		152.99 kB	Adobe PDF
14_references.pdf		619.3 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET