Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/168240
Full metadata record
DC Field: Value
dc.coverage.spatial:
dc.date.accessioned: 2017-08-18T08:30:24Z
dc.date.available: 2017-08-18T08:30:24Z
dc.identifier.uri: http://hdl.handle.net/10603/168240
dc.description.abstract: With the overwhelming amount of visual data on the internet, a search capability for this data is clearly needed. In this thesis, we demonstrate that images and videos can be retrieved using the pose of the humans present in them. Here, pose is the 2D/3D spatial arrangement of anatomical body parts such as arms and legs. Retrieving humans using pose has commercial implications in domains such as dance (the query being a dance pose) and sports (the query being a shot). We propose three pose representations that can be used for retrieval.

Our first pose representation is based on the output of human pose estimation (HPE) algorithms. We improve the reliability of these algorithms by proposing an evaluator that predicts whether an HPE algorithm has succeeded. For our second pose representation, we first introduce deep poselets, built on convolutional neural network (CNN) features, for pose-sensitive detection of various body parts. Second, using these detector responses, we construct a bag-of-poselets representation. Our third pose representation learns a deep neural network that maps the input image to a very low-dimensional space where similar poses are close together and dissimilar poses are farther apart. We show that a pose retrieval system using this low-dimensional representation is on par with the deep poselet representation.

Finally, we describe a method for real-time video retrieval where the task is to match the 2D human pose of a query. The method is scalable and is applied to a dataset of 22 movies totaling more than three million frames. Apart from the query modalities, we introduce two other areas of novelty. First, we show that pose retrieval can proceed using a low-dimensional representation. Second, we show that the precision of the results can be improved substantially by combining the outputs of independent human pose estimation algorithms. The performance of the system is assessed quantitatively over a range of pose queries.
dc.format.extent: xviii,91
dc.language: English
dc.relation:
dc.rights: self
dc.title: Human Pose Retrieval for Image and Video collections
dc.title.alternative:
dc.creator.researcher: Nataraj J
dc.subject.keyword: Computer vision
dc.subject.keyword: Deep learning
dc.subject.keyword: Machine Learning
dc.description.note:
dc.contributor.guide: C V Jawahar and Andrew Zisserman
dc.publisher.place: Hyderabad
dc.publisher.university: International Institute of Information Technology, Hyderabad
dc.publisher.institution: Computer Science and Engineering
dc.date.registered: 31-7-2009
dc.date.completed: 10/07/2017
dc.date.awarded: 31/07/2017
dc.format.dimensions:
dc.format.accompanyingmaterial: None
dc.source.university: University
dc.type.degree: Ph.D.
Appears in Departments: Computer Science and Engineering

Files in This Item:
File | Description | Size | Format
01_title.pdf | Attached File | 48.84 kB | Adobe PDF
02_copyright.pdf | | 54.14 kB | Adobe PDF
03_certificate.pdf | | 19.17 kB | Adobe PDF
04_acknowledgement.pdf | | 69.81 kB | Adobe PDF
05_abstract.pdf | | 76.57 kB | Adobe PDF
06_table of contents.pdf | | 68.32 kB | Adobe PDF
07_list of figures and tables.pdf | | 209.58 kB | Adobe PDF
08_chapter1.pdf | | 172.77 kB | Adobe PDF
09_chapter2.pdf | | 2.62 MB | Adobe PDF
10_chapter3.pdf | | 2.9 MB | Adobe PDF
11_chapter4.pdf | | 5.83 MB | Adobe PDF
12_chapter5.pdf | | 981.06 kB | Adobe PDF
13_chapter6.pdf | | 1.2 MB | Adobe PDF
14_chapter7.pdf | | 319.2 kB | Adobe PDF
15_chapter8.pdf | | 97.92 kB | Adobe PDF
16_relatedpublications.pdf | | 46.43 kB | Adobe PDF
17_bibliography.pdf | | 85.96 kB | Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).