Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/168240
Title: Human Pose Retrieval for Image and Video collections
Researcher: Nataraj J
Guide(s): C V Jawahar and Andrew Zisserman
Keywords: Computer vision, Deep learning, Machine Learning
University: International Institute of Information Technology, Hyderabad
Completed Date: 10/07/2017
Abstract: With the overwhelming amount of visual data on the internet, a search capability for this data is clearly needed. In this thesis, we demonstrate that images and videos can be retrieved using the pose of the humans present in them. Here, pose is the 2D/3D spatial arrangement of anatomical body parts such as arms and legs. Retrieving humans using pose has commercial implications in domains such as dance (the query being a dance pose) and sports (the query being a shot). We propose three pose representations that can be used for retrieval.

Our first pose representation is based on the output of human pose estimation (HPE) algorithms. We improve the reliability of these algorithms by proposing an evaluator that predicts whether an HPE algorithm has succeeded. For our second pose representation, we first introduce deep poselets, built on convolutional neural network (CNN) features, for pose-sensitive detection of various body parts. Second, using these detector responses, we construct a bag-of-poselets representation. Our third pose representation learns a deep neural network that maps the input image to a very low-dimensional space where similar poses are close together and dissimilar poses are farther apart. We show that a pose retrieval system using this low-dimensional representation is on par with the deep poselet representation.

Finally, we describe a method for real-time video retrieval where the task is to match the 2D human pose of a query. The method is scalable and is applied to a dataset of 22 movies totaling more than three million frames. Apart from the query modalities, we introduce two other areas of novelty. First, we show that pose retrieval can proceed using a low-dimensional representation. Second, we show that the precision of the results can be improved substantially by combining the outputs of independent human pose estimation algorithms. The performance of the system is assessed quantitatively over a range of pose queries.
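The idea of retrieving poses in a low-dimensional space can be illustrated with a minimal sketch. This is not the method from the thesis: the embedding values, the frame identifiers, the function names `euclidean` and `retrieve`, and the choice of Euclidean distance are all assumptions made for illustration. It only shows that, once each frame has a short pose vector, retrieval reduces to ranking frames by distance to the query vector.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k=3):
    """Return the ids of the k database entries closest to the query embedding."""
    ranked = sorted(database.items(), key=lambda item: euclidean(query, item[1]))
    return [frame_id for frame_id, _ in ranked[:k]]

# Hypothetical 3-D pose embeddings; a real system would obtain these
# from a learned network, which is not reproduced here.
database = {
    "frame_001": (0.10, 0.90, 0.20),
    "frame_002": (0.80, 0.10, 0.70),
    "frame_003": (0.15, 0.85, 0.25),
    "frame_004": (0.50, 0.50, 0.50),
}

query = (0.12, 0.88, 0.22)  # hypothetical embedding of the query pose
top = retrieve(query, database, k=2)
print(top)
```

A linear scan like this is only practical for small collections; for millions of frames an approximate nearest-neighbour index would normally replace the `sorted` call.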
Pagination: xviii,91
URI: http://hdl.handle.net/10603/168240
Appears in Departments: Computer Science and Engineering

Files in This Item:
File                                 Description     Size        Format
01_title.pdf                         Attached File   48.84 kB    Adobe PDF
02_copyright.pdf                                     54.14 kB    Adobe PDF
03_certificate.pdf                                   19.17 kB    Adobe PDF
04_acknowledgement.pdf                               69.81 kB    Adobe PDF
05_abstract.pdf                                      76.57 kB    Adobe PDF
06_table of contents.pdf                             68.32 kB    Adobe PDF
07_list of figures and tables.pdf                    209.58 kB   Adobe PDF
08_chapter1.pdf                                      172.77 kB   Adobe PDF
09_chapter2.pdf                                      2.62 MB     Adobe PDF
10_chapter3.pdf                                      2.9 MB      Adobe PDF
11_chapter4.pdf                                      5.83 MB     Adobe PDF
12_chapter5.pdf                                      981.06 kB   Adobe PDF
13_chapter6.pdf                                      1.2 MB      Adobe PDF
14_chapter7.pdf                                      319.2 kB    Adobe PDF
15_chapter8.pdf                                      97.92 kB    Adobe PDF
16_relatedpublications.pdf                           46.43 kB    Adobe PDF
17_bibliography.pdf                                  85.96 kB    Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
