Human Pose Retrieval for Image and Video collections

Nataraj J

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/168240

Title:	Human Pose Retrieval for Image and Video collections
Researcher:	Nataraj J
Guide(s):	C V Jawahar and Andrew Zisserman
Keywords:	Computer vision; Deep learning; Machine learning
University:	International Institute of Information Technology, Hyderabad
Completed Date:	10/07/2017
Abstract:	With the overwhelming amount of visual data on the internet, a search capability for this data is clearly needed. In this thesis, we demonstrate that images and videos can be retrieved using the pose of the humans present in them. Here, pose is the 2D/3D spatial arrangement of anatomical body parts such as arms and legs. Retrieving humans using pose has commercial implications in domains such as dance (the query being a dance pose) and sports (the query being a shot). We propose three pose representations that can be used for retrieval.
Our first pose representation is based on the output of human pose estimation (HPE) algorithms. We improve the reliability of these algorithms by proposing an evaluator that predicts whether an HPE algorithm has succeeded. For our second pose representation, we first introduce deep poselets, built on convolutional neural network (CNN) features, for pose-sensitive detection of various body parts; then, using these detector responses, we construct a bag-of-poselets representation. For our third pose representation, we learn a deep neural network that maps the input image to a very low-dimensional space where similar poses are close together and dissimilar poses are farther apart. We show that a pose retrieval system using this low-dimensional representation is on par with the deep poselet representation.
Finally, we describe a method for real-time video retrieval where the task is to match the 2D human pose of a query. The method is scalable and is applied to a dataset of 22 movies totaling more than three million frames. Apart from the query modalities, we introduce two other areas of novelty. First, we show that pose retrieval can proceed using a low-dimensional representation. Second, we show that the precision of the results can be improved substantially by combining the outputs of independent human pose estimation algorithms. The performance of the system is assessed quantitatively over a range of pose queries.
Pagination:	xviii,91
URI:	http://hdl.handle.net/10603/168240
Appears in Departments:	Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	48.84 kB	Adobe PDF
02_copyright.pdf		54.14 kB	Adobe PDF
03_certificate.pdf		19.17 kB	Adobe PDF
04_acknowledgement.pdf		69.81 kB	Adobe PDF
05_abstract.pdf		76.57 kB	Adobe PDF
06_table of contents.pdf		68.32 kB	Adobe PDF
07_list of figures and tables.pdf		209.58 kB	Adobe PDF
08_chapter1.pdf		172.77 kB	Adobe PDF
09_chapter2.pdf		2.62 MB	Adobe PDF
10_chapter3.pdf		2.9 MB	Adobe PDF
11_chapter4.pdf		5.83 MB	Adobe PDF
12_chapter5.pdf		981.06 kB	Adobe PDF
13_chapter6.pdf		1.2 MB	Adobe PDF
14_chapter7.pdf		319.2 kB	Adobe PDF
15_chapter8.pdf		97.92 kB	Adobe PDF
16_relatedpublications.pdf		46.43 kB	Adobe PDF
17_bibliography.pdf		85.96 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET