Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/601387
Title: An Optimized Deep Learning Framework for Continuous Sign Language Recognition
Researcher: Neena Aloysius
Guide(s): Geetha M
Keywords: Computer Science; Artificial Intelligence; Deep Learning; Sign Language; Language Translation; E-Governance Services; Continuous Sign Language; Vision-Based; Movement Epenthesis; Engineering and Technology
University: Amrita Vishwa Vidyapeetham University
Completed Date: 2024
Abstract: Sign language is a form of movement language that conveys semantic information through hand and arm motions, facial expressions, and head/body postures, serving as a crucial communication medium for the deaf community. Researchers are motivated by the desire to integrate deaf individuals into mainstream society, leading to a growing interest in automatic sign language recognition systems. Such recognition involves interpreting static or dynamic signing within the one-arm-distance 3D space around the upper body of the signer.

In this work, a comprehensive literature review is conducted within the domains of vision-based Continuous Sign Language Recognition (CSLR) and Sign Language Translation (SLT). The Deep Learning (DL) strategy is adopted by all recent works. Any DL-based CSLR framework has three main modules: feature extraction, sequence learning, and alignment learning. Feature extraction is usually performed by a CNN, and most works have used LSTMs for sequence learning. Notably, the latest Transformer model and its variants remain under-explored for these tasks. Furthermore, there is a gap in the literature concerning position encoding schemes specific to the Transformer architecture, an investigation that is particularly valuable because the architecture lacks inherent sequential information. Therefore, an extensive literature study is conducted on Transformers, their variants, and the available position encoding schemes.

This research began with the exploration of new positioning schemes for the Transformer model within the context of CSLR and SLT. Consequently, a novel positioning scheme was introduced, utilizing a Gated Recurrent Unit (GRU) as the relative position encoder, and the multi-head attention (MHA) mechanism was modified to integrate relative position embeddings. The resulting Transformer, incorporating both positioning schemes, is referred to as GRU-RST.
Furthermore, it was demonstrated that relative positioning outperformed absolute position encoding for Transformer ...
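The idea of using a GRU as a relative position encoder inside attention, as described in the abstract, can be illustrated with a minimal sketch. All weights here are randomly initialised, the single-head simplification and every name (`run_gru`, `attention_with_gru_relpos`, the clipping range `max_rel`) are illustrative assumptions, not the thesis's exact GRU-RST formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def make_gru(d_in, d_h):
    # Randomly initialised weights for the three GRU gates (update z,
    # reset r, candidate n); biases omitted for brevity.
    shapes = {"Wz": (d_in, d_h), "Uz": (d_h, d_h),
              "Wr": (d_in, d_h), "Ur": (d_h, d_h),
              "Wn": (d_in, d_h), "Un": (d_h, d_h)}
    return {k: rng.standard_normal(s) * 0.1 for k, s in shapes.items()}

def run_gru(p, xs):
    # Standard GRU recurrence over a sequence xs of shape (T, d_in).
    h = np.zeros(p["Uz"].shape[0])
    out = []
    for x in xs:
        z = sigmoid(x @ p["Wz"] + h @ p["Uz"])
        r = sigmoid(x @ p["Wr"] + h @ p["Ur"])
        n = np.tanh(x @ p["Wn"] + (r * h) @ p["Un"])
        h = (1 - z) * n + z * h
        out.append(h)
    return np.stack(out)

def attention_with_gru_relpos(x, max_rel=8):
    """Single-head attention whose logits are biased by GRU-contextualised
    relative-position embeddings (a sketch of the idea, not GRU-RST itself)."""
    T, d = x.shape
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)                     # (T, T) content logits
    # Relative distances i - j, clipped to the table range and shifted
    # to non-negative indices for the embedding lookup.
    table = rng.standard_normal((2 * max_rel + 1, d)) * 0.1
    gru = make_gru(d, d)
    proj = rng.standard_normal(d) * 0.1
    idx = np.clip(np.subtract.outer(np.arange(T), np.arange(T)),
                  -max_rel, max_rel) + max_rel
    # Run the GRU over each query's row of relative-position embeddings,
    # then project each hidden state to a scalar bias per (i, j) pair.
    bias = np.stack([run_gru(gru, table[row]) @ proj for row in idx])
    return softmax(scores + bias) @ v                 # (T, d)

x = rng.standard_normal((5, 16))
y = attention_with_gru_relpos(x)
print(y.shape)  # (5, 16)
```

The GRU pass is what distinguishes this from a plain relative-position bias: each distance embedding is conditioned on the distances preceding it in the row, injecting the sequential information the Transformer otherwise lacks.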
Pagination: x, 109
URI: http://hdl.handle.net/10603/601387
Appears in Departments:Amrita School of Computing

Files in This Item:
File                   Size       Format
01_title.pdf           337.07 kB  Adobe PDF
02_prelim pages.pdf    1.07 MB    Adobe PDF
03_contents.pdf        66.95 kB   Adobe PDF
04_abstract.pdf        54.15 kB   Adobe PDF
05_chapter 1.pdf       2.76 MB    Adobe PDF
06_chapter 2.pdf       152.15 kB  Adobe PDF
07_chapter 3.pdf       615.4 kB   Adobe PDF
08_chapter 4.pdf       153.21 kB  Adobe PDF
09_chapter 5.pdf       1.14 MB    Adobe PDF
10_chapter 6.pdf       667.07 kB  Adobe PDF
11_chapter 7.pdf       1.03 MB    Adobe PDF
12_chapter 8.pdf       2 MB       Adobe PDF
13_chapter 9.pdf       870.51 kB  Adobe PDF
14_chapter 10.pdf      53.92 kB   Adobe PDF
15_chapter 11.pdf      52.27 kB   Adobe PDF
16_annexure.pdf        119.31 kB  Adobe PDF
80_recommendation.pdf  346.46 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
