Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/601387
Title: An Optimized Deep Learning Framework for Continuous Sign Language Recognition
Researcher: Neena Aloysius
Guide(s): Geetha M
Keywords: Computer Science; Artificial Intelligence; Deep Learning; Sign Language; Language Translation; E-Governance Services; Continuous Sign Language; Vision-Based; Movement Epenthesis; Engineering and Technology
University: Amrita Vishwa Vidyapeetham University
Completed Date: 2024
Abstract: Sign language is a form of movement language that conveys semantic information through hand and arm motions, facial expressions, and head/body postures, serving as a crucial communication medium for the deaf community. Researchers are motivated by the desire to integrate deaf individuals into mainstream society, leading to a growing interest in automatic sign language recognition systems. Such recognition involves interpreting static or dynamic signing within the one-arm-distance 3D space around the upper body of the signer.

In this work, a comprehensive literature review is conducted within the domains of vision-based Continuous Sign Language Recognition (CSLR) and Sign Language Translation (SLT). The Deep Learning (DL) strategy is adopted by all recent works. Any DL-based CSLR framework has three main modules: feature extraction, sequence learning, and alignment learning. Feature extraction is usually performed by a CNN, and most works have used LSTMs for sequence learning. Notably, the latest Transformer model and its variants remain under-explored for these tasks. Furthermore, there is a gap in the literature concerning position encoding schemes specific to the Transformer architecture, an investigation that is particularly valuable because the architecture lacks inherent sequential information. Therefore, an extensive literature study is conducted on Transformers, their variants, and the available position encoding schemes.

This research began with the exploration of new positioning schemes for the Transformer model within the context of CSLR and SLT. Consequently, a novel positioning scheme was introduced, utilizing a Gated Recurrent Unit (GRU) as the relative position encoder, and the multi-head attention (MHA) mechanism was modified to integrate relative position embeddings. The resulting Transformer, incorporating both positioning schemes, is referred to as GRU-RST.
Furthermore, it was demonstrated that relative positioning outperformed absolute position encoding for Transformer ...
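The idea of using a GRU as a relative position encoder inside attention, as described in the abstract, can be illustrated with a minimal sketch. All weights here are randomly initialised, the single-head simplification and every name (`run_gru`, `attention_with_gru_relpos`, the clipping range `max_rel`) are illustrative assumptions, not the thesis's exact GRU-RST formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def make_gru(d_in, d_h):
    # Randomly initialised weights for the three GRU gates (update z,
    # reset r, candidate n); biases omitted for brevity.
    shapes = {"Wz": (d_in, d_h), "Uz": (d_h, d_h),
              "Wr": (d_in, d_h), "Ur": (d_h, d_h),
              "Wn": (d_in, d_h), "Un": (d_h, d_h)}
    return {k: rng.standard_normal(s) * 0.1 for k, s in shapes.items()}

def run_gru(p, xs):
    # Standard GRU recurrence over a sequence xs of shape (T, d_in).
    h = np.zeros(p["Uz"].shape[0])
    out = []
    for x in xs:
        z = sigmoid(x @ p["Wz"] + h @ p["Uz"])
        r = sigmoid(x @ p["Wr"] + h @ p["Ur"])
        n = np.tanh(x @ p["Wn"] + (r * h) @ p["Un"])
        h = (1 - z) * n + z * h
        out.append(h)
    return np.stack(out)

def attention_with_gru_relpos(x, max_rel=8):
    """Single-head attention whose logits are biased by GRU-contextualised
    relative-position embeddings (a sketch of the idea, not GRU-RST itself)."""
    T, d = x.shape
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)                     # (T, T) content logits
    # Relative distances i - j, clipped to the table range and shifted
    # to non-negative indices for the embedding lookup.
    table = rng.standard_normal((2 * max_rel + 1, d)) * 0.1
    gru = make_gru(d, d)
    proj = rng.standard_normal(d) * 0.1
    idx = np.clip(np.subtract.outer(np.arange(T), np.arange(T)),
                  -max_rel, max_rel) + max_rel
    # Run the GRU over each query's row of relative-position embeddings,
    # then project each hidden state to a scalar bias per (i, j) pair.
    bias = np.stack([run_gru(gru, table[row]) @ proj for row in idx])
    return softmax(scores + bias) @ v                 # (T, d)

x = rng.standard_normal((5, 16))
y = attention_with_gru_relpos(x)
print(y.shape)  # (5, 16)
```

The GRU pass is what distinguishes this from a plain relative-position bias: each distance embedding is conditioned on the distances preceding it in the row, injecting the sequential information the Transformer otherwise lacks.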
Pagination: x, 109
URI: http://hdl.handle.net/10603/601387
Appears in Departments:Amrita School of Computing

Files in This Item:
File                   Size       Format
01_title.pdf           337.07 kB  Adobe PDF
02_prelim pages.pdf    1.07 MB    Adobe PDF
03_contents.pdf        66.95 kB   Adobe PDF
04_abstract.pdf        54.15 kB   Adobe PDF
05_chapter 1.pdf       2.76 MB    Adobe PDF
06_chapter 2.pdf       152.15 kB  Adobe PDF
07_chapter 3.pdf       615.4 kB   Adobe PDF
08_chapter 4.pdf       153.21 kB  Adobe PDF
09_chapter 5.pdf       1.14 MB    Adobe PDF
10_chapter 6.pdf       667.07 kB  Adobe PDF
11_chapter 7.pdf       1.03 MB    Adobe PDF
12_chapter 8.pdf       2 MB       Adobe PDF
13_chapter 9.pdf       870.51 kB  Adobe PDF
14_chapter 10.pdf      53.92 kB   Adobe PDF
15_chapter 11.pdf      52.27 kB   Adobe PDF
16_annexure.pdf        119.31 kB  Adobe PDF
80_recommendation.pdf  346.46 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
