Computational Representation of Paninian Rules of Sanskrit Grammar for Dictionary Independent Neural Machine Translation

Bakarola, Vishvajitsinh Dharmendrasinh

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/574325

Title:	Computational Representation of Paninian Rules of Sanskrit Grammar for Dictionary Independent Neural Machine Translation
Researcher:	Bakarola, Vishvajitsinh Dharmendrasinh
Guide(s):	Nasriwala, Jitendra Vinodchandra
Keywords:	Engineering and Technology; Machine Learning; Machine Translation
University:	Uka Tarsadia University
Completed Date:	2024
Abstract:	Since the advent of digital computers, machine translation has been a captivating pursuit. Over the past seven decades, significant effort has been dedicated to achieving human-level, fluent machine translation. Early on, rule-based approaches dominated, followed by the emergence of phrase-based and statistical methods. However, the field experienced a transformative shift with the introduction of neural machine translation, which harnesses neural networks to enable translation without heavy reliance on explicit linguistic rules. Each approach to machine translation has specific dependencies and limitations. Rule-based methods rely heavily on extensive linguistic rules for both the source and target languages. Statistical and phrase-based approaches depend on mathematical models and large corpora. Neural machine translation models, in contrast, rely on large-scale language corpora and require intensive parallel computational resources. Our research makes a threefold contribution to the field. As the first contribution, we present SAHAAYAK 2023, a set of multi-domain bilingual parallel corpora. The corpora cover two language pairs, Sanskrit-Hindi and Sanskrit-Gujarati, with sample sizes of 1.5M and 149,622, respectively. We found that, prior to this attempt, no corpus was publicly available with Sanskrit as one of the languages in the pair. To make the corpus broadly applicable, significant effort went into balancing it by incorporating samples from several domains. Within the methodological framework introduced in the study, cutting-edge neural machine translation models were devised and subsequently trained on the SAHAAYAK 2023 corpora. The translation model achieved its highest BLEU scores of 61.83 on 25.7K test samples for Sanskrit-Hindi and 36.51 on 10K test samples for Sanskrit-Gujarati.
Pagination:	xxii;145p
URI:	http://hdl.handle.net/10603/574325
Appears in Departments:	Faculty of Engineering and Technology

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	62.05 kB	Adobe PDF
02_preliminary pages.pdf		2.73 MB	Adobe PDF
03_contents.pdf		231.16 kB	Adobe PDF
04_abstract.pdf		339.79 kB	Adobe PDF
05_chapter 1.pdf		4.21 MB	Adobe PDF
06_chapter 2.pdf		5.42 MB	Adobe PDF
07_chapter 3.pdf		2.71 MB	Adobe PDF
08_chapter 4.pdf		2.42 MB	Adobe PDF
09_chapter 5.pdf		5.1 MB	Adobe PDF
10_chapter 6.pdf		160.61 kB	Adobe PDF
11_chapter 7.pdf		99.1 kB	Adobe PDF
12_annexure.pdf		4.3 MB	Adobe PDF
80_recommendation.pdf		134.04 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET