Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/574325
Title: Computational Representation of Paninian Rules of Sanskrit Grammar for Dictionary Independent Neural Machine Translation
Researcher: Bakarola, Vishvajitsinh Dharmendrasinh
Guide(s): Nasriwala, Jitendra Vinodchandra
Keywords: Engineering and Technology; Machine Learning; Machine translation
University: Uka Tarsadia University
Completed Date: 2024
Abstract: Since the advent of digital computers, machine translation has been a captivating pursuit. Over the past seven decades, significant effort has been dedicated to achieving fluent, human-level machine translation. Early on, rule-based approaches dominated, followed by the emergence of phrase-based and statistical methods. The field then underwent a transformative shift with the introduction of neural machine translation, which uses neural networks to translate without heavy reliance on explicit linguistic rules. Each approach has specific dependencies and limitations. Rule-based methods rely heavily on extensive linguistic rules for both the source and target languages. Statistical and phrase-based approaches depend on mathematical models and large corpora. Neural machine translation models, in contrast, rely heavily on large-scale language corpora and require intensive parallel computational resources. Our research work makes a threefold contribution to the field.
As the first contribution of this research, we present SAHAAYAK 2023, a set of multi-domain bilingual parallel corpora. The corpora cover two language pairs, Sanskrit-Hindi and Sanskrit-Gujarati, with sample sizes of 1.5M and 149,622, respectively. We found that, prior to this work, no corpus with Sanskrit as one of the languages in the pair was publicly available. To make the corpus broadly adaptable, significant effort went into balancing it by incorporating samples from several domains.
Within the methodological framework introduced in this study, cutting-edge neural machine translation models were devised and subsequently trained on our SAHAAYAK 2023 corpora. The translation model achieved its highest BLEU scores of 61.83 on 25.7K Sanskrit-Hindi test samples and 36.51 on 10K Sanskrit-Gujarati test samples.
Pagination: xxii;145p
URI: http://hdl.handle.net/10603/574325
Appears in Departments: Faculty of Engineering and Technology

Files in This Item:
File                      Description    Size       Format
01_title.pdf              Attached File  62.05 kB   Adobe PDF
02_preliminary pages.pdf                 2.73 MB    Adobe PDF
03_contents.pdf                          231.16 kB  Adobe PDF
04_abstract.pdf                          339.79 kB  Adobe PDF
05_chapter 1.pdf                         4.21 MB    Adobe PDF
06_chapter 2.pdf                         5.42 MB    Adobe PDF
07_chapter 3.pdf                         2.71 MB    Adobe PDF
08_chapter 4.pdf                         2.42 MB    Adobe PDF
09_chapter 5.pdf                         5.1 MB     Adobe PDF
10_chapter 6.pdf                         160.61 kB  Adobe PDF
11_chapter 7.pdf                         99.1 kB    Adobe PDF
12_annexure.pdf                          4.3 MB     Adobe PDF
80_recommendation.pdf                    134.04 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
