Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/574325
Title: | Computational Representation of Paninian Rules of Sanskrit Grammar for Dictionary Independent Neural Machine Translation |
Researcher: | Bakarola, Vishvajitsinh Dharmendrasinh |
Guide(s): | Nasriwala, Jitendra Vinodchandra |
Keywords: | Engineering and Technology; Machine Learning; Machine translation |
University: | Uka Tarsadia University |
Completed Date: | 2024 |
Abstract: | Since the advent of digital computers, machine translation has been a captivating pursuit. Over the past seven decades, significant efforts have been dedicated to achieving human-level fluent machine translation. Early on, rule-based approaches dominated, followed by the emergence of phrase-based and statistical methods. However, the field experienced a transformative shift with the introduction of neural machine translation, which harnesses the power of neural networks to enable translation without heavy reliance on explicit linguistic rules. Each approach in machine translation has specific dependencies and limitations. Rule-based methods rely heavily on extensive linguistic rules for both the source and target languages. Statistical and phrase-based approaches, on the other hand, depend on mathematical models and large corpora. In contrast, neural machine translation models rely heavily on large-scale language corpora and require intensive parallel computational resources. Our research work makes a threefold contribution to the field. As the first contribution of our research, we present the multi-domain bilingual parallel corpora SAHAAYAK 2023. The corpora contain two language pairs, Sanskrit-Hindi and Sanskrit-Gujarati, with sample sizes of 1.5M and 149,622, respectively. We found that, prior to this attempt, no corpus was publicly available with Sanskrit as one of the languages in the pair. To make the corpus broadly adaptable, significant effort went into balancing it by incorporating samples from several domains. In the methodological framework introduced in this study, cutting-edge neural machine translation models were devised and subsequently trained on our SAHAAYAK 2023 corpora. The translation model achieved its highest BLEU scores of 61.83 on 25.7K test samples of Sanskrit-Hindi and 36.51 on 10K test samples of Sanskrit-Gujarati. |
Pagination: | xxii;145p |
URI: | http://hdl.handle.net/10603/574325 |
Appears in Departments: | Faculty of Engineering and Technology |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 62.05 kB | Adobe PDF | View/Open |
02_preliminary pages.pdf | | 2.73 MB | Adobe PDF | View/Open |
03_contents.pdf | | 231.16 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 339.79 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 4.21 MB | Adobe PDF | View/Open |
06_chapter 2.pdf | | 5.42 MB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 2.71 MB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 2.42 MB | Adobe PDF | View/Open |
09_chapter 5.pdf | | 5.1 MB | Adobe PDF | View/Open |
10_chapter 6.pdf | | 160.61 kB | Adobe PDF | View/Open |
11_chapter 7.pdf | | 99.1 kB | Adobe PDF | View/Open |
12_annexure.pdf | | 4.3 MB | Adobe PDF | View/Open |
80_recommendation.pdf | | 134.04 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).