Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/476964
Title: | Development and evaluation of hybrid machine translation systems for English to Indian language under low resource conditions |
Researcher: | Mrinalini Kannan |
Guide(s): | Vijayalakshmi P |
Keywords: | Bilingual Language Machine Translation Parts-of-Speech |
University: | Anna University |
Completed Date: | 2022 |
Abstract: | India is a multi-cultural and multilingual country with 22 official languages belonging to different linguistic families. Since the time of British colonial rule in India, English has been used as the linguistic medium (L1 language) for administrative and higher education purposes. Post-independence, the use of regional languages (as L1 language) along with English (as L2 language) has been encouraged in the states across the country. However, the usage of either L1 or L2 language varies among the common people. Thus, it is essential to develop machine translation (MT) systems from English to Indian languages for smoother transactions and communication across the country. Among the seven linguistic families of South Asia, Indo-Aryan and Dravidian languages account for over 90% of Indian speakers. On this note, the current research work proposes to develop efficient statistical (SMT) and neural (NMT) machine translation systems for translation from English to two Indian languages, namely Tamil (a Dravidian language) and Hindi (an Indo-Aryan language). Most of the well-established data-driven approaches for developing MT systems require a huge amount of parallel text in the source and target languages to train an efficient translation model. Such large parallel corpora between English and Indian languages are scarce. Further, domain-specific parallel corpora required to develop highly efficient and application-oriented MT systems are also not available. The proposed work makes use of punctuation marks and re-ordering to augment the parallel data available for training the SMT and NMT systems. |
Pagination: | xix,183p. |
URI: | http://hdl.handle.net/10603/476964 |
Appears in Departments: | Faculty of Information and Communication Engineering |
Files in This Item:
File | Description | Size | Format
---|---|---|---
01_title.pdf | Attached File | 22.31 kB | Adobe PDF
02_prelim pages.pdf | | 2.32 MB | Adobe PDF
03_contents.pdf | | 126.87 kB | Adobe PDF
04_abstracts.pdf | | 84.37 kB | Adobe PDF
05_chapter1.pdf | | 385.33 kB | Adobe PDF
06_chapter2.pdf | | 429.65 kB | Adobe PDF
07_chapter3.pdf | | 2.27 MB | Adobe PDF
08_chapter4.pdf | | 1.45 MB | Adobe PDF
09_chapter5.pdf | | 563.12 kB | Adobe PDF
10_annexures.pdf | | 111.21 kB | Adobe PDF
80_recommendation.pdf | | 95.19 kB | Adobe PDF
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).