Development and evaluation of hybrid machine translation systems for english to indian language under low resource conditions

Mrinalini Kannan

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/476964

Full metadata record

DC Field	Value	Language
dc.coverage.spatial	Development and evaluation of hybrid machine translation systems for english to indian language under low resource conditions
dc.date.accessioned	2023-04-19T07:00:03Z	-
dc.date.available	2023-04-19T07:00:03Z	-
dc.identifier.uri	http://hdl.handle.net/10603/476964	-
dc.description.abstract	India is a multicultural and multilingual country with 22 official languages belonging to different linguistic families. Since the time of British colonial rule in India, English has been used as the linguistic medium (L1 language) for administrative and higher education purposes. Post-independence, the use of regional languages (as L1 language) along with English (as L2 language) has been encouraged in the states across the country. However, the usage of either the L1 or the L2 language varies among the common people. Thus, it is essential to develop machine translation (MT) systems from English to Indian languages for smoother transactions and communication across the country. Among the seven linguistic families of South Asia, Indo-Aryan and Dravidian languages account for over 90% of Indian speakers. On this note, the current research work proposes to develop efficient statistical (SMT) and neural (NMT) machine translation systems for translation from English to two Indian languages, namely Tamil (a Dravidian language) and Hindi (an Indo-Aryan language). Most of the well-established data-driven approaches for developing MT systems require a huge amount of parallel text in the source and target languages to train an efficient translation model. Such large parallel corpora between English and Indian languages are scarce. Further, the domain-specific parallel corpora required to develop highly efficient and application-oriented MT systems are also not available. The proposed work makes use of punctuation marks and re-ordering to augment the parallel data available for training the SMT and NMT systems.
dc.format.extent	xix,183p.
dc.language	English
dc.relation	p.169-182
dc.rights	university
dc.title	Development and evaluation of hybrid machine translation systems for english to indian language under low resource conditions
dc.title.alternative
dc.creator.researcher	Mrinalini Kannan
dc.subject.keyword	Bilingual Language
dc.subject.keyword	Machine Translation
dc.subject.keyword	Parts-of-Speech
dc.description.note
dc.contributor.guide	Vijayalakshmi P
dc.publisher.place	Chennai
dc.publisher.university	Anna University
dc.publisher.institution	Faculty of Information and Communication Engineering
dc.date.registered
dc.date.completed	2022
dc.date.awarded	2022
dc.format.dimensions	21cm
dc.format.accompanyingmaterial	None
dc.source.university	University
dc.type.degree	Ph.D.
Appears in Departments:	Faculty of Information and Communication Engineering

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	22.31 kB	Adobe PDF
02_prelim pages.pdf		2.32 MB	Adobe PDF
03_contents.pdf		126.87 kB	Adobe PDF
04_abstracts.pdf		84.37 kB	Adobe PDF
05_chapter1.pdf		385.33 kB	Adobe PDF
06_chapter2.pdf		429.65 kB	Adobe PDF
07_chapter3.pdf		2.27 MB	Adobe PDF
08_chapter4.pdf		1.45 MB	Adobe PDF
09_chapter5.pdf		563.12 kB	Adobe PDF
10_annexures.pdf		111.21 kB	Adobe PDF
80_recommendation.pdf		95.19 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
