Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/537432
Title: | A System for Simplification of Idiomatic Gujarati Text for Improved Interlingual Language Processing |
Researcher: | Modh, Jatinkumar Chamanlal |
Guide(s): | Saini, Jatinderkumar R. |
Keywords: | Computer Science; Computer Science Theory and Methods; Engineering and Technology |
University: | Gujarat Technological University |
Completed Date: | 2023 |
Abstract: | All existing Gujarati machine translation systems, including Microsoft Translator and Google Translate, struggle with idiomatic Gujarati text and are unable to translate Gujarati idioms properly. The proposed system simplifies Gujarati idioms by correctly recognizing all Gujarati idiom phrases present in the input and replacing each with its corresponding Gujarati meaning. In interlingual language processing, text is translated into the same language but in a simplified form; accordingly, this model renders every Gujarati idiom in Gujarati with a simplified meaning. The output of the proposed system is simplified Gujarati text that contains no Gujarati idioms. The purpose of the research is to aid the translation of Gujarati idioms into any language in the world by simplifying them.
Overall, 3472 distinct and 6081 non-distinct Gujarati idioms were collected, analyzed and classified. This research classifies Gujarati idioms into N-gram, M-meaning, root, inflected and personage idioms. Because Gujarati idioms are used in a variety of formats and contexts in real life, recognizing them all can be difficult for any machine translation system. The proposed system detects all inflected and static idiom formats in Gujarati text by combining a dictionary-based approach with a rule-based approach that generates dynamic idiom forms. Dynamic generation and detection of idioms are made possible by 15 newly formulated suffix- and diacritics-based rules. For idioms with multiple meanings, a context-based search algorithm determines the intended meaning from the surrounding contextual words.
In addition, the readability complexity prediction model calculates a readability complexity score and predicts the complexity type of idiomatic Gujarati text by considering four different parameters. This is innovative and the first in the Gujarati l |
Pagination: | xx, 108p |
URI: | http://hdl.handle.net/10603/537432 |
Appears in Departments: | Computer Science |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 16.47 kB | Adobe PDF | View/Open |
03_abstract.pdf | Attached File | 7.57 kB | Adobe PDF | View/Open |
06_contents.pdf | Attached File | 611.35 kB | Adobe PDF | View/Open |
10_chapter1.pdf | Attached File | 853.77 kB | Adobe PDF | View/Open |
11_chapter2.pdf | Attached File | 227.95 kB | Adobe PDF | View/Open |
12_chapter3.pdf | Attached File | 2.91 MB | Adobe PDF | View/Open |
13_chapter4.pdf | Attached File | 1.56 MB | Adobe PDF | View/Open |
14_chapter5.pdf | Attached File | 117.52 kB | Adobe PDF | View/Open |
80_recommendation.pdf | Attached File | 290.66 kB | Adobe PDF | View/Open |
prelim pages.pdf | Attached File | 1.89 MB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).