Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/411302
Title: | Extraction of Multiword Expressions from Hindi Text Document |
Researcher: | Mishra, Atul |
Guide(s): | Shaikh, Soharab Hossain and Sanyal, Ratna |
Keywords: | Computer Science; Computer Science Artificial Intelligence; Engineering and Technology |
University: | BML Munjal University, Gurugram |
Completed Date: | 2022 |
Abstract: | Multiword expressions (MWEs) are a significant challenge in many fields of language technology. Multiword extraction from random text data has grown in popularity among the NLP community. This research topic is closely connected to statistical analysis and artificial intelligence. This thesis presents a detailed literature assessment and numerous strategies for building an automated multiword extraction system. The overall contribution of the thesis is divided into six parts. This study proposes a method for extracting Hindi MWEs and discusses the significance of boundary threshold calculations. The main objective of this dissertation is to develop a generalized mechanism for extracting Hindi multiword expressions. Specifically, the research aims to build an approach for extracting Hindi MWEs using their syntactic and statistical idiosyncrasy (i.e., the structure of linguistic patterns and association) and the contextual connection between their constituent words. Various strategies for combining different classifiers based on these properties may be applied to develop a multiword extraction mechanism. Hence, creating a best-performing combination strategy is also an objective of this dissertation. There are various hurdles in designing a method that uses these properties. In statistical filtering, calculating the boundary threshold is a challenging task. Another issue is how to combine multiple filters, since different combination strategies are possible. Thus, identifying the best combination strategy is also a challenge. The hybrid method uses semantic similarity. The study also developed a web application, built with the Flask framework, that automatically extracts Hindi MWEs using the association-based and hybrid methods. The methods, evaluation results, and findings of each contribution are presented in separate chapters.
The proposed technique is evaluated using the HDTB Treebank and TDIL dataset, which is freely available. |
Pagination: | XVII, 109 |
URI: | http://hdl.handle.net/10603/411302 |
Appears in Departments: | School of Engineering and Technology |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
01_title.pdf.pdf | Attached File | 202.92 kB | Adobe PDF |
02_declaration.pdf.pdf | | 67.34 kB | Adobe PDF |
03_certificate.pdf.pdf | | 552.88 kB | Adobe PDF |
04_acknowledgement.pdf.pdf | | 585.79 kB | Adobe PDF |
05_contents.pdf.pdf | | 726.43 kB | Adobe PDF |
06_list of graph and table.pdf | | 1.21 MB | Adobe PDF |
07_abstract.pdf.pdf | | 549.84 kB | Adobe PDF |
08_chapter1.pdf.pdf | | 4.42 MB | Adobe PDF |
09_chapter2.pdf.pdf | | 5.74 MB | Adobe PDF |
10_chapter3.pdf.pdf | | 4.26 MB | Adobe PDF |
11_chapter4.pdf.pdf | | 2.41 MB | Adobe PDF |
12_chapter5.pdf.pdf | | 3.99 MB | Adobe PDF |
13_chapter6.pdf.pdf | | 3.7 MB | Adobe PDF |
14_summary.pdf.pdf | | 185.09 kB | Adobe PDF |
15_bibliography.pdf.pdf | | 2.84 MB | Adobe PDF |
80_recommendation.pdf | | 551.8 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).