Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/467879
Title: | Design and Development of Document Summarization Algorithms for Kannada |
Researcher: | Arpitha Swamy |
Guide(s): | Srinath, S |
Keywords: | Computer Science; Computer Science Artificial Intelligence; Data mining; Engineering and Technology; Natural language processing (Computer science) |
University: | JSS Science and Technology University, Mysuru |
Completed Date: | 2021 |
Abstract: | The rapid growth of the Internet has overwhelmed people with the large amount of information available online. Obtaining the data a user requires from this huge volume of document text is a time-consuming and difficult process, so there is a need for a system that deals with this information overload. There are various modes of Information Retrieval systems: Text Mining, Machine Translation, Text Categorization, Text Summarization, etc. This thesis studies certain aspects of Text Summarization as a solution to the information overload problem. Text Summarization is the process of condensing the original text of a document into a shorter form. The literature shows very few attempts at generating summaries of documents in Indian languages. The main reason is the complexity involved in developing resources and tools for Indian languages. The research work of this thesis contributes some new ideas to the text summarization literature, taking the Kannada language as the case. In the first contribution, the dataset required for developing text summarization algorithms for Kannada has been developed, along with the data required for developing a Parts-Of-Speech (POS) tagging tool and a Named Entity Recognition (NER) tool. In the second contribution, algorithms for generating an extractive summary of a Kannada document have been developed. Latent semantic analysis, clustering, fuzzy logic, and a fused approach have been applied to produce the extractive summary. In the third contribution, a POS tagging tool for Kannada has been developed using Conditional Random Fields (CRF). In the fourth contribution, algorithms have been developed to recognize and classify the named entities in a Kannada document using CRF and a rule-based approach. In the fifth contribution, an algorithm has been developed for creating an abstractive summary of a Kannada document using a template-based approach. |
Pagination: | 208p |
URI: | http://hdl.handle.net/10603/467879 |
Appears in Departments: | Department of Computer Science and Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
80_recommendation.pdf | Attached File | 269.18 kB | Adobe PDF |
abstract.pdf | | 232.39 kB | Adobe PDF |
annexure.pdf | | 149.98 kB | Adobe PDF |
chapter 10.pdf | | 647.5 kB | Adobe PDF |
chapter 11.pdf | | 598.26 kB | Adobe PDF |
chapter 1.pdf | | 369.66 kB | Adobe PDF |
chapter 2.pdf | | 377.31 kB | Adobe PDF |
chapter 3.pdf | | 577.6 kB | Adobe PDF |
chapter 6.pdf | | 483.39 kB | Adobe PDF |
chapter 7.pdf | | 332.94 kB | Adobe PDF |
chapter 8.pdf | | 590.53 kB | Adobe PDF |
chapter 9.pdf | | 620.99 kB | Adobe PDF |
pelim pages.pdf | | 900.36 kB | Adobe PDF |
table of contents.pdf | | 161.18 kB | Adobe PDF |
tittle.pdf | | 258.67 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).