Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/467879
Title: Design and Development of Document Summarization Algorithms for Kannada
Researcher: Arpitha Swamy
Guide(s): Srinath, S
Keywords: Computer Science
Computer Science Artificial Intelligence
Data mining
Engineering and Technology
Natural language processing (Computer science)
University: JSS Science and Technology University, Mysuru
Completed Date: 2021
Abstract: The rapid growth of the Internet has overwhelmed people with the large amount of information available online. Obtaining the data a user requires from this huge volume of document text is a time-consuming and difficult process, so there is a need for a system that deals with this information overload.
There are various modes of Information Retrieval system: Text Mining, Machine Translation, Text Categorization, Text Summarization, etc. In this thesis, effort has been placed on studying certain aspects of Text Summarization as a solution to the information overload problem. Text Summarization is the process of condensing the original text of a document into a shorter form.
The literature shows very few attempts at generating summaries of documents in Indian languages. The main reason for this is the complexity involved in developing resources and tools for Indian languages. In the exhaustive research work of this thesis, some new ideas have been added to the literature of text summarization, taking the Kannada language as the case.
In the first contribution, the dataset required for developing text summarization algorithms for Kannada has been developed, along with the data required for developing a Parts-of-Speech (POS) tagging tool and a Named Entity Recognition (NER) tool.
In the second contribution, algorithms for generating an extractive summary of a Kannada document have been developed. Latent semantic analysis, clustering, fuzzy logic and a fused approach have been applied to produce the extractive summary.
In the third contribution, a POS tagging tool for Kannada using Conditional Random Field (CRF) has been developed.
In the fourth contribution, algorithms have been developed to recognize and classify the named entities in a Kannada document using CRF and a rule-based approach.
In the fifth contribution, an algorithm for creating an abstractive summary of a Kannada document using a template-based approach has been developed.
Pagination: 208p
URI: http://hdl.handle.net/10603/467879
Appears in Departments: Department of Computer Science and Engineering

Files in This Item:
File                   Description    Size       Format
80_recommendation.pdf  Attached File  269.18 kB  Adobe PDF
abstract.pdf                          232.39 kB  Adobe PDF
annexure.pdf                          149.98 kB  Adobe PDF
chapter 10.pdf                        647.5 kB   Adobe PDF
chapter 11.pdf                        598.26 kB  Adobe PDF
chapter 1.pdf                         369.66 kB  Adobe PDF
chapter 2.pdf                         377.31 kB  Adobe PDF
chapter 3.pdf                         577.6 kB   Adobe PDF
chapter 6.pdf                         483.39 kB  Adobe PDF
chapter 7.pdf                         332.94 kB  Adobe PDF
chapter 8.pdf                         590.53 kB  Adobe PDF
chapter 9.pdf                         620.99 kB  Adobe PDF
pelim pages.pdf                       900.36 kB  Adobe PDF
table of contents.pdf                 161.18 kB  Adobe PDF
tittle.pdf                            258.67 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
