Design and Development of Document Summarization Algorithms for Kannada

Arpitha Swamy

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/467879

Title:	Design and Development of Document Summarization Algorithms for Kannada
Researcher:	Arpitha Swamy
Guide(s):	Srinath, S
Keywords:	Computer Science; Computer Science Artificial Intelligence; Data mining; Engineering and Technology; Natural language processing (Computer science)
University:	JSS Science and Technology University, Mysuru
Completed Date:	2021
Abstract:	The rapid growth of the Internet has overwhelmed people with the large amount of information available online. Obtaining the data a user requires from this huge volume of document text is a time-consuming and difficult process, so there is a need for a system that deals with this information overload.
There are various modes of Information Retrieval systems: Text Mining, Machine Translation, Text Categorization, Text Summarization, etc. This thesis studies certain aspects of Text Summarization as a solution to the information overload problem. Text Summarization is the process of condensing the original text of a document into a shorter form.
The literature shows very few attempts at generating summaries of documents in Indian languages. The main reason is the complexity involved in developing resources and tools for Indian languages. In the research work of this thesis, some new ideas are contributed to the text summarization literature, taking the Kannada language as the case.
In the first contribution, the dataset required for developing text summarization algorithms for Kannada has been developed, along with the data required for developing a Parts-of-Speech (POS) tagging tool and a Named Entity Recognition (NER) tool.
In the second contribution, algorithms for generating extractive summaries of Kannada documents have been developed. Latent semantic analysis, clustering, fuzzy logic, and a fused approach have been applied to produce the extractive summary.
In the third contribution, a POS tagging tool for Kannada using a Conditional Random Field (CRF) has been developed.
In the fourth contribution, algorithms have been developed to recognize and classify the named entities in Kannada documents using CRF and a rule-based approach.
In the fifth contribution, an algorithm for creating an abstractive summary of a Kannada document using a template-based approach has been developed.
Pagination:	208p
URI:	http://hdl.handle.net/10603/467879
Appears in Departments:	Department of Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
80_recommendation.pdf	Attached File	269.18 kB	Adobe PDF
abstract.pdf		232.39 kB	Adobe PDF
annexure.pdf		149.98 kB	Adobe PDF
chapter 10.pdf		647.5 kB	Adobe PDF
chapter 11.pdf		598.26 kB	Adobe PDF
chapter 1.pdf		369.66 kB	Adobe PDF
chapter 2.pdf		377.31 kB	Adobe PDF
chapter 3.pdf		577.6 kB	Adobe PDF
chapter 6.pdf		483.39 kB	Adobe PDF
chapter 7.pdf		332.94 kB	Adobe PDF
chapter 8.pdf		590.53 kB	Adobe PDF
chapter 9.pdf		620.99 kB	Adobe PDF
pelim pages.pdf		900.36 kB	Adobe PDF
table of contents.pdf		161.18 kB	Adobe PDF
tittle.pdf		258.67 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET