Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/30136
Title: Enhanced graph based techniques for Single and multi document Summarization
Researcher: Hariharan S
Guide(s): Srinivasan R
Keywords: Inverse Document Frequency
Term Frequency
Term Occurrence
World Wide Web
Upload Date: 8-Dec-2014
University: Anna University
Completed Date: 01/05/2010
Abstract: The World Wide Web has become one of the largest information and knowledge repositories in the world. In spite of its easy access, it is virtually impossible for any user to browse or read the large number of individual documents available online. Text summarization fulfils such information-seeking goals by providing a method for the user to quickly view the highlights or relevant portions of a document collection. With large volumes of information uploaded to the web on a daily basis, summarization becomes a necessity. Locating and browsing information quickly from a collection of documents within a short span of time also becomes possible with the help of summarization. This has led to large-scale research efforts in text summarization. The issues discussed above necessitate an automated summarization system. The objective of this thesis is to find enhancements to existing graph-based methods for summarizing single documents and multi-document clusters. The objective of automated text summarization is to condense the given text to its essential contents based upon the user's choice of brevity. Summarization techniques are broadly categorized into two schemes: extraction and abstraction. Extraction involves picking the most important sentences from a document using statistical approaches. To measure the similarity among documents, several choices are available, such as the cosine, Dice, and Jaccard measures. Several approaches that would influence content similarity are also investigated in this report: Term Frequency (TF), Term Occurrence (TO), Inverse Document Frequency (IDF), and TF multiplied by IDF (TF-IDF).
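The term-weighting schemes and similarity measures named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the whitespace tokenizer, the log(N / df) form of IDF, the sample sentences, and all function names are assumptions made for illustration. The set-based Dice and Jaccard functions treat a term as simply present or absent, which is one possible reading of Term Occurrence; the abstract does not define it.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_vector(tokens):
    """Term Frequency: raw count of each term in one sentence."""
    return dict(Counter(tokens))


def idf_table(token_lists):
    """IDF as log(N / df); this particular formula is an assumption."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """TF multiplied by IDF for each term in one sentence."""
    return {term: count * idf[term] for term, count in Counter(tokens).items()}


def cosine(u, v):
    """Cosine similarity between two sparse weight vectors (dicts)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dice(a, b):
    """Dice coefficient over the sets of distinct terms."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0


def jaccard(a, b):
    """Jaccard coefficient over the sets of distinct terms."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a or b else 0.0


# Hypothetical sentences, used only to exercise the functions.
sentences = [
    "graph based summarization ranks sentences",
    "summarization ranks sentences by similarity",
    "the web holds many documents",
]
tokens = [tokenize(s) for s in sentences]
idf = idf_table(tokens)

tf_sim = cosine(tf_vector(tokens[0]), tf_vector(tokens[1]))
tfidf_sim = cosine(tfidf_vector(tokens[0], idf), tfidf_vector(tokens[1], idf))
dice_sim = dice(tokens[0], tokens[1])
jaccard_sim = jaccard(tokens[0], tokens[1])

print("cosine (TF):    ", round(tf_sim, 3))
print("cosine (TF-IDF):", round(tfidf_sim, 3))
print("dice:           ", round(dice_sim, 3))
print("jaccard:        ", round(jaccard_sim, 3))
```

In a graph-based summarizer, pairwise scores of this kind would typically serve as edge weights between sentence nodes. With IDF weighting, terms shared by many sentences contribute less, so the TF-IDF cosine for the first two sentences comes out lower than the plain TF cosine.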
Pagination: xx, 132p.
URI: http://hdl.handle.net/10603/30136
Appears in Departments: Faculty of Information and Communication Engineering

Files in This Item:
File                    Description    Size       Format
01_title.pdf            Attached File  42.24 kB   Adobe PDF
02_certificate.pdf                     5.85 kB    Adobe PDF
03_abstract.pdf                        14.19 kB   Adobe PDF
04_acknowledgement.pdf                 7.22 kB    Adobe PDF
05_content.pdf                         39.53 kB   Adobe PDF
06_chapter1.pdf                        55.43 kB   Adobe PDF
07_chapter2.pdf                        98.58 kB   Adobe PDF
08_chapter3.pdf                        79.73 kB   Adobe PDF
09_chapter4.pdf                        184.52 kB  Adobe PDF
10_chapter5.pdf                        186.96 kB  Adobe PDF
11_chapter6.pdf                        103.62 kB  Adobe PDF
12_chapter7.pdf                        8.85 kB    Adobe PDF
13_reference.pdf                       56.73 kB   Adobe PDF
14_publication.pdf                     6.84 kB    Adobe PDF
15_vitae.pdf                           5.99 kB    Adobe PDF


Items in Shodhganga are protected by copyright, with all rights reserved, unless otherwise indicated.