Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/366542
Title: An Efficient Technique for Information Extraction from Multiple Documents
Researcher: Praveen K. Wilson
Guide(s): J. R. Jeba
Keywords: Computer Science
Computer Science Information Systems
Engineering and Technology
University: Noorul Islam Centre for Higher Education
Completed Date: 2021
Abstract: As the amount of online information grows, extracting the required information becomes more difficult, leading to information overload. In this situation, information extraction cannot be done manually and requires computer-aided approaches. Automatic text summarization takes a text as input and produces a shorter, less redundant form of the original as output. Increasingly, however, information about a single topic is spread across various sources such as websites, journals, newspapers and textbooks, which calls for multi-document summarization. The proposed research work seeks an efficient algorithm for extracting relevant information from multiple textual resources, with the aim of producing a summary document that is semantically generated and properly ordered. Although a number of techniques are available for sentence extraction, none performs as well as a human expert. The analysis indicates that semantic approaches can perform better than existing syntactic and statistical approaches, since they consider the meaning of the text rather than its form. The work proposes a combinational semantic approach using the semantic tool Themsets to analyze all the input resources and extract reliable and important information from each document. The significant sentences extracted are then reorganized to obtain a better summary. The entire process of information extraction has been implemented as a sequence of three phases: anaphora resolution; similarity calculation and sentence extraction; and, finally, reordering of the retrieved sentences.
Finally, the proposed algorithm has been evaluated by calculating precision, recall and F-measure, which are relevant evaluation parameters in natural language processing applications, and it shows better performance than the traditional approaches. Since the process of information has a wide
Pagination: 1259
URI: http://hdl.handle.net/10603/366542
Appears in Departments: Department of Computer Science and Engineering

Files in This Item:
File                      Description    Size       Format
80_recommendation.pdf     Attached File  201.56 kB  Adobe PDF
certificate.pdf                          341.2 kB   Adobe PDF
chapter-1.pdf                            312.57 kB  Adobe PDF
chapter-2.pdf                            313.73 kB  Adobe PDF
chapter-3.pdf                            289.73 kB  Adobe PDF
chapter-4.pdf                            391.69 kB  Adobe PDF
chapter-5.pdf                            243.55 kB  Adobe PDF
chapter-6.pdf                            426.11 kB  Adobe PDF
chapter-7.pdf                            124.24 kB  Adobe PDF
front page.pdf                           212.85 kB  Adobe PDF
list of publications.pdf                 121.49 kB  Adobe PDF
references.pdf                           210.01 kB  Adobe PDF
table of contents.pdf                    470.45 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
