Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/455118
Title: Study and analysis of document mining Using optimization techniques
Researcher: Thirumoorthy, K
Guide(s): Muneeswaran, K
Keywords: Engineering and Technology; Computer Science; Computer Science Software Engineering; Document classification; Document clustering; Hybrid Jaya optimization
University: Anna University
Completed Date: 2021
Abstract: In this digital era, the internet is an important medium for communication. Every day, internet users generate a vast amount of data in the WWW repository. They contribute data in the form of text such as emails, tweets, product and movie reviews, discussion text, chat, and personal or technical blogs. Finding knowledge in such a vast data pool is a challenging task. Document mining techniques are used to extract the needed information from an unstructured text corpus with little effort. Techniques such as text summarization, topic modeling, text clustering, text feature selection, text classification, and sentiment analysis are used to manage and retrieve this information. This research work enhances document classification and document clustering techniques by using the Jaya optimization algorithm. The work is divided into two phases.
In the first phase, the work proposes a novel hybrid feature selection method based on the binary Jaya optimization algorithm to obtain an appropriate subset of optimal features for the document classification problem. Feature selection plays a vital role in reducing the high dimensionality of the feature space in text document classification. Reducing the dimensionality of the feature space lowers the computation cost and improves text classification accuracy. Hence, a proper subset of the significant features of the text corpus must be identified in order to classify the data in less computational time and with higher accuracy. The new hybrid feature selection method combines the normalized difference measure with the binary Jaya optimization algorithm to obtain this subset from the text corpus. The error rate is used as the objective function to be minimized when measuring the fitness of a solution. The selected optimal feature subsets are evaluated using Naive Bayes and Support Vector Machine classifiers on various popular benchmark text corpus datasets.
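The wrapper stage described in the abstract (a binary Jaya search that minimizes classification error) can be illustrated with a minimal sketch. This is not the thesis's implementation. The abstract does not give the binarization rule, the population settings, or the way the normalized difference measure is combined with the search, so the following are assumptions made only for illustration: the stochastic threshold used to turn the continuous Jaya update into bits, the leave-one-out nearest-centroid classifier standing in for Naive Bayes or SVM, the hypothetical names `error_rate`, `binary_jaya`, `X_demo` and `y_demo`, and the toy data. The normalized difference measure filter stage is omitted.

```python
import random


def error_rate(mask, X, y):
    """Leave-one-out error of a nearest-centroid classifier restricted to the
    features selected by `mask` (a stand-in for the classifiers in the abstract)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # an empty feature subset is treated as the worst case
    errors = 0
    for i, row in enumerate(X):
        sums, counts = {}, {}
        for k, other in enumerate(X):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(y[k], [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += other[j]
            counts[y[k]] = counts.get(y[k], 0) + 1

        def sq_dist(label):
            # squared distance from the held-out row to a class centroid
            return sum((row[j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        predicted = min(sums, key=sq_dist)
        errors += predicted != y[i]
    return errors / len(X)


def binary_jaya(X, y, pop_size=10, iterations=30, seed=0):
    """Search for a feature mask with low error using a Jaya-style update."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    fit = [error_rate(p, X, y) for p in pop]
    for _ in range(iterations):
        best = pop[min(range(pop_size), key=fit.__getitem__)]
        worst = pop[max(range(pop_size), key=fit.__getitem__)]
        for i in range(pop_size):
            cand = []
            for j in range(n):
                x = pop[i][j]
                # Jaya update: move toward the best and away from the worst
                v = (x + rng.random() * (best[j] - abs(x))
                       - rng.random() * (worst[j] - abs(x)))
                # Assumed binarization: compare against a uniform random number
                cand.append(1 if v > rng.random() else 0)
            f = error_rate(cand, X, y)
            if f < fit[i]:  # keep the candidate only if it improves
                pop[i], fit[i] = cand, f
    b = min(range(pop_size), key=fit.__getitem__)
    return pop[b], fit[b]


# Toy data (invented): feature 0 separates the classes, the others are noise.
X_demo = [[0.0, 5.0, 1.0], [0.1, 1.0, 0.0], [1.0, 5.0, 0.0], [1.1, 1.0, 1.0]]
y_demo = [0, 0, 1, 1]
mask, err = binary_jaya(X_demo, y_demo)
print(mask, err)
```

Jaya has no algorithm-specific tuning parameters beyond population size and iteration count. Swapping `error_rate` for a cross-validated Naive Bayes or SVM error would bring the sketch closer to the evaluation setup the abstract mentions.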
Pagination: xx, 125 p.
URI: http://hdl.handle.net/10603/455118
Appears in Departments: Faculty of Information and Communication Engineering

Files in This Item:
File                    Description    Size       Format
01_title.pdf            Attached File  28.6 kB    Adobe PDF
02_prelim pages.pdf                    489.73 kB  Adobe PDF
03_content.pdf                         14.42 kB   Adobe PDF
04_abstract.pdf                        9.73 kB    Adobe PDF
05_chapter 1.pdf                       326.95 kB  Adobe PDF
06_chapter 2.pdf                       263.2 kB   Adobe PDF
07_chapter 3.pdf                       833.54 kB  Adobe PDF
08_chapter 4.pdf                       977.68 kB  Adobe PDF
09_annexures.pdf                       80.59 kB   Adobe PDF
80_recommendation.pdf                  66.8 kB    Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
