Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/342607
Title: Optimized algorithms for classification of text documents
Researcher: Maruthupandi, J
Guide(s): Vimala Devi, K
Keywords: Engineering and Technology; Computer Science; Computer Science Software Engineering; Fuzzy text; Text mining; HABBFO
University: Anna University
Completed Date: 2019
Abstract: The use and management of massive volumes of data has emerged as an active area of research because it poses numerous challenges, and effective strategies are required to improve text mining. In the first approach, a fuzzy text classification algorithm is proposed for categorizing a given set of documents. Fuzzy methods are adopted to reduce the dimensionality problem: high-dimensional documents are transformed into low-dimensional fuzzy relevance vectors. The entire space is divided into sub-regions, which are then combined to form distinct categories. The experimental results confirmed that this system offers better speed and efficacy. In the second work, a wrapper-based Hybrid Artificial Bee Colony and Bacterial Foraging Optimization (HABBFO) structure is proposed to choose the most pertinent feature subset for prediction. The pre-processing steps, namely (a) tokenization, (b) stop-word removal and (c) stemming, are performed to extract features. Several experiments were conducted, and the proposed system was observed to outperform other existing works in the domain of Feature Selection (FS). In the third approach, a novel multi-perspective similarity structure is developed to maximize the accuracy and performance of similarity measures between two documents and between two document sets. The proposed measures consider three cases. The first case is when the feature is present in both documents. The second case is when the feature is present in only one document. The third case is when the feature is present in neither document. The similarity between two document sets is also evaluated. The efficacy of the proposed similarity measure was assessed on various real-world datasets for text applications, including single-level classification and k-means clustering.
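The pre-processing steps and the three feature cases named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the stop-word list, the crude suffix-stripping stemmer, the sample vocabulary, and the function names (`tokenize`, `naive_stem`, `extract_features`, `three_case_counts`) are all assumptions made for illustration. The abstract does not say how the three cases are weighted, so the sketch only counts them.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "and", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of stem remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_features(text):
    """Tokenize, drop stop words, then stem; returns a set of features."""
    return {naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS}


def three_case_counts(doc_a, doc_b, vocabulary):
    """Count vocabulary features present in both, exactly one, or neither document."""
    feats_a = extract_features(doc_a)
    feats_b = extract_features(doc_b)
    in_both = len(vocabulary & feats_a & feats_b)
    in_one = len(vocabulary & (feats_a ^ feats_b))
    in_neither = len(vocabulary - (feats_a | feats_b))
    return in_both, in_one, in_neither


# Assumed vocabulary of already-stemmed features.
vocabulary = {"text", "document", "min", "classification", "cluster"}
counts = three_case_counts(
    "Mining the text documents",
    "Text classification of documents",
    vocabulary,
)
print(counts)
```

A similarity measure built on these counts would still need a rule for combining the three cases; that rule is specific to the thesis and is not reproduced here.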
Pagination: xix,173 p.
URI: http://hdl.handle.net/10603/342607
Appears in Departments: Faculty of Information and Communication Engineering

Files in This Item:
File                        Description    Size       Format
01_title.pdf                Attached File  96.19 kB   Adobe PDF
02_certificates.pdf                        42.35 kB   Adobe PDF
03_vivaproceedings.pdf                     92.29 kB   Adobe PDF
04_bonafidecertificate.pdf                 50.7 kB    Adobe PDF
05_abstracts.pdf                           65.03 kB   Adobe PDF
06_acknowledgements.pdf                    59.55 kB   Adobe PDF
07_contents.pdf                            75.62 kB   Adobe PDF
08_listoftables.pdf                        94.33 kB   Adobe PDF
09_listoffigures.pdf                       92.86 kB   Adobe PDF
10_listofabbreviations.pdf                 56.29 kB   Adobe PDF
11_chapter1.pdf                            365.59 kB  Adobe PDF
12_chapter2.pdf                            364.01 kB  Adobe PDF
13_chapter3.pdf                            422.89 kB  Adobe PDF
14_chapter4.pdf                            400.89 kB  Adobe PDF
15_chapter5.pdf                            531.33 kB  Adobe PDF
16_chapter6.pdf                            575.2 kB   Adobe PDF
17_chapter7.pdf                            84.18 kB   Adobe PDF
18_conclusion.pdf                          84.18 kB   Adobe PDF
19_references.pdf                          193.03 kB  Adobe PDF
20_listofpublications.pdf                  67.26 kB   Adobe PDF
80_recommendation.pdf                      101.8 kB   Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
